
(Optional) Customise the Configuration

The pipeline behaviour can be fine-tuned through a set of Nextflow configuration files in the config/ directory. This is especially useful in batch mode, where you want to set persistent defaults instead of repeating parameters on every run.


1. Singularity File System Mounting

Some users have encountered Singularity errors on HPC systems caused by file system mounting restrictions. To resolve this, bind the harmonisation resources directory explicitly in nextflow.config:

nextflow.config
singularity {
    enabled = true
    runOptions = "-B </absolute/path/to/harmonisation_resources>"
}

Thanks to Olivier Bakker for identifying this issue and providing the solution. See issue #124 for more context.


2. Pipeline Parameters (default_params.config)

Controls the core harmonisation parameters.

config/default_params.config
params {
    to_build = '38'
    chrom = ['1','2','3','4','5','6','7','8','9','10','11','12',
             '13','14','15','16','17','18','19','20','21','22','X','Y','MT']
    threshold = '0.99'
    version = 'v1.1.10'
}

| Parameter | Description | Default |
| --- | --- | --- |
| to_build | Target genome build: 38 (GRCh38) or 37 (GRCh37). | 38 |
| chrom | Chromosomes to process. Remove entries to restrict the run to specific chromosomes. | All |
| threshold | Minimum proportion of variants that must map successfully to pass QC. | 0.99 |
| version | Version of the pipeline reference data. | v1.1.10 |

Info: Any parameter can be overridden at runtime on the command line:

nextflow run EBISPOT/gwas-sumstats-harmoniser --to_build 37 --threshold 0.95
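
For persistent defaults, the same overrides can live in a small config file passed with Nextflow's -c option. The sketch below is illustrative only: the file name custom.config is hypothetical, and the values simply mirror the command-line example above.

```groovy
// custom.config (hypothetical file name; pass it to Nextflow with -c)
params {
    to_build  = '37'    // harmonise to GRCh37 instead of the default GRCh38
    threshold = '0.95'  // relax the mapping threshold from the default 0.99
}
```

It would be used as nextflow run EBISPOT/gwas-sumstats-harmoniser -c custom.config. Parameters given on the command line still take precedence over values set in a config file.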

3. Memory and Time (basic.config)

Memory and time limits for each pipeline step are defined in config/basic.config.

| Mode | Memory needed |
| --- | --- |
| Test mode | ~3 GB (suitable for local testing and debugging) |
| Full run (all chromosomes) | ≥28 GB (running on HPC or cloud is recommended) |

Caution: Do not reduce memory settings when processing all chromosomes. Requirements are driven by reference file size, and insufficient memory will cause the pipeline to fail. Adjustments are only safe when running a subset of chromosomes.
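
If you run a subset of chromosomes and want to adjust resources, Nextflow's process scope accepts memory and time directives, optionally limited to one process with a withName selector. The following is a minimal sketch under stated assumptions: the file name custom.config and the process name example_step are hypothetical, and the values are illustrative rather than recommended. Check config/basic.config for the real process names and defaults before changing anything.

```groovy
// custom.config (hypothetical file name; pass it to Nextflow with -c)
params {
    chrom = ['21', '22']            // restrict the run to a subset of chromosomes
}

process {
    // 'example_step' is a hypothetical process name used for illustration only
    withName: 'example_step' {
        memory = 8.GB               // illustrative value, not a recommendation
        time   = 2.h                // illustrative value, not a recommendation
    }
}
```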


4. Error Handling

By default, the pipeline uses ignore_error.config: failed processes are skipped after retries, and the run continues. Pass --terminate_error to use exit_error.config instead, which stops the pipeline on failure:

nextflow run EBISPOT/gwas-sumstats-harmoniser --terminate_error

The retry-triggering exit codes and maximum retries can be adjusted directly in the config files.

config/ignore_error.config
process {
    errorStrategy = { task.exitStatus in ((130..145) + 104) ? 'retry' : 'ignore' }
    maxRetries = 5
    maxErrors = '-1'
}

config/exit_error.config
process {
    errorStrategy = { task.exitStatus in ((130..145) + 104) ? 'retry' : 'terminate' }
    maxRetries = 5
    maxErrors = '-1'
}
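
To see which exit codes trigger a retry, the expression inside errorStrategy can be evaluated on its own in plain Groovy. The sketch below is only an illustration: retryCodes and strategyFor are hypothetical names, and the closure mirrors ignore_error.config (swap 'ignore' for 'terminate' to mirror exit_error.config).

```groovy
// The range 130..145 plus the single code 104, combined into one list.
def retryCodes = (130..145) + 104

// Hypothetical helper mirroring the errorStrategy closure in ignore_error.config.
def strategyFor = { int exitStatus ->
    exitStatus in retryCodes ? 'retry' : 'ignore'
}

assert strategyFor(137) == 'retry'   // 128 + 9: killed by SIGKILL, often an out-of-memory kill
assert strategyFor(143) == 'retry'   // 128 + 15: terminated by SIGTERM
assert strategyFor(104) == 'retry'   // the extra code listed explicitly
assert strategyFor(1)   == 'ignore'  // an ordinary failure is not retried
```

Codes in the 130..145 range correspond to 128 plus a signal number, which is why scheduler kills for memory or time limits usually fall inside it.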

For all available Nextflow configuration options, see the Nextflow documentation.