Run Readiness Checklist for Tractor Workflow
- Input VCF: Make sure the input VCF has been QC’d and retains only common variants.
- File formats and indexing: Confirm all VCFs are bgzipped and indexed.
- Split by chromosome: Make sure input files are split by chromosome (the workflow is designed to run per chromosome).
- Reference build: Verify all inputs (VCFs, references, maps) use the same genome build (e.g., GRCh38 vs GRCh37).
- Chromosome nomenclature: Ensure chromosome naming is consistent across all files (chr1, chr2... vs 1, 2...).
  - At a minimum, check your input VCF, any reference files, the chunk files used for phasing, and the genetic map files used for LAI.
- Verify software setup: Confirm all required dependencies are installed and accessible in your PATH. See the Installation Page.
- Config file: Double-check that resources, conda environments, and file paths are correctly specified. Ensure the configuration matches the executor you plan to use (e.g., SLURM, PBS, SGE). Documentation on possible executors is available here.
- Disk space & runtime availability: Make sure you have enough memory, storage, and walltime for your dataset size, and set these in the config file accordingly. For general guidance on allocating computational resources for each step, check our FAQs.
- Mandatory parameters: Ensure all required workflow parameters are set and input files exist.
- Optional parameters: Only set the optional parameters you need; remove extras to avoid errors (especially Java errors in the case of FLARE).
  - Confirm that the optional parameters you need have been added to the nextflow run command.
  - Using an undefined optional parameter in the nextflow run command might lead to an error.
- Genetic map files: Confirm you are using the right genetic map files for phasing and LAI.
  - Different tools require different formats.
  - SHAPEIT5’s genetic map files can be adapted to RFMix2 and GNomix format.
  - FLARE requires PLINK-format files. See here.
- Run a test: We highly recommend running a test before scaling to the full dataset. You can download a test dataset here.
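
Two of the checks above (index files and consistent chromosome naming) can be scripted before launching the workflow. The following is a minimal Python sketch and is not part of the Tractor workflow itself. The function names (chrom_style, has_index, vcf_contigs, preflight) and any file names are hypothetical. It assumes inputs are named *.vcf.gz, treats a .tbi or .csi file next to the VCF as its index, and samples only the first records of each file. It relies on bgzip output being readable by Python's gzip module, so it does not verify that a file was compressed with bgzip rather than plain gzip.

```python
import gzip
from pathlib import Path


def chrom_style(name: str) -> str:
    """Classify a contig name as 'chr-prefixed' (chr1) or 'bare' (1)."""
    return "chr-prefixed" if name.lower().startswith("chr") else "bare"


def has_index(vcf_path: Path) -> bool:
    """True if a tabix (.tbi) or CSI (.csi) index sits next to the VCF."""
    return any(Path(str(vcf_path) + ext).exists() for ext in (".tbi", ".csi"))


def vcf_contigs(vcf_path: Path, max_records: int = 1000) -> set[str]:
    """Collect contig names from the first data records of a compressed VCF."""
    contigs = set()
    seen = 0
    with gzip.open(vcf_path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):  # skip meta-information and header lines
                continue
            contigs.add(line.split("\t", 1)[0])  # first column is CHROM
            seen += 1
            if seen >= max_records:
                break
    return contigs


def preflight(vcf_paths: list[Path]) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    styles = set()
    for path in vcf_paths:
        if not path.name.endswith(".vcf.gz"):
            problems.append(f"{path}: expected a compressed .vcf.gz file")
            continue
        if not has_index(path):
            problems.append(f"{path}: no .tbi or .csi index found")
        styles.update(chrom_style(c) for c in vcf_contigs(path))
    if len(styles) > 1:
        problems.append("mixed chromosome naming (chr1 vs 1) across files")
    return problems
```

To use it, pass the list of per-chromosome VCFs you plan to run, for example preflight([Path("cohort.chr1.vcf.gz"), Path("reference.chr1.vcf.gz")]) with your own file names, and resolve any reported problems before starting nextflow run. Genetic map and chunk files are not covered by this sketch and still need to be checked separately.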