Step 6: Genotype calling

Pipeline v3

The cells are genotyped using the DRIVER binary from the Sentieon® tools. DRIVER uses the same mathematics as the Broad Institute’s BWA-GATK Best Practices Workflow but is 20X faster from BAM to VCF, as measured in core-hours.

Genotyping runs the Haplotyper algorithm on each cell, followed by the GVCFTyper algorithm for joint calling. Unlike the previous GATK implementation, each cell is haplotyped in gVCF mode, which emits a summarized confidence estimate that a site is strictly homozygous reference. These estimates are used while merging the genomic VCFs (gVCFs) for all cells with Sentieon’s GVCFTyper algorithm. Loci found to be non-variant are retained in the final output.

Genotyping parameters are optimized for high sensitivity:

  • A maximum of 2 alternate alleles are reported for each site, 
  • The minimum base quality for variant calling is set to 10, and 
  • The heterozygosity value is set at 0.001.
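
As a rough illustration of how the two steps and the parameters above might map onto command lines, the following Python sketch builds (but does not run) the argument lists. It is not the pipeline's actual invocation. The option names (--emit_mode, --min_base_qual, --snp_heterozygosity, --max_alt_alleles) and the tool spelling "GVCFtyper" are assumptions recalled from Sentieon's manual and should be checked against the installed release. The file names, the choice of which step receives each parameter, and the use of --emit_mode all to keep non-variant loci are likewise assumptions.

```python
from typing import List

# Values taken from the parameter list above.
MAX_ALT_ALLELES = 2
MIN_BASE_QUALITY = 10
HETEROZYGOSITY = 0.001


def haplotyper_cmd(reference: str, bam: str, gvcf: str) -> List[str]:
    """Per-cell step: call one cell in gVCF mode (assumed option names)."""
    return [
        "sentieon", "driver", "-r", reference, "-i", bam,
        "--algo", "Haplotyper",
        "--emit_mode", "gvcf",
        "--min_base_qual", str(MIN_BASE_QUALITY),
        "--snp_heterozygosity", str(HETEROZYGOSITY),
        gvcf,
    ]


def gvcftyper_cmd(reference: str, gvcfs: List[str], vcf: str) -> List[str]:
    """Joint step: merge and genotype all per-cell gVCFs (assumed option names)."""
    cmd = [
        "sentieon", "driver", "-r", reference,
        "--algo", "GVCFtyper",
        "--emit_mode", "all",  # assumption: keeps non-variant loci in the output
        "--max_alt_alleles", str(MAX_ALT_ALLELES),
    ]
    for gvcf in gvcfs:
        cmd += ["-v", gvcf]  # one -v per input gVCF
    cmd.append(vcf)
    return cmd


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    cells = ["cell1", "cell2"]
    for cell in cells:
        print(" ".join(haplotyper_cmd("ref.fa", f"{cell}.bam", f"{cell}.g.vcf.gz")))
    print(" ".join(gvcftyper_cmd("ref.fa", [f"{c}.g.vcf.gz" for c in cells], "joint.vcf.gz")))
```

Building the commands as lists keeps each argument separate, so they can be passed to subprocess.run without shell quoting once the option names have been verified.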

Pipeline v2

The cells are genotyped using the Genome Analysis Toolkit (McKenna, Hanna et al. 2010) with a joint calling approach that follows GATK Best Practices recommendations (DePristo, Banks et al. 2011; Van der Auwera, Carneiro et al. 2013).

Each cell is haplotyped in reference confidence mode to produce per-base-pair (bp) confidence estimates that a site is strictly homozygous reference. The per-bp resolution is maintained while merging the genomic VCFs (gVCFs) for all cells using GATK’s CombineGVCFs tool. Finally, joint genotyping is performed for all cells using GATK’s GenotypeGVCFs tool. Loci found to be non-variant are retained in the final output.

Genotyping parameters are optimized for high sensitivity:

  • A maximum of 2 alternate alleles are reported for each site, 
  • The minimum base quality for variant calling is set to 10, and 
  • The heterozygosity value is set at 0.001.
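
The three GATK steps described above (per-cell haplotyping, CombineGVCFs, GenotypeGVCFs) can be sketched as command lines. The Python below only builds the argument lists; it is an illustration, not the pipeline's actual invocation. It assumes GATK4-style option spellings, which differ from the GATK3 spellings that a pipeline of this period may have used. The file names, the use of BP_RESOLUTION for the reference confidence mode, and the placement of the three parameters on the HaplotypeCaller step are also assumptions.

```python
from typing import List

# Values taken from the parameter list above.
MAX_ALT_ALLELES = 2
MIN_BASE_QUALITY = 10
HETEROZYGOSITY = 0.001


def haplotype_caller_cmd(reference: str, bam: str, gvcf: str) -> List[str]:
    """Step 1: haplotype one cell in reference confidence mode."""
    return [
        "gatk", "HaplotypeCaller",
        "-R", reference, "-I", bam, "-O", gvcf,
        "-ERC", "BP_RESOLUTION",  # assumption: per-bp reference confidence
        "--max-alternate-alleles", str(MAX_ALT_ALLELES),
        "--min-base-quality-score", str(MIN_BASE_QUALITY),
        "--heterozygosity", str(HETEROZYGOSITY),
    ]


def combine_gvcfs_cmd(reference: str, gvcfs: List[str], combined: str) -> List[str]:
    """Step 2: merge the per-cell gVCFs into one multi-sample gVCF."""
    cmd = ["gatk", "CombineGVCFs", "-R", reference]
    for gvcf in gvcfs:
        cmd += ["-V", gvcf]  # one -V per input gVCF
    return cmd + ["-O", combined]


def genotype_gvcfs_cmd(reference: str, combined: str, vcf: str) -> List[str]:
    """Step 3: joint genotyping, keeping non-variant loci in the output."""
    return [
        "gatk", "GenotypeGVCFs",
        "-R", reference, "-V", combined, "-O", vcf,
        "--include-non-variant-sites",
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    cells = ["cell1", "cell2"]
    gvcfs = [f"{c}.g.vcf.gz" for c in cells]
    for cell, gvcf in zip(cells, gvcfs):
        print(" ".join(haplotype_caller_cmd("ref.fa", f"{cell}.bam", gvcf)))
    print(" ".join(combine_gvcfs_cmd("ref.fa", gvcfs, "combined.g.vcf.gz")))
    print(" ".join(genotype_gvcfs_cmd("ref.fa", "combined.g.vcf.gz", "joint.vcf.gz")))
```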