The cells are genotyped with the Genome Analysis Toolkit (GATK; McKenna, Hanna et al., 2010) using a per-cell variant calling approach.
Each cell is first haplotyped in reference confidence mode (GVCF), which provides confidence estimates that an interval is strictly homozygous for the reference allele. Each cell is then genotyped individually, so that its variants are called independently of all other cells. Because alternative alleles are counted within a single cell rather than pooled across cells, this ensures that we do not hit GATK’s upper limit of 6 alternative alleles per genomic coordinate.
Genotyping parameters are chosen to improve the resolution of variants despite the randomness introduced by CRISPR editing:
- The minimum base quality for variant calling is set to 10 (Phred-scaled)
- The maximum assembly region size for local reassembly is set to 300 bp
- The assembly region padding is set to 100 bp
- The heterozygosity value is set to 0.001
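The two-step, per-cell procedure and the parameters listed above can be sketched as follows. This is a minimal illustration and not the exact pipeline used here: it assumes GATK4-style tool and flag names (HaplotypeCaller, GenotypeGVCFs, `--emit-ref-confidence`, `--min-base-quality-score`, `--max-assembly-region-size`, `--assembly-region-padding`, `--heterozygosity`), assumes that all four parameters are passed at the haplotyping step, and uses hypothetical file names and helper functions. The commands are only assembled, not executed.

```python
import shlex
from typing import Dict, List, Tuple


def haplotypecaller_cmd(reference: str, bam: str, out_gvcf: str) -> List[str]:
    """Step 1: haplotype one cell in reference confidence (GVCF) mode."""
    return [
        "gatk", "HaplotypeCaller",
        "-R", reference,
        "-I", bam,
        "-O", out_gvcf,
        "--emit-ref-confidence", "GVCF",
        # Parameter values taken from the list above; placing them on this
        # step is an assumption of the sketch.
        "--min-base-quality-score", "10",
        "--max-assembly-region-size", "300",
        "--assembly-region-padding", "100",
        "--heterozygosity", "0.001",
    ]


def genotype_gvcfs_cmd(reference: str, gvcf: str, out_vcf: str) -> List[str]:
    """Step 2: genotype a single cell's GVCF on its own (no joint calling)."""
    return ["gatk", "GenotypeGVCFs", "-R", reference, "-V", gvcf, "-O", out_vcf]


def per_cell_commands(
    reference: str, cell_bams: Dict[str, str]
) -> List[Tuple[List[str], List[str]]]:
    """Build one (haplotyping, genotyping) command pair per cell."""
    pairs = []
    for cell_id, bam in cell_bams.items():
        gvcf = f"{cell_id}.g.vcf.gz"  # hypothetical output naming
        vcf = f"{cell_id}.vcf.gz"     # hypothetical output naming
        pairs.append(
            (
                haplotypecaller_cmd(reference, bam, gvcf),
                genotype_gvcfs_cmd(reference, gvcf, vcf),
            )
        )
    return pairs


# Hypothetical inputs, for illustration only.
example = per_cell_commands(
    "reference.fa", {"cell_001": "cell_001.bam", "cell_002": "cell_002.bam"}
)
for hc, gt in example:
    print(shlex.join(hc))
    print(shlex.join(gt))
```

Each cell contributes exactly one GVCF to its own genotyping command, so the number of alternative alleles considered at a site depends only on that cell.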