How are bi-allelic and multi-allelic variants represented in Mosaic?


The following steps are taken to represent bi-allelic variants in our data structures.

After genotype calling by GATK, each position is checked for two alternate alleles called across all cells.

    # Example where C1-C4 represent different cells; C1: T/T, C2: T/C, C3: C/C, C4: T/A
    #       REF ALT C1  C2  C3  C4
    #       T   C,A 0/0 0/1 1/1 0/2

If a variant has 2 alternate alleles, a new line will be created for each alternate.

    #       REF ALT C1  C2  C3  C4
    #       T   C   0/0 0/1 1/1 ./0 
    #       T   A   0/0 ./0 ./. 0/1

The INFO field is not split across the new lines. The DP is the same as the original value for both created variants.
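The split shown in the example can be sketched in a few lines of Python. This is only an illustration inferred from the table above, not Mission Bio's implementation; the function names `split_genotype` and `split_site` are hypothetical. The rule assumed here is that a reference allele stays `0`, the alternate described by the line becomes `1`, and any other alternate becomes `.`, with `.` written first.

```python
def split_genotype(gt: str, alt_index: int) -> str:
    """Re-express a genotype such as "0/2" relative to one alternate allele."""
    recoded = []
    for allele in gt.split("/"):
        if allele == "0":
            recoded.append("0")   # reference allele is kept
        elif allele == str(alt_index):
            recoded.append("1")   # the alternate this line describes
        else:
            recoded.append(".")   # a different alternate (or already missing)
    # "." sorts before "0" and "1", which reproduces the "./0" ordering above.
    return "/".join(sorted(recoded))


def split_site(ref, alts, genotypes):
    """Return one (REF, ALT, genotypes) record per alternate allele."""
    return [
        (ref, alt, [split_genotype(gt, i) for gt in genotypes])
        for i, alt in enumerate(alts, start=1)
    ]


# Cells C1-C4 from the example: T/T, T/C, C/C, T/A
records = split_site("T", ["C", "A"], ["0/0", "0/1", "1/1", "0/2"])
for ref, alt, gts in records:
    print(ref, alt, *gts)
```

Running the sketch on the example site prints the same two lines as the table: `T C 0/0 0/1 1/1 ./0` and `T A 0/0 ./0 ./. 0/1`.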

Note that in Mosaic, ./0 is interpreted as NGT = 0 (WT), ./1 as NGT = 1 (HET), and ./. as NGT = 3 (Missing). In this example, cell 2 (genotype T/C) would show as NGT = 1 (HET) for the T>C variant and NGT = 0 (WT) for the T>A variant. Please contact support@missionbio.com with any questions about the interpretation of multi-allelic variants.
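As a rough illustration of this interpretation, the mapping from a split genotype to an NGT value could be written as follows. The function name `genotype_to_ngt` is hypothetical, and the sketch assumes the coding 0 = WT, 1 = HET, 2 = HOM, 3 = Missing, with ./1 treated as heterozygous (NGT = 1), consistent with the cell 2 example. The value 2 for a 1/1 genotype is an assumption that the article does not state.

```python
def genotype_to_ngt(gt: str) -> int:
    """Map a split genotype string to an NGT code (assumed coding, see above)."""
    called = [a for a in gt.split("/") if a != "."]
    if not called:
        return 3                      # ./. -> Missing
    if all(a == "0" for a in called):
        return 0                      # 0/0 or ./0 -> WT
    if len(called) == 2 and all(a == "1" for a in called):
        return 2                      # 1/1 -> HOM (assumption, not stated in the article)
    return 1                          # 0/1 or ./1 -> HET


# Cell 2 (T/C) from the example: 0/1 on the T>C line, ./0 on the T>A line
print(genotype_to_ngt("0/1"), genotype_to_ngt("./0"))
```

For cell 2 this gives NGT = 1 for the T>C variant and NGT = 0 for the T>A variant, matching the note above.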

 

The following steps are taken to represent multi-allelic variants in our data structures. 

After genotype calling by GATK, each position is checked for cells with a 1/2 genotype (see cell C5 in the example below).

    # Example where C1-C6 represent different cells; C1: T/T, C2: T/C, C3: C/C, C4: T/A, C5: C/A, C6: A/A
    #       REF ALT C1  C2  C3  C4  C5  C6
    #       T   C,A 0/0 0/1 1/1 0/2 1/2 2/2

If a cell has a 1/2 genotype, an additional line is created for the combined genotype (C+A in the example below), alongside the line for each alternate.

    #       REF ALT C1  C2  C3  C4  C5  C6
    #       T   C   0/0 0/1 1/1 ./0 ./1 ./.
    #       T   A   0/0 ./0 ./. 0/1 ./1 1/1
    #       *   C+A ./. ./. ./. ./. 0/1 ./.

The INFO field is not split across the new lines. The DP is the same as the original value for all created variants. For multi-allelic variants, the reference is listed as '*' because it is not present in these cells.
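The multi-allelic case can be sketched in the same way. The Python below is an illustration inferred from the example table, not Mission Bio's implementation, and the names `split_genotype` and `split_multiallelic` are hypothetical. It assumes exactly two alternate alleles: each alternate gets its own line, and if any cell is 1/2, one extra line with REF `*` is added in which 1/2 cells are written as 0/1 and all other cells as ./.

```python
def split_genotype(gt: str, alt_index: int) -> str:
    """Re-express a genotype relative to one alternate allele."""
    recoded = []
    for allele in gt.split("/"):
        if allele == "0":
            recoded.append("0")   # reference allele is kept
        elif allele == str(alt_index):
            recoded.append("1")   # the alternate this line describes
        else:
            recoded.append(".")   # a different alternate (or already missing)
    return "/".join(sorted(recoded))  # "." sorts first, giving "./1" etc.


def split_multiallelic(ref, alts, genotypes):
    """One record per alternate, plus a combined '*' record if any cell is 1/2."""
    records = [
        (ref, alt, [split_genotype(gt, i) for gt in genotypes])
        for i, alt in enumerate(alts, start=1)
    ]
    is_compound = [set(gt.split("/")) == {"1", "2"} for gt in genotypes]
    if any(is_compound):
        combined = ["0/1" if hit else "./." for hit in is_compound]
        records.append(("*", "+".join(alts), combined))
    return records


# Cells C1-C6 from the example: T/T, T/C, C/C, T/A, C/A, A/A
site = ["0/0", "0/1", "1/1", "0/2", "1/2", "2/2"]
for ref, alt, gts in split_multiallelic("T", ["C", "A"], site):
    print(ref, alt, *gts)
```

On the example site this reproduces the three lines of the table, including the `* C+A` line in which only C5 carries a called genotype.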
