Why can’t I find my target variant in Mosaic?


Use the following steps to identify why a variant may not be found in your data:

Confirm that target variant is covered by DNA panel

Review coverage quality

Review BAM files

 

Confirm that target variant is covered by DNA panel

Convert target variant into the appropriate genomic coordinate if needed

If the target variant is not available as a genomic coordinate in hg19 (or hg38, or another custom genome), e.g., chr13:28592642, but only in cDNA format (e.g., c.2503G>T) or protein format (e.g., p.D835Y), you need to convert the variant into genomic coordinates.

  • Visit https://bioinformatics.mdanderson.org/transvar/
  • Select a task
    • Reverse Annotation: Protein for variants like p.I254N
    • Reverse Annotation: cDNA for variants like c.761T>A
  • Select a reference genome
    • GRCh37/hg19 for catalog panels
  • Select one or more annotation databases
    • UCSC
  • Enter variant in white box on the right side (follow required format)
    • Examples: FLT3:p.D835Y, TET2:c.3594+5G>A
  • Click the Submit button

  • Review TransVar Results table and copy relevant information (in red)
    • In the coordinates(gDNA/cDNA/protein) column, the genomic coordinate is listed first (e.g., chr13:g.28592642C>A).
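The genomic coordinate copied from the results table can be split into chromosome, position, and alleles with a short script. The following is a minimal sketch, assuming a single-nucleotide substitution written as chr13:g.28592642C>A and, possibly, followed by slash-separated cDNA and protein fields. The function name parse_gdna is hypothetical, and indels or other notations are not handled.

```python
import re

# Matches simple genomic substitutions such as "chr13:g.28592642C>A".
# Assumption: only single-nucleotide substitutions are handled here.
_GDNA_SNV = re.compile(r"^(chr[0-9A-Za-z_]+):g\.(\d+)([ACGT])>([ACGT])$")


def parse_gdna(text):
    """Return (chrom, pos, ref, alt) for a gDNA substitution string."""
    # Remove whitespace and zero-width spaces that copy/paste may introduce.
    cleaned = text.strip().replace("\u200b", "")
    # Assumption: if cDNA/protein fields follow, they are separated by "/".
    gdna = cleaned.split("/")[0]
    match = _GDNA_SNV.match(gdna)
    if match is None:
        raise ValueError(f"Not a simple gDNA substitution: {text!r}")
    chrom, pos, ref, alt = match.groups()
    return chrom, int(pos), ref, alt


print(parse_gdna("chr13:g.28592642C>A"))
# prints ('chr13', 28592642, 'C', 'A')
```

The position returned here (28592642) is the value to search for in the panel files.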

Cross-reference target variant coordinate against panel file

  • Locate the insert.bed file and the amplicon.bed file of the catalog or custom panel. These may be located in a subfolder.

NOTE: The coordinates in the amplicon.bed file include both the amplicon primer regions and the insert regions. The coordinates in the insert.bed file exclude the amplicon primer regions and contain only the insert regions.

  • Open the amplicon.bed file using a text editor (e.g., Sublime Text)
  • Using the Find function (Command+F on macOS), search for the amplicon that may harbor the target variant
  • Iteratively remove the last digit of the coordinate and search again until the coordinate overlaps partially with one value in the file

NOTE: If you remove 4 digits and still do not find a matching value, the coordinate is not covered by the panel
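The manual search above can also be done programmatically. The following is a minimal sketch, assuming the panel files follow the standard BED convention (tab-separated chromosome, 0-based start, exclusive end, optional name in the fourth column) and use the same chromosome naming as the target coordinate. The function name find_amplicons and the example rows are hypothetical.

```python
def find_amplicons(bed_lines, chrom, pos):
    """Return (name, start, end) for BED rows that contain a 1-based position."""
    hits = []
    for line in bed_lines:
        fields = line.split()
        # Skip blank lines, comments, and track/browser header lines.
        if len(fields) < 3 or fields[0].startswith(("#", "track", "browser")):
            continue
        start, end = int(fields[1]), int(fields[2])
        # Assumption: standard BED intervals (0-based start, exclusive end).
        if fields[0] == chrom and start < pos <= end:
            name = fields[3] if len(fields) > 3 else ""
            hits.append((name, start, end))
    return hits


# Made-up rows for illustration only; real files ship with the panel.
example_bed = [
    "chr13\t28592500\t28592750\tAMPL_EXAMPLE_1",
    "chr13\t28608000\t28608250\tAMPL_EXAMPLE_2",
]
print(find_amplicons(example_bed, "chr13", 28592642))
```

To check a real panel, pass an open file handle (for example, open("amplicon.bed")) in place of example_bed. An empty result means no amplicon in the file spans the coordinate.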


  • If the target coordinate is detectable, ensure that the coordinate is covered by Read1 and/or Read2 (assuming data was generated using the recommended PE 150 sequencing chemistry):
    • Add 100 nt to the first-column coordinate to account for Read1 sequencing data → If the target coordinate is within the first 100 nt after the first-column coordinate, continue to the next bullet point
    • Subtract 150 nt from the second-column coordinate to account for Read2 sequencing data → If the target coordinate is within the last 150 nt before the second-column coordinate, continue to the next bullet point
    • If the target coordinate is not covered by either read, it is located in the ‘gap’ region between both reads (a gap region is present in amplicons longer than 250 bp)
  • Open the insert.bed file and find the same amplicon. Confirm that the target coordinate is covered and not located inside the primer region (the difference between the insert and the amplicon).
  • Covered → proceed to Step 2 - Review coverage quality below.
  • Not covered → If the target is located in the gap region, it may be recovered by performing 2 x 250 bp sequencing. If the target coordinate is located inside the amplicon primer sequence, it cannot be recovered.

NOTE: If your target coordinate resides in the first nucleotide of either read or within the last 5 - 10 nucleotides of either read, the quality of the data may be impacted. Contact support@missionbio.com for additional information.
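The Read1, Read2, gap, and primer checks above amount to a few comparisons. The following is a minimal sketch of that arithmetic, assuming standard BED intervals (0-based start, exclusive end) and the 100 nt and 150 nt read spans given in the steps. The function name classify_position and the example coordinates are hypothetical, and positions close to a boundary should still be checked by hand.

```python
def classify_position(pos, amp_start, amp_end, ins_start, ins_end,
                      read1_len=100, read2_len=150):
    """Classify a 1-based coordinate relative to one amplicon.

    amp_start/amp_end are the first- and second-column values of the
    amplicon in amplicon.bed; ins_start/ins_end are the matching values
    from insert.bed.
    """
    if not (amp_start < pos <= amp_end):
        return "outside amplicon"
    in_read1 = pos <= amp_start + read1_len   # first 100 nt of the amplicon
    in_read2 = pos > amp_end - read2_len      # last 150 nt of the amplicon
    if not (in_read1 or in_read2):
        return "gap"
    if not (ins_start < pos <= ins_end):
        return "primer"
    return "covered"


# Made-up 300 bp amplicon (1000-1300) with a 20 nt primer on each side.
for pos in (1050, 1125, 1010, 1200, 2000):
    print(pos, classify_position(pos, 1000, 1300, 1020, 1280))
```

In this made-up example, 1050 and 1200 are covered by Read1 and Read2 respectively, 1125 falls in the gap between the reads, 1010 lies in a primer region, and 2000 is outside the amplicon.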

Check if the variant is blacklisted

  • In the panel file root directory, open the systematic_variants.blacklist file. 
    • This file may not be present in custom panels
    • For White Glove panels, ensure you are checking the Analysis zip file for the blacklist
  • Ensure your variant is not listed here
    • If your variant is listed, it may be recovered by removing it from the blacklist and reprocessing the data on the pipeline with the modified panel file
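For long blacklist files, a scripted search can be quicker than reading the file by eye. The layout of the blacklist file is not described here, so the following minimal sketch makes no assumption about it beyond plain text with one entry per line: it simply reports lines that mention both the chromosome and the exact position. The function name blacklist_hits and the example lines are hypothetical, and any hit should be reviewed manually.

```python
import re


def blacklist_hits(lines, chrom, pos):
    """Return lines that mention the chromosome and the exact position."""
    # Digit guards stop 28592642 from matching inside 128592642.
    pos_pattern = re.compile(rf"(?<!\d){pos}(?!\d)")
    # Accept the chromosome with or without the "chr" prefix.
    bare = chrom[3:] if chrom.startswith("chr") else chrom
    chrom_pattern = re.compile(
        rf"(?<![0-9A-Za-z])(?:chr)?{re.escape(bare)}(?![0-9A-Za-z])"
    )
    return [
        line.rstrip("\n")
        for line in lines
        if pos_pattern.search(line) and chrom_pattern.search(line)
    ]


# Made-up lines for illustration; the real file layout may differ.
example_lines = [
    "chr13:28592642:C/A\n",
    "chr13:128592642:G/T\n",
    "chr4:28592642:C/A\n",
]
print(blacklist_hits(example_lines, "chr13", 28592642))
```

To check a real panel, pass an open file handle for the systematic_variants.blacklist file in place of example_lines.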

Review coverage quality

Ensure that recommended average coverage is being met

  • In your Tapestri Pipeline account, assess the number of reads/cell/amplicon
    • If coverage is below 30x, the sample is under-sequenced, which may explain difficulty detecting target variants

NOTE: Default filter settings may be inadequate if sample(s) are under/over-sequenced. 

Review target-specific amplicon variant coverage

  • Download the barcode.cell.coverage.tsv file from Tapestri Pipeline in the Output Files tab of the run.
  • Identify amplicon that harbors target variant
    • The amplicon name can be taken from the amplicon.bed file used in the earlier steps
  • If “Mean Reads” is < 10, the default filter settings will likely discard the variant due to “Read Depth/DP” (default = 10) and/or due to “% of variants across cells/percent mutated cells” (default = 50)
  • Decreasing both filter threshold values in Mosaic may recover the target variant (e.g., “Read Depth” from 10 to 4, and “% of variants across cells” from 50% to 20%). Alternatively, create a whitelist
    • If viewing only whitelist variants, all filters may be dropped to their lowest settings to attempt to recover more cells
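The per-amplicon coverage check can be scripted once the coverage table is downloaded. The following is a minimal sketch, assuming the file is tab-separated with a header row, one row per cell barcode, and one column of read counts per amplicon. That layout is an assumption and should be confirmed against the actual file. The function name amplicon_read_summary, the barcodes, and the amplicon names are hypothetical.

```python
import csv
import io
from statistics import mean


def amplicon_read_summary(handle, amplicon, min_depth=10):
    """Return (mean reads per cell, fraction of cells at or above min_depth)."""
    reader = csv.DictReader(handle, delimiter="\t")
    if amplicon not in (reader.fieldnames or []):
        raise KeyError(f"Amplicon column not found: {amplicon}")
    counts = [float(row[amplicon]) for row in reader]
    if not counts:
        raise ValueError("No cell rows found in the coverage table")
    passing = sum(1 for c in counts if c >= min_depth)
    return mean(counts), passing / len(counts)


# Made-up table for illustration; the real file layout may differ.
example_tsv = io.StringIO(
    "cell_barcode\tAMPL_EXAMPLE_1\tAMPL_EXAMPLE_2\n"
    "AAAC-1\t12\t40\n"
    "AAAG-1\t3\t35\n"
    "AACT-1\t8\t51\n"
    "AAGT-1\t20\t44\n"
)
avg, frac = amplicon_read_summary(example_tsv, "AMPL_EXAMPLE_1")
print(f"mean reads: {avg:.2f}, cells with >= 10 reads: {frac:.0%}")
```

Here min_depth mirrors the default “Read Depth” filter of 10. Re-running with a lower value (for example, 4) shows how many additional cells a relaxed filter would retain for that amplicon.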

Review BAM files

  • Download cells.bam and cells.bam.csi (or .bai) files from your Tapestri Pipeline account
  • Download and install the Integrative Genomics Viewer (IGV) software
  • Load the cells.bam file (File > Load from File); the index file is loaded automatically if it is in the same folder as the BAM file
  • Enter target variant in search field (e.g., chr13:7577520)
  • Wait for tracks to load and zoom out to evaluate Read1 and Read2 reads
  • Evaluate coverage at the desired location to determine if adequate
