Use the following steps to help identify why a variant may not be found in your data.
Confirm that the target variant is covered by the DNA panel
Convert target variant into the appropriate genomic coordinate if needed
If the target variant is not available as an hg19 (or hg38, or other custom genome) coordinate (e.g., chr13:28592642) but only in cDNA format (e.g., c.2503G>T) or protein format (e.g., p.D835Y), you need to convert the variant into genomic coordinates.
- Visit https://bioinformatics.mdanderson.org/transvar/
- Select a task
- Reverse Annotation: Protein for variants like p.I254N
- Reverse Annotation: cDNA for variants like c.761T>A
- Select a reference genome
- GRCh37/hg19 for catalog panels
- Select one or more annotation databases
- Enter variant in white box on the right side (follow required format)
- Examples: FLT3:p.D835Y, TET2:c.3594+5G>A
- Click Submit button
- Review TransVar Results table and copy relevant information (in red)
- In the coordinates(gDNA/cDNA/protein) column, the genomic coordinate is listed in gDNA form, e.g., chr13:g.28592642C>A.
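If you need to convert several variants, the gDNA value copied from the results table can be split into chromosome and position programmatically. The following is a minimal Python sketch, not part of TransVar. The function name parse_gdna is hypothetical, it only handles simple substitutions, and it assumes that a full column value separates the gDNA, cDNA, and protein fields with "/" as the column name suggests.

```python
import re

# Matches simple substitutions such as "chr13:g.28592642C>A".
# Insertions, deletions, and ranges are intentionally not handled here.
GDNA_SNV = re.compile(r"^(chr[0-9A-Za-z_]+):g\.(\d+)([ACGT])>([ACGT])$")


def parse_gdna(value):
    """Return (chrom, pos, ref, alt) from a gDNA substitution string.

    Assumption: if the full column value is pasted in, the gDNA part is
    the first "/"-separated field.
    """
    gdna = value.strip().split("/")[0]
    match = GDNA_SNV.match(gdna)
    if match is None:
        raise ValueError(f"Not a simple gDNA substitution: {value!r}")
    chrom, pos, ref, alt = match.groups()
    return chrom, int(pos), ref, alt


chrom, pos, ref, alt = parse_gdna("chr13:g.28592642C>A")
print(f"{chrom}:{pos}")  # chr13:28592642
```

The resulting chr13:28592642 style coordinate is the form used when cross-referencing against the panel files.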
Cross-reference target variant coordinate against panel file
- Locate the insert.bed file and the amplicon.bed file of the catalog or custom panel; these may be located in a subfolder
NOTE: The coordinates in the amplicon.bed file include the amplicon primer regions and insert regions. The coordinates in the insert.bed file exclude the amplicon primer regions and contain only the insert regions.
- Open the amplicon.bed file using a text editor (e.g., Sublime Text)
- Using the Find function (Command+F on macOS), search for the amplicon that may harbor the target variant
- Iteratively remove the last digit of the coordinate and search again until the truncated coordinate partially matches a value in the file
NOTE: If you remove 4 digits and still do not find a matching value, the coordinate is not covered by the panel
- If the target coordinate is detectable, ensure that the coordinate is covered by Read1 and/or Read2 (assuming data was generated using the recommended PE 150 sequencing chemistry):
- Add 100 nt to the first-column coordinate to account for Read1 sequencing data → If the target coordinate is within the first 100 nt after the first-column coordinate, continue to next bullet point
- Subtract 150 nt from the second-column coordinate to account for Read2 sequencing data → If the target coordinate is within the first 150 nt before the second-column coordinate, continue to next bullet point
- If the target coordinate is not covered by either read, it is located in the 'gap' region between the two reads (a gap region is present in amplicons longer than 250 bp)
- Open the insert.bed file and find the same amplicon. Confirm that the target coordinate is covered and not located inside the primer region (the difference between the insert and the amplicon).
- Covered → proceed to Step 2 - Review coverage quality below.
- Not covered → If the target is located in the gap region, it may be recovered by performing 2 x 250 bp sequencing. If the target coordinate is located inside the amplicon primer sequence it cannot be recovered.
NOTE: If your target coordinate is the first nucleotide of either read or falls within the last 5-10 nucleotides of either read, the quality of the data may be impacted. Contact email@example.com for additional information.
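The manual cross-reference above can also be sketched in Python. This is an illustrative sketch under stated assumptions, not a Tapestri tool: the function names are hypothetical, the BED files are assumed to follow the standard BED convention (0-based start, end-exclusive, optional name in the fourth column), the target is a 1-based position as reported by TransVar, and the read lengths are the 100 nt and 150 nt values used in the steps above. For simplicity it uses the first amplicon and the first insert that contain the position, so overlapping amplicons would need extra handling. The demo coordinates are made up.

```python
def read_bed(path):
    """Read (chrom, start, end, name) tuples from a BED file."""
    rows = []
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            # Skip blank lines, comments, and track/browser header lines.
            if len(fields) < 3 or fields[0].startswith(("#", "track", "browser")):
                continue
            name = fields[3] if len(fields) > 3 else ""
            rows.append((fields[0], int(fields[1]), int(fields[2]), name))
    return rows


def find_interval(rows, chrom, pos):
    """Return the first interval containing the 1-based position, or None."""
    for row in rows:
        # BED start is 0-based, so a 1-based pos is inside when start < pos <= end.
        if row[0] == chrom and row[1] < pos <= row[2]:
            return row
    return None


def classify(amplicons, inserts, chrom, pos, read1_len=100, read2_len=150):
    """Classify a target position against amplicon and insert intervals."""
    amplicon = find_interval(amplicons, chrom, pos)
    if amplicon is None:
        return "not covered by panel"
    if find_interval(inserts, chrom, pos) is None:
        return "primer region"
    _, start, end, _ = amplicon
    in_read1 = pos <= start + read1_len  # first 100 nt after the start column
    in_read2 = pos > end - read2_len     # last 150 nt before the end column
    return "covered" if (in_read1 or in_read2) else "gap"


# Hypothetical intervals for illustration only; real values come from the panel files.
demo_amplicons = [("chr13", 28592550, 28592790, "AMP_EXAMPLE")]
demo_inserts = [("chr13", 28592570, 28592770, "AMP_EXAMPLE")]
print(classify(demo_amplicons, demo_inserts, "chr13", 28592642))  # covered
```

With real panel files, the same check would be classify(read_bed("amplicon.bed"), read_bed("insert.bed"), "chr13", 28592642). A "gap" result corresponds to the case that may be recovered with 2 x 250 bp sequencing, and "primer region" to the case that cannot be recovered.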
Check if the variant is blacklisted
In the panel file root directory, open the systematic_variants.blacklist file.
- This file may not be present in custom panels
- For White Glove panels, ensure you are checking the Analysis zip file for the blacklist
Ensure your variant is not listed here
- If your variant is listed, it may be recovered by removing it from the blacklist and reprocessing the data on the pipeline with the modified panel file
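For a quick check, the blacklist can be scanned for the target coordinate with a few lines of Python. The layout of the systematic_variants.blacklist file is not described here, so this sketch makes a loud assumption: the file is plain text and each entry carries the chromosome and the position as separate tokens (for example, delimited by tabs, colons, or slashes). The function name blacklist_hits and the example entries are hypothetical. If no hits are returned, confirm by opening the file manually.

```python
import re


def blacklist_hits(lines, chrom, pos):
    """Return lines that mention both the chromosome and the position.

    Assumption: chromosome and position appear as separate tokens on a
    line; the real file layout may differ.
    """
    hits = []
    for line in lines:
        tokens = re.split(r"[^0-9A-Za-z_]+", line.strip())
        if chrom in tokens and str(pos) in tokens:
            hits.append(line.rstrip("\n"))
    return hits


# Hypothetical entries for illustration only.
example_lines = ["chr13:28592642:C/A\n", "chr4:106190862:T/C\n"]
print(blacklist_hits(example_lines, "chr13", 28592642))
```

To scan the real file, pass an open file handle instead of the example list, e.g. blacklist_hits(open("systematic_variants.blacklist"), "chr13", 28592642).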
Review coverage quality
Ensure that recommended average coverage is being met
- In your Tapestri Pipeline account, assess the number of reads/cell/amplicon
- If coverage is below 30x, the sample is under-sequenced, which may explain difficulty detecting target variants
NOTE: Tapestri Insights default filter settings may be inadequate if sample(s) are under/over-sequenced.
Review target-specific amplicon variant coverage
- Download the barcode.cell.coverage.tsv file from Tapestri Pipeline in the Output Files tab of the run.
- Identify amplicon that harbors target variant
- Amplicon name can be extracted from earlier steps in the amplicon.bed file
- If “Mean Reads” is < 10, the default filter settings will likely discard the variant due to “Read Depth/DP” (default = 10) and/or due to “% of variants across cells/percent mutated cells” (default = 50)
- Decreasing both filter threshold values in Tapestri Insights/Mosaic may recover the target variant (e.g., “Read Depth” from 10 to 4, and “% of variants across cells” from 50% to 20%); alternatively, create a whitelist
- If viewing only whitelist variants, all filters may be dropped to their lowest settings to attempt to recover more cells
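The per-amplicon review can be sketched in Python as follows. This is a rough proxy, not the Tapestri Insights filter implementation. It assumes that barcode.cell.coverage.tsv is tab-separated with one row per cell and one column per amplicon, with amplicon names in the header row. The function name amplicon_read_stats, the column names, and the demo values are hypothetical. It reports the mean reads per cell for one amplicon and the percentage of cells at or above a read-depth threshold (10 by default, matching the default "Read Depth" filter).

```python
import csv
import io


def amplicon_read_stats(handle, amplicon, min_depth=10):
    """Return (mean reads per cell, % of cells with depth >= min_depth).

    Assumption: rows are cells, columns are amplicons, values are read counts.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    depths = [int(row[amplicon]) for row in reader]
    if not depths:
        raise ValueError("No cells found in coverage table")
    passing = sum(1 for depth in depths if depth >= min_depth)
    return sum(depths) / len(depths), 100.0 * passing / len(depths)


# Hypothetical three-cell table for illustration only.
demo = io.StringIO(
    "cell_barcode\tAMP_A\tAMP_B\n"
    "CELL1\t12\t3\n"
    "CELL2\t8\t0\n"
    "CELL3\t10\t5\n"
)
mean_reads, pct_cells = amplicon_read_stats(demo, "AMP_B")
print(f"mean reads: {mean_reads:.1f}, cells >= 10 reads: {pct_cells:.0f}%")
```

A low mean or a low percentage of passing cells for the amplicon that harbors the target suggests that the default filters are removing the variant and that lowering them, or using a whitelist, is worth trying.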
Review BAM files
- Download cells.bam and cells.bam.csi (or .bai) files from your Tapestri Pipeline account
- Download and install the Integrative Genomics Viewer (IGV) software
- Load the cells.bam file (File > Load from File); the index file (.csi or .bai) is loaded automatically if it is in the same folder
- Enter target variant in search field (e.g., chr13:7577520)
- Wait for tracks to load and zoom out to evaluate Read1 and Read2 reads
- Evaluate coverage at the desired location to determine if adequate