- Updated VCF Processing steps - The variant decomposition and normalization steps of the VCF processing workflow now use bcftools commands. BCFtools, along with other improvements, has improved the speed of the pipeline by ~50% compared to the previous version without affecting the results.
- Improved demultiplexing output - Minor tweaks were made to both the genotype and antibody demultiplexing algorithms to improve output quality. The genotype demultiplexing algorithm was updated to find additional germline variants based on annotations and to filter out mis-assigned and low-quality cells. The antibody hashing algorithm was enhanced to better process the hashing antibody signals, thereby increasing cell recovery by a small margin.
- Various resource optimizations to reduce failure rate - This version improves both space and memory utilization for the runs.
- Space utilization of various steps was optimized to prevent failures. For example, VCF files are now sorted and zipped, SAM files created during variant calling are zipped, and merged fastq files created for multi-lane runs are removed, along with many other such fixes that allow the run to complete with minimal disk utilization.
- Runs are now processed on 256GB memory instances, which reduces the failures seen earlier.
- Multiple improvements in output files - Output files such as the HTML report and the QC log file were updated to provide user-friendly metrics and messages. Additionally, loom file creation was disabled.
- Change in number of filtered variants in the report/h5 - Spanning deletions that the variant caller creates as a result of upstream indel events are now removed from the VCF file. Because two filters are weighted by cell completeness, this update changes the total number of filtered variants, but it does not affect the true positives.
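
The spanning-deletion change described above can be illustrated with a minimal sketch. It is not the pipeline's actual implementation. It assumes that spanning deletions appear in the VCF as records whose ALT allele is `*` (the convention in VCF 4.2 and later for an allele missing due to an upstream deletion), and that the file has already been decomposed so that most records carry a single ALT allele. The function name `drop_spanning_deletions` and the example records are hypothetical.

```python
from typing import Iterable, Iterator


def drop_spanning_deletions(vcf_lines: Iterable[str]) -> Iterator[str]:
    """Yield VCF lines, skipping records whose ALT alleles are all '*'.

    Assumption: '*' in the ALT column marks a spanning deletion.
    """
    for line in vcf_lines:
        if line.startswith("#"):
            yield line  # header lines pass through unchanged
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            yield line  # not a complete record; leave it untouched
            continue
        # Column 5 (index 4) is ALT; multiple alleles are comma-separated.
        alt_alleles = fields[4].split(",")
        if all(allele == "*" for allele in alt_alleles):
            continue  # record only describes a spanning deletion
        yield line


# Hypothetical input: one upstream deletion, its '*' placeholder, one SNV.
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t100\t.\tATG\tA\t50\tPASS\t.\n",
    "chr1\t101\t.\tT\t*\t50\tPASS\t.\n",
    "chr1\t102\t.\tG\tC\t50\tPASS\t.\n",
]
kept = list(drop_spanning_deletions(example))
```

In this sketch the record at position 101 is dropped while the deletion at position 100 and the SNV at position 102 are kept, which mirrors why the total variant count changes without losing true positives.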
Tapestri DNA, DNA + Protein Pipeline v3.6 25 October 2024
- Updated