Run Report Metrics

The Tapestri Pipeline Run Report contains three tabs: Basics, Advanced, and Diagnostics. The documentation below explains each section and describes the metrics reported under each header. Some items pertain only to DNA + Protein runs.

Basics tab

Sample

Sequencing (Protein)

Antibody counting

Panel uniformity plot

Sequencing (DNA)

Mapping

Cell calling

Variant calling

Advanced tab

Panel performance

Low-performing amplicons

Amplicons with > 2* standard deviation Gini score

Amplicons with R1R2 imbalance

ADO calculation summary

Amplicon Gini score distribution plot

Diagnostics tab

R1R2 imbalance plot

Balance R1R2 barplot

Reads log-log plot

DNA 1x vs 10x coverage plot

DNA vs Protein reads scatter plot (protein only)

Antibody cell distribution plot (protein only)

Antibody read distribution plot (protein only)

Basics tab

Sample

Run ID

Analyte

DNA panel name

DNA panel size

Reference genome

Chemistry version

DNA pipeline version

Protein panel name (protein only)

Protein panel size (protein only)

Protein pipeline version (protein only)

Date analyzed

Sequencing (Protein)

# total read pairs

This is the number of total read pairs in the fastq file for protein.

Read quality (Q30)

This is the percentage of bases (in both R1 and R2 reads) that have a sequencing quality higher than 30.
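As an illustration of the Q30 definition above, the following minimal sketch pools the quality strings of R1 and R2 and counts the bases whose Phred score is higher than 30. The function name `q30_percentage` and the Phred+33 ASCII offset are assumptions made for this example; they are not taken from the Tapestri pipeline.

```python
def q30_percentage(quality_strings, phred_offset=33, threshold=30):
    """Percentage of bases whose Phred quality is strictly above `threshold`.

    Assumes Phred+33 encoded FASTQ quality lines (an assumption of this sketch).
    """
    total = 0
    passing = 0
    for qual in quality_strings:
        for ch in qual:
            total += 1
            # Convert the ASCII character to a Phred score.
            if ord(ch) - phred_offset > threshold:
                passing += 1
    return 100.0 * passing / total if total else 0.0


# Hypothetical quality lines; 'I' is Q40, '?' is Q30, '#' is Q2.
r1_quals = ["IIII#", "IIII?"]
r2_quals = ["II##?"]

# R1 and R2 are pooled because the metric covers bases in both reads.
q30 = q30_percentage(r1_quals + r2_quals)
print(f"Q30: {q30:.1f}%")
```

Note that this sketch follows the wording "higher than 30" literally, so a base with a quality of exactly 30 does not count.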

% read pairs trimmed

This is the percentage of read pairs that were trimmed at the Cutadapt step (denominator: total read pairs).

% read pairs after cell barcode processing

This is the percentage of read pairs that passed the cell barcode extraction and detection step, i.e., they had a valid cell barcode structure (denominator: total read pairs).

% read pairs after antibody barcode processing

This is the percentage of read pairs that passed the antibody barcode detection step, i.e., they had a valid antibody barcode structure in them.

# total barcodes

This is the number of total barcodes that were found by the protein pipeline, i.e., the number of barcodes with more than 1 read.

# candidate barcodes

This is the number of barcodes with more than 10 reads.

% read pairs after candidate barcode filtering

This is the percentage of read pairs that are present in the candidate barcodes.
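The barcode thresholds described in this section can be sketched as follows. This is a minimal example, not the pipeline's implementation: the function name `barcode_summary` and the input layout (one barcode string per read pair) are hypothetical, and using the total read pairs as the denominator of the last metric is an assumption, since the denominator is not stated for it.

```python
from collections import Counter


def barcode_summary(read_barcodes, total_read_pairs, candidate_min_reads=10):
    """Summarize barcodes from a list holding one barcode per read pair."""
    reads_per_barcode = Counter(read_barcodes)

    # Total barcodes: barcodes with more than 1 read.
    total_barcodes = sum(1 for n in reads_per_barcode.values() if n > 1)

    # Candidate barcodes: barcodes with more than 10 reads.
    candidates = {
        bc: n for bc, n in reads_per_barcode.items() if n > candidate_min_reads
    }

    # Assumption: the denominator is the total read pairs in the fastq file.
    reads_in_candidates = sum(candidates.values())
    pct = 100.0 * reads_in_candidates / total_read_pairs if total_read_pairs else 0.0

    return {
        "total_barcodes": total_barcodes,
        "candidate_barcodes": len(candidates),
        "pct_read_pairs_in_candidates": pct,
    }


# Hypothetical data: 16 read pairs carry a barcode, 20 read pairs in total.
example_barcodes = ["AAA"] * 12 + ["CCC"] * 3 + ["GGG"]
summary = barcode_summary(example_barcodes, total_read_pairs=20)
print(summary)
```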

Antibody counting

Mean reads/cell/antibody

This is the mean reads per DNA-called cell divided by the number of antibodies. The result is always rounded down to the nearest integer (the floor of the quotient).

# antibodies detected

This is the number of antibodies that have at least 1 read in at least 1 cell.

Median antibodies/cell

This is the median of the number of antibodies that have at least 1 read in a cell.
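The three antibody counting metrics can be illustrated with a small sketch. The nested dictionary layout, the function name `antibody_counting_metrics`, and the choice to divide by the protein panel size are assumptions made for this example.

```python
import math
import statistics


def antibody_counting_metrics(counts, n_antibodies):
    """counts maps cell barcode -> {antibody name: reads} for DNA-called cells.

    n_antibodies is assumed here to be the protein panel size.
    """
    n_cells = len(counts)
    total_reads = sum(sum(per_ab.values()) for per_ab in counts.values())

    # Mean reads per cell, divided by the number of antibodies, then floored.
    mean_reads_per_cell = total_reads / n_cells
    mean_reads_cell_antibody = math.floor(mean_reads_per_cell / n_antibodies)

    # Antibodies with at least 1 read in at least 1 cell.
    detected = {
        name
        for per_ab in counts.values()
        for name, reads in per_ab.items()
        if reads >= 1
    }

    # Per cell, the number of antibodies with at least 1 read.
    antibodies_per_cell = [
        sum(1 for reads in per_ab.values() if reads >= 1)
        for per_ab in counts.values()
    ]

    return {
        "mean_reads_cell_antibody": mean_reads_cell_antibody,
        "antibodies_detected": len(detected),
        "median_antibodies_per_cell": statistics.median(antibodies_per_cell),
    }


# Hypothetical counts for three cells and a three-antibody panel.
example_counts = {
    "cell1": {"CD3": 10, "CD4": 0, "CD8": 5},
    "cell2": {"CD3": 7, "CD4": 0, "CD8": 0},
    "cell3": {"CD3": 4, "CD4": 0, "CD8": 9},
}
ab_metrics = antibody_counting_metrics(example_counts, n_antibodies=3)
print(ab_metrics)
```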

Panel uniformity plot

This is a boxplot where each box corresponds to an amplicon. The values for each box are the read percentages for that amplicon in all cells. The amplicons are sorted by their mean reads, with the highest at the top and lowest at the bottom. This plot shows how the performance of an amplicon varies throughout the called cells.

Sequencing (DNA)

# total read pairs

This is the number of total read pairs in the input fastq file. For multiple lanes, this would be the sum of all lanes.

Read quality (Q30)

This is the percentage of bases (in both R1 and R2 reads) that have a sequencing quality higher than 30.

% read pairs trimmed

This is the percentage of read pairs that were trimmed at the Cutadapt step (denominator: total read pairs).

% read pairs with valid barcodes

This is the percentage of read pairs that passed the collapse barcode step of the DNA pipeline, i.e., they had a valid barcode structure.

# total barcodes

This is the total number of barcodes identified in this sample, i.e., the number of barcodes that have 1 or more reads.

Mapping

% reads mapped to genome

This is the percentage of read pairs that successfully mapped to the genome.

% reads mapped to target

This is the percentage of read pairs that mapped to the insert coordinates of the amplicons in this panel. A read counts as mapped to an insert when the read and the insert coordinates overlap by more than 1 base pair.
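The overlap rule for "mapped to target" can be sketched as an interval check. The half-open coordinate convention and the function names `overlap_bp` and `maps_to_target` are assumptions for illustration only; the pipeline's own coordinate handling may differ.

```python
def overlap_bp(read_start, read_end, insert_start, insert_end):
    """Overlap length in base pairs of two half-open intervals [start, end)."""
    return max(0, min(read_end, insert_end) - max(read_start, insert_start))


def maps_to_target(read_start, read_end, insert_start, insert_end, min_overlap=1):
    """True when the read overlaps the insert by more than `min_overlap` bp."""
    return overlap_bp(read_start, read_end, insert_start, insert_end) > min_overlap


# Hypothetical coordinates on the same chromosome.
print(maps_to_target(100, 150, 140, 300))  # 10 bp overlap
print(maps_to_target(100, 150, 149, 300))  # 1 bp overlap
print(maps_to_target(100, 150, 200, 300))  # no overlap
```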

Cell calling

# cells

This is the number of cells called by the cellfinder module.

Panel uniformity

This is the number of amplicons whose mean reads are above 0.2 * the mean reads per cell per amplicon.

Mean reads/cell

This is the mean reads per called cell. It is the floor of the decimal obtained by dividing the total reads to cells by the number of cells.

Mean reads/cell/amplicon

This is the floor of the mean reads per cell divided by the number of amplicons in the panel.
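The panel uniformity and mean-read metrics above can be worked through on a toy read matrix. This is a sketch under stated assumptions: the input layout and the function name `cell_calling_metrics` are hypothetical, and the uniformity cutoff here uses the unrounded mean, which the text does not specify.

```python
import math


def cell_calling_metrics(reads_by_cell, panel):
    """reads_by_cell maps called cell -> {amplicon: reads}; panel lists amplicons."""
    n_cells = len(reads_by_cell)
    total_reads = sum(
        cell.get(amp, 0) for cell in reads_by_cell.values() for amp in panel
    )

    mean_per_cell = total_reads / n_cells
    mean_per_cell_per_amplicon = mean_per_cell / len(panel)

    # Mean reads of each amplicon across the called cells.
    amplicon_means = {
        amp: sum(cell.get(amp, 0) for cell in reads_by_cell.values()) / n_cells
        for amp in panel
    }

    # Assumption: the cutoff uses the unrounded mean reads per cell per amplicon.
    cutoff = 0.2 * mean_per_cell_per_amplicon
    uniform = sum(1 for mean in amplicon_means.values() if mean > cutoff)

    return {
        "mean_reads_per_cell": math.floor(mean_per_cell),
        "mean_reads_per_cell_per_amplicon": math.floor(mean_per_cell_per_amplicon),
        "panel_uniformity": uniform,
    }


# Hypothetical reads for two called cells and a four-amplicon panel.
example_panel = ["AMP1", "AMP2", "AMP3", "AMP4"]
example_reads = {
    "cellA": {"AMP1": 40, "AMP2": 30, "AMP3": 2, "AMP4": 30},
    "cellB": {"AMP1": 50, "AMP2": 20, "AMP3": 4, "AMP4": 31},
}
cell_metrics = cell_calling_metrics(example_reads, example_panel)
print(cell_metrics)
```

In this toy data the total is 207 reads over 2 cells, so the mean reads/cell is reported as 103 and the mean reads/cell/amplicon as 25. AMP3 averages 3 reads, which falls below the cutoff, so 3 of the 4 amplicons count toward panel uniformity.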

% DNA read pairs assigned to cells

This is the percentage of read pairs that are present in the called cells.

DNA data completeness

This is defined as the percentage of amplicon-cell combinations that have more than 10 reads, i.e., enough sequencing depth for variant calling.
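DNA data completeness can be illustrated with a short sketch that counts the amplicon and cell pairs with more than 10 reads. The input layout and the function name `dna_data_completeness` are assumptions for this example.

```python
def dna_data_completeness(reads_by_cell, panel, min_reads=10):
    """Percentage of (cell, amplicon) pairs with more than `min_reads` reads."""
    combinations = len(reads_by_cell) * len(panel)
    covered = sum(
        1
        for cell in reads_by_cell.values()
        for amp in panel
        if cell.get(amp, 0) > min_reads
    )
    return 100.0 * covered / combinations if combinations else 0.0


# Hypothetical reads for two called cells and a four-amplicon panel.
completeness_panel = ["AMP1", "AMP2", "AMP3", "AMP4"]
completeness_reads = {
    "cellA": {"AMP1": 40, "AMP2": 30, "AMP3": 2, "AMP4": 30},
    "cellB": {"AMP1": 50, "AMP2": 20, "AMP3": 4, "AMP4": 31},
}
completeness = dna_data_completeness(completeness_reads, completeness_panel)
print(f"DNA data completeness: {completeness:.1f}%")
```

Here 6 of the 8 combinations have more than 10 reads, so the sketch reports 75%.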

% antibody read pairs assigned to cells (protein only)

This is the percentage of protein read pairs that are assigned to DNA called cells (denominator: total read pairs in the fastq file).

Variant calling

# filtered variants

This is the number of variants that would be seen by loading the h5 file in Tapestri Insights and applying the default Insights filters.

ADO rate

This is the ADO rate for this run calculated using germline variants.

Advanced tab

Panel performance

% amplicons between 0.2*mean and 5*mean reads/cell/amplicon

This is the percentage of amplicons that lie in this range (denominator: panel size).

% amplicons between 0.5*mean and 2*mean reads/cell/amplicon

This is the percentage of amplicons that lie in this range (denominator: panel size).

% amplicons > 1x coverage

This is the percentage of amplicons that have mean reads over 1 (denominator: panel size).

% amplicons > 5x coverage

This is the percentage of amplicons that have mean reads over 5 (denominator: panel size).

% amplicons > 10x coverage

This is the percentage of amplicons that have mean reads over 10 (denominator: panel size).

% amplicons > 20x coverage

This is the percentage of amplicons that have mean reads over 20 (denominator: panel size).

% amplicons > 40x coverage

This is the percentage of amplicons that have mean reads over 40 (denominator: panel size).

% reads to amplicons above 2*mean reads/cell/amplicon

This is the percentage of reads that belong to amplicons with mean reads over 2 * the mean reads per cell per amplicon (denominator: total reads to cells).
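The panel performance percentages can be reproduced on toy numbers as shown below. This is a minimal sketch, not the pipeline's code: the input (total reads per amplicon across called cells), the function name `panel_performance`, the inclusive range bounds, and the use of the unrounded mean are all assumptions.

```python
def panel_performance(amplicon_totals, n_cells):
    """amplicon_totals maps amplicon -> total reads across all called cells."""
    panel_size = len(amplicon_totals)
    total_reads = sum(amplicon_totals.values())

    # Mean reads per cell for each amplicon, and the panel-wide mean.
    means = {amp: total / n_cells for amp, total in amplicon_totals.items()}
    mean_rca = total_reads / n_cells / panel_size  # reads/cell/amplicon

    def pct_in_range(low, high):
        # Assumption: both bounds are inclusive.
        hits = sum(1 for m in means.values() if low * mean_rca <= m <= high * mean_rca)
        return 100.0 * hits / panel_size

    def pct_above(coverage):
        hits = sum(1 for m in means.values() if m > coverage)
        return 100.0 * hits / panel_size

    # Reads that belong to amplicons above 2 * mean reads/cell/amplicon.
    heavy_reads = sum(
        amplicon_totals[amp] for amp, m in means.items() if m > 2 * mean_rca
    )

    return {
        "pct_0.2x_to_5x_mean": pct_in_range(0.2, 5),
        "pct_0.5x_to_2x_mean": pct_in_range(0.5, 2),
        "pct_above_coverage": {x: pct_above(x) for x in (1, 5, 10, 20, 40)},
        "pct_reads_above_2x_mean": 100.0 * heavy_reads / total_reads,
    }


# Hypothetical totals for a five-amplicon panel and 10 called cells.
example_totals = {"A": 1000, "B": 400, "C": 300, "D": 250, "E": 50}
perf = panel_performance(example_totals, n_cells=10)
print(perf)
```

In this example the mean reads/cell/amplicon is 40. Amplicon A averages 100 reads per cell, more than 2 * 40, so half of all reads go to a single amplicon.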

Low-performing amplicons

This table shows amplicon names and mean reads per amplicon. This table is created for amplicons that have mean reads below 0.2 * the mean reads per cell per amplicon.

Amplicons with > 2* standard deviation Gini score

This is a list of amplicons that have a high Gini score. A high Gini score is based on the distribution of the Gini scores in this run and is defined as the mean Gini score of all amplicons + (2 * the standard deviation of Gini scores of all amplicons).
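The high-Gini cutoff described above (mean plus two standard deviations) can be sketched as follows. The Gini scores are taken as given. The function name `high_gini_amplicons` is hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which estimator is used.

```python
import statistics


def high_gini_amplicons(gini_scores):
    """Return (threshold, amplicons whose Gini score is above the threshold)."""
    scores = list(gini_scores.values())
    mean = statistics.mean(scores)
    # Assumption: population standard deviation over all amplicons in the run.
    std = statistics.pstdev(scores)
    threshold = mean + 2 * std
    flagged = sorted(amp for amp, g in gini_scores.items() if g > threshold)
    return threshold, flagged


# Hypothetical Gini scores: nine typical amplicons and one outlier.
example_gini = {f"AMP{i}": 0.20 for i in range(1, 10)}
example_gini["AMP10"] = 0.80

gini_threshold, gini_flagged = high_gini_amplicons(example_gini)
print(f"threshold={gini_threshold:.2f}, flagged={gini_flagged}")
```

Because the cutoff is derived from the run's own distribution, the same amplicon can be flagged in one run and not in another.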

Amplicons with R1R2 imbalance

This is a list of amplicons that have an imbalance in their R1 and R2 reads. An amplicon is said to have an R1R2 imbalance if the fold change of R1 to R2 (or R2 to R1) is greater than 2.
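The fold-change rule for R1R2 imbalance can be written as a short check. The function names and the handling of amplicons with zero reads in one or both directions are assumptions of this sketch; the text only defines the rule for a fold change greater than 2.

```python
def r1r2_fold_change(r1_reads, r2_reads):
    """Fold change between R1 and R2 reads, taking the larger over the smaller."""
    high, low = max(r1_reads, r2_reads), min(r1_reads, r2_reads)
    if high == 0:
        return 1.0  # assumption: no reads at all is treated as balanced
    if low == 0:
        return float("inf")  # assumption: reads in only one direction
    return high / low


def imbalanced_amplicons(read_counts, max_fold=2):
    """read_counts maps amplicon -> (R1 reads, R2 reads)."""
    return sorted(
        amp
        for amp, (r1, r2) in read_counts.items()
        if r1r2_fold_change(r1, r2) > max_fold
    )


# Hypothetical R1 and R2 read counts per amplicon.
example_r1r2 = {
    "AMP1": (100, 90),   # fold change about 1.1
    "AMP2": (300, 100),  # fold change 3
    "AMP3": (50, 100),   # fold change exactly 2, not greater than 2
    "AMP4": (0, 10),     # reads in R2 only
}
imbalanced = imbalanced_amplicons(example_r1r2)
print(imbalanced)
```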

ADO calculation summary

This table shows the germline variants that were used to calculate the ADO rate for this run.

Amplicon Gini score distribution plot

This distribution plot shows the distribution of the Gini scores of all amplicons in this run.

Diagnostics tab

R1R2 imbalance plot

This plot shows the R1 and R2 read fractions for the amplicons that have an R1R2 imbalance. When there are no such amplicons, this plot is empty.

Balance R1R2 barplot

This plot shows the R1 and R2 read fractions for the amplicons that do not have an R1R2 imbalance. This plot combined with the R1R2 imbalance plot should cover all the amplicons in the panel.

Reads log-log plot

This is the log-log plot of total reads versus rank-ordered barcodes, which is also part of the cellfinder output. Earlier versions displayed a vertical line on this plot; the line is now obsolete because cellfinder no longer applies a read threshold. Instead of calling every barcode above a certain read count, it selects complete cells. The plot is primarily a diagnostic tool for checking that the shape of the knee looks normal.

DNA 1x vs 10x coverage plot

This is a scatter plot where each dot is a barcode. The x value is the fraction of amplicons that have more than 1 read in that barcode. The y value is the fraction of amplicons that have more than 10 reads in that barcode. The dots are colored by whether cellfinder called the barcode a cell. This plot can be used to identify runs with a high number of partial cells. In such runs, the barcodes lie closer to the unity line than to the axes.
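The x and y values of the 1x vs 10x coverage plot can be computed per barcode as in the sketch below. The input layout and the function name `coverage_point` are assumptions for illustration.

```python
def coverage_point(amplicon_reads, panel):
    """Return (x, y) for one barcode.

    x: fraction of panel amplicons with more than 1 read.
    y: fraction of panel amplicons with more than 10 reads.
    """
    n = len(panel)
    x = sum(1 for amp in panel if amplicon_reads.get(amp, 0) > 1) / n
    y = sum(1 for amp in panel if amplicon_reads.get(amp, 0) > 10) / n
    return x, y


# Hypothetical four-amplicon panel and two barcodes.
coverage_panel = ["AMP1", "AMP2", "AMP3", "AMP4"]
well_covered = {"AMP1": 20, "AMP2": 15, "AMP3": 2, "AMP4": 0}
nearly_empty = {"AMP1": 1}

point_a = coverage_point(well_covered, coverage_panel)
point_b = coverage_point(nearly_empty, coverage_panel)
print(point_a, point_b)
```

Since every amplicon with more than 10 reads also has more than 1 read, y can never exceed x, so all points fall on or below the unity line.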

DNA vs Protein reads scatter plot (protein only)

This scatter plot for all barcodes shows the number of DNA reads that the barcode has on the x-axis and the number of protein reads that the barcode has on the y-axis.

Antibody cell distribution plot (protein only)

This bar graph shows the number of cells where an antibody has non-zero reads. Each bar represents an antibody.

Antibody read distribution plot (protein only)

This box plot shows the distribution of reads that an antibody has across all cells. Each box represents an antibody.
