Mission Bio's Tapestri Pipeline allows customers to process single-cell DNA and DNA + Protein sequencing data generated on the Tapestri Platform.
Tapestri Pipeline: From FASTQ to variant calls
Key Features
- Support for Genotype and Antibody based demultiplexing
Create a Tapestri Pipeline Account and Launch Tapestri Pipeline
Create an Account
If you are not an active Mission Bio customer, you must first create a Mission Bio Portal account. To create a Mission Bio account, enter your email address and a password with these requirements on Tapestri Portal. If you have problems creating an account, see this article.
Tapestri Pipeline is only available to customers. If the Launch button next to Tapestri Pipeline on Tapestri Portal is gray, click How do I get access? and use the dialog box that appears to submit your request for access. Mission Bio will create a new Tapestri Pipeline account free of charge if you purchased the Tapestri Platform. Please contact sales@missionbio.com for purchasing information.
Once the account is created, you will receive an email to set it up.
If you do not receive a confirmation email, check your spam folder. If it is not there, contact support@missionbio.com.
Managing Users in Your Group
If you are part of a core lab or distributed organization, you can create customer accounts on Tapestri Portal and invite them to join. For more information, see this article.
Supported Browsers
Chrome v37+
Edge v80+
Firefox v41+
Opera
Safari v10.2+
Launch Tapestri Pipeline
Launch Tapestri Pipeline from Tapestri Portal.
Settings Menu
In the upper-right corner, click the settings menu.
Home
Clicking Home takes you to the Runs page.
My Account
Clicking My Account takes you to the Profile Details page where you can update your contact information and change your password.
Cloud Connector
Tapestri Pipeline allows importing files from Illumina BaseSpace and Amazon S3. To use either of these options, first set up the Cloud Connectors.
Illumina BaseSpace
To generate a BaseSpace access token, follow these instructions.
Amazon S3
To learn about how Tapestri Pipeline integrates with Amazon S3 buckets, read this FAQ.
Support
To view the User Guide, click the Support menu item.
Logout
To log out of Tapestri Pipeline, click the Logout menu item.
Dashboard
The dashboard is the navigation section at the top of the Runs and Files pages.
In-Progress, Failed, Completed
Depending on whether the Runs or Files page is displayed, the dashboard displays the total number of runs/files In-Progress, Failed, and Completed. The information contained in this section pertains to the previous 30 days.
Refresh, Add Files, Start Run
Refresh
Use the refresh button to update the status in the Runs or Files table, depending on the displayed page.
Add Files
To add files, click this button. For more information on adding FASTQ or panel, see this section.
For adding a custom reference file, follow the steps below.
- Click Other.
- Upload from Local Computer: Select the *.fa.zip file from your local computer to upload a custom reference file.
- Import from Amazon S3: Provide Amazon S3 URI for the *.fa.zip file or the path to the folder containing this reference zip file.
- Click Upload.
Note: To prevent failures during reference file upload, ensure the following:
- The FASTA *.fa file is zipped at the file level and is named as *.fa.zip.
- Only .fa files are supported; .fasta files are not.
- Remove any empty lines from the .fa file prior to adding it to a zip file.
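The preparation rules above can be sketched in Python. This is only an illustration, not a Mission Bio tool: the function name `package_reference` is hypothetical, and it assumes that "zipped at the file level" means the .fa file sits at the top level of the archive rather than inside a folder.

```python
import zipfile
from pathlib import Path


def package_reference(fa_path: str) -> Path:
    """Strip empty lines from a .fa file and zip it as <name>.fa.zip."""
    src = Path(fa_path)
    if src.suffix != ".fa":
        raise ValueError("only .fa files are supported; rename .fasta to .fa")

    zip_path = src.with_name(src.name + ".zip")
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        # Store the file at the top level of the archive (assumption) and
        # stream it line by line so large references are not held in memory.
        with zf.open(src.name, "w", force_zip64=True) as out, src.open("rb") as fh:
            for line in fh:
                if line.strip():  # skip empty or whitespace-only lines
                    out.write(line)
    return zip_path
```

For a file named custom.fa, this writes custom.fa.zip next to it, which can then be selected under Upload from Local Computer.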
Start Run
To start a run, click this button. For more information, see this section.
Runs Page
Selecting the Runs page from the top-left corner displays a table with all the runs. It displays the following information:
- Run Name
- Pipeline
- Status
- Duration
- Start Date
- User
This additional information is displayed for DNA and DNA + Protein runs.
- Cell Count
- Panel Uniformity
- Mean Reads/Cell/Amplicon
- % DNA Read Pairs Assigned to Cells
- Mean Reads/Cell/Antibody (DNA + Protein only)
Search
To search for specific runs, type the criteria into the search box. You can search for:
- Run Names
- DNA + Protein
- Status
- User
Filter
On the Runs page, click Filter. The Runs Filter screen will appear.
There are three filters available on the Runs page – Members, Run Status, and Pipeline Type. You can apply one or all the filters to locate a run. The number of results displayed in the runs filter screen is dynamic and based on the filters selected.
- Members: To search for runs created by members in your group, select the email ID of that member from the dropdown menu or enter the email ID in the input field. Once done, click Apply Filter.
- Run Status: To search your runs based on their status, click on the status of your interest, and click Apply Filter. Currently available statuses are Completed, In-Progress, Importing Files, Canceled, and Failed.
- Pipeline Type: To search for a run based on the Pipeline Type, select any of the available pipelines and click Apply Filter. Currently available Pipeline Types are DNA/Standard, DNA/Bulk, DNA/Genotype Demultiplexing, DNA + Protein/Standard, DNA + Protein/Genotype Demultiplexing, DNA + Protein/Antibody Demultiplexing, Merge Bulk Runs, and Merge Runs.
Sort
Sort the table by clicking any column name. It sorts in both ascending and descending order. Click it again to sort the other way.
Cancel Run
Click the icon to the left of the Run Name to Cancel Run. This only applies to in-progress runs.
Clone Run
Click the icon to the left of the Run Name to Clone Run. Only completed, failed or canceled DNA and DNA + Protein runs can be cloned. It will prompt you to Name the Run. After clicking Proceed, the parameters of the run can be changed.
Delete Run
Click the icon to the left of the Run Name of a completed, failed or canceled run to Delete it.
Click the icon to the left of the Run Name of an in-progress run to Cancel it first and then Delete it.
Rename Run
Runs cannot be renamed once submitted.
Run Details
Clicking the Run Name displays the Run Details page for that run. See this section for details.
Files Page
Clicking the Files option from the top-left corner displays a table with all the files in sections for FASTQ and Panel files.
The table displays the following information for each file type:
- File Name
- File Type
- Status
- Uploaded Date
- File Size
- Uploaded By
- Source
Search
To search for specific files, type the criteria into the search box. You can search for:
- File Name
- File Type
- Status
- Uploaded By
- Source
Filter
On the Files page, click Filter. The Files Filter screen will appear.
There are five filters available on the Files page – Members, File Upload Status, File Type, Source, and File Size. You can apply one or all the filters to locate a file. The number of results displayed in the files filter screen is dynamic and based on the filters selected.
- Members: To search for files uploaded by members in your group, select the email ID of that member from the dropdown menu or enter the email ID in the input field. Once done, click Apply Filter.
- File Upload Status: To search for files based on their upload status, click the status of your interest and then click Apply Filter. Currently available statuses are Completed, Uploading, and Failed.
- File Type: To search for a file based on the file type, select any of the available options and click Apply Filter. FASTQ file types are either DNA FASTQ or Protein FASTQ. Panel file types are either DNA Panel, Protein Panel, or Demultiplexing Protein Panel.
- Source: To search for a file based on the file source, select any of the available options and click Apply Filter. FASTQ file sources are Illumina BaseSpace, Amazon S3, and Local Computer. Panel file sources are either Local Computer or Designer Catalog.
- File Size: To search for a file based on the estimated file size, enter the minimum or maximum file size value in the input fields, select the respective size unit (B, KB, MB, GB) from the drop-down menu, and click Apply Filter.
Sort
Sort the table by clicking any column name. It sorts in both ascending and descending order. Click it again to sort the other way.
Delete / Rename / Cancel File
Click the icon to the left of the File Name of a completed file to Delete or Rename the file.
Click the icon to the left of the File Name of an in-progress file to Cancel the upload. It will also cancel all other files that were uploaded at the same time.
Note: You cannot delete or rename the Catalog panels uploaded by Mission Bio – AML, Myeloid, CLL, THP, and TotalSeq™-D Heme Oncology Cocktail.
File Type
Click the File Type popup to change the file type.
Starting a Run
There are two ways to start a run – Start Run and Add Files then Start Run.
1. Start Run
Click the Start Run button in the top-right corner.
Name the Run
- Add a run name. Since this is used as the prefix for the output files, it cannot contain spaces or special characters, such as % & * @ ^ ; " , and '.
- Add an optional description.
- Click Next.
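To check a run name before entering it, a small script along these lines may help. It is a sketch under assumptions: it rejects only whitespace and the characters listed above, while the guide says "such as", so Tapestri Pipeline may reject further characters. The function name `is_valid_run_name` is hypothetical.

```python
import re

# Whitespace plus the special characters listed in the guide.
# The guide's list is not exhaustive, so this check is a lower bound.
FORBIDDEN = re.compile(r"""[\s%&*@^;"“”,']""")


def is_valid_run_name(name: str) -> bool:
    """Return True if the name is non-empty and has no listed forbidden character."""
    return bool(name) and FORBIDDEN.search(name) is None
```

For example, a name like AML_run-01 passes this check, while "AML run" (contains a space) and "run%1" do not.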
Step 01 – Select Pipeline & Other Parameters
-
Select the Pipeline
Choose DNA, DNA + Protein, or Merge Samples in the dropdown menu.
DNA: This requires a DNA Panel file and an even number of DNA FASTQ files.
DNA + Protein: This requires a DNA Panel file and an even number of DNA FASTQ files and a Protein Panel file and an even number of Protein FASTQ files. The number of DNA and Protein FASTQ files must be the same.
Merge Samples: This requires at least two runs to merge. While the number of runs to merge is currently unlimited, Mission Bio recommends only merging 2 to 5 related runs (i.e., one patient across multiple time points, or different spatial samples from one patient). One .h5 file will be created. When merging DNA and DNA + Protein runs, only the DNA information is retained. Note: Ensure that the merged run name is unique.
Merge Bulk Runs: This requires at least two Bulk H5s to merge. It supports merging 2 to 6 files.
-
Select the Run Mode
Choose one of the modes given below in the dropdown menu.
Standard - Regular DNA or DNA + Protein runs.
Bulk - DNA runs processed with the NGS bulk kit from Mission Bio.
Genotype Demultiplexing - DNA or DNA + Protein runs multiplexed based on sample genotypes.
Antibody Demultiplexing - DNA + Protein runs multiplexed based on protein expression.
NOTE: Run Mode is the new parameter to process Multiplexed Runs. It is important to set the correct mode as it determines the input for the run. Please read this article for more details.
-
Select the Reference Genome (DNA, DNA + Protein)
Choose the catalog Human (hg19), Human (hg38), or Mouse genome, or a custom reference file you uploaded, in the dropdown menu.
NOTE: Please select the genome Human (hg19) when using the catalog panels (AML, Myeloid, CLL, THP) or any of the other Mission Bio uploaded Designer Catalog panels.
-
Select Sample Variants File (Genotype Demultiplexing)
Choose from Existing Pipeline Generated or User Uploaded Files, or Upload from Local Computer.
-
Add Panel Files (DNA, DNA + Protein)
Choose either Select from Existing Files or Upload from Local Computer in the File Source dropdown menu. Click the Add button.
File Format
DNA Custom Panel File: A .zip file, Upload from Local Computer
DNA Designer Catalog Panel File: 4 Preloaded, Select From Existing Files
Protein Custom Panel File: A .csv file, Upload from Local Computer
Protein Designer Catalog Panel File: 1 Preloaded, Select From Existing Files
Protein Demultiplexing Panel File: A .csv file, Upload from Local Computer
Number of Files
DNA: Select or upload one DNA panel file.
DNA + Protein: Select or upload one DNA and one Protein Panel file.
DNA + Protein (Antibody Demultiplexing): Select or upload one DNA and one Demultiplexing Protein Panel file.
-
Select Runs To Merge (Merge Samples only)
Select which runs to merge.
-
Select Bulk H5s To Merge (Merge Bulk Runs only)
Select which Bulk H5s to merge.
- Click Next.
Step 02 – Select the FASTQ Files (DNA, DNA + Protein)
For DNA runs, select an even number of DNA FASTQ files.
For DNA + Protein runs, select an even number of DNA and Protein FASTQ files. The number of DNA and Protein FASTQ files must be equal.
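The file-count rules above can be expressed as a quick pre-check. This is an illustrative sketch, not the validation Tapestri Pipeline itself performs; the function name `check_fastq_counts` and the message texts are hypothetical.

```python
from typing import List, Optional


def check_fastq_counts(dna: List[str],
                       protein: Optional[List[str]] = None) -> List[str]:
    """Return a list of problems; an empty list means the counts look fine.

    Pass protein=None for a DNA run and a list for a DNA + Protein run.
    """
    problems = []
    if not dna or len(dna) % 2 != 0:
        problems.append("DNA FASTQ files must be a non-zero even number")
    if protein is not None:
        if not protein or len(protein) % 2 != 0:
            problems.append("Protein FASTQ files must be a non-zero even number")
        if len(protein) != len(dna):
            problems.append("DNA and Protein FASTQ file counts must be equal")
    return problems
```

An even count is needed because each lane contributes one R1 and one R2 file.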
File Format
Only files with these extensions are allowed: .fastq.gz and .fq.gz. When importing files from Amazon S3 folders, all the files in the folder and nested folders with the correct extensions are imported. All others are ignored.
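To preview which files in a folder listing would be picked up, the extension rule can be sketched as follows. The function name `importable_fastqs` is hypothetical, and the match is assumed to be case-sensitive, which the guide does not state.

```python
from typing import Iterable, List

# Extensions the guide lists as allowed for FASTQ files.
ALLOWED_SUFFIXES = (".fastq.gz", ".fq.gz")


def importable_fastqs(keys: Iterable[str]) -> List[str]:
    """Keep only the paths that end with an allowed FASTQ extension."""
    return [k for k in keys if k.endswith(ALLOWED_SUFFIXES)]
```

Because the check looks only at the end of each path, files in nested folders are kept as long as their extension matches.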
To use Amazon S3 or Illumina BaseSpace, first set up the Cloud Connector.
- In the Select FASTQ Files dropdown menu, choose Select from Existing Files, Upload from Local Computer, Import from Amazon S3, or Import from Illumina BaseSpace.
-
Click the Add button.
1. Select from Existing Files
Click the checkbox to the left of all the files you want to include.
Click Done.
2. Upload from Local Computer
DNA
Drag and drop the DNA FASTQ files into the upload area or click Choose Files.
DNA + Protein
Drag and drop the DNA FASTQ files into the upload area or click Choose Files.
Choose Protein FASTQ from the dropdown menu, and drag and drop the Protein FASTQ files into the upload area or click Choose Files.
Click Upload.
3. Amazon S3
To add multiple files or folders, separate the names with either a "," or ";".
DNA
Add the Amazon S3 URI for the DNA FASTQ files or the folder containing the FASTQ files.
DNA + Protein
Add the Amazon S3 URI for the DNA FASTQ files or the folder containing the FASTQ files.
Click +Add FASTQ files for another analyte.
Select Protein FASTQ.
Add the Amazon S3 URI for the Protein FASTQ files or the folder containing the FASTQ files.
Click Import.
4. Illumina BaseSpace
To add multiple biosamples, separate the names with either a "," or ";".
DNA
Add the Biosample Name for the DNA FASTQ files with this format – Project ID: Biosample name.
DNA + Protein
Add the Biosample Name for the DNA FASTQ files – Project ID: Biosample name.
Click +Add FASTQ files for another analyte.
Select Protein FASTQ.
Add the Biosample Name for the Protein FASTQ files – Project ID: Biosample name.
Click Import.
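The biosample entry format described above (Project ID: Biosample name, with multiple entries separated by "," or ";") can be checked before pasting with a sketch like this. The function name `parse_biosamples` and the sample values in the test are hypothetical; how Tapestri Pipeline parses the field internally is not documented here.

```python
import re
from typing import List, Tuple


def parse_biosamples(text: str) -> List[Tuple[str, str]]:
    """Split 'Project ID: Biosample name' entries separated by ',' or ';'."""
    entries = []
    for raw in re.split(r"[,;]", text):
        raw = raw.strip()
        if not raw:
            continue  # tolerate a trailing separator
        project, sep, biosample = raw.partition(":")
        if not sep or not project.strip() or not biosample.strip():
            raise ValueError(f"expected 'Project ID: Biosample name', got {raw!r}")
        entries.append((project.strip(), biosample.strip()))
    return entries
```

An entry without a colon raises a ValueError, which flags a missing Project ID before the import is attempted.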
- Click the Next button.
Step 03 – Lane Assignment (DNA, DNA + Protein)
Tapestri Pipeline automatically assigns FASTQ files to the lanes. If you prefer to assign the files yourself, click the Reset button in the left panel. The files will then go into the right panel where you can drag and drop them into the correct lane location. Each lane should have 2 files, one R1 and one R2. You can have as many lanes as you want; all R1s and R2s will be merged before starting the run.
Red and Green Highlighting
The green color indicates that the files follow the lane assignment rules we expect. For example, L001 indicates Lane 1. R1 denotes R1.
If the color is red, the file does not match our lane assignment rules. You can submit the run, but double-check to see that the files are in the correct location. Please review the lane assignment rules in the User Guide.
- Verify the file assignments for the DNA FASTQ files and Protein FASTQ files, when applicable.
- Click the Next button.
Step 04 – Preview
This is the final step before submitting the run. Confirm all the information. To edit any of the information, click the Edit icon next to the Steps or click the Step in the left Panel.
After confirming all the information, click Submit Run.
Run Successfully Submitted
Once the run is submitted, you have the option to Submit Another Run or Go To Runs Page.
Submit Another Run
This option clones the current run. Name the run and adjust any of the parameters.
Go To Runs Page
This option returns you to the main Runs page.
2. Add Files then Start Run
This option allows you to upload files before starting a run.
-
Click Add Files.
There are three ways to upload FASTQ files – Upload from Local Computer, Import from Amazon S3, and Import from Illumina BaseSpace. Panel files can only be uploaded from a local computer.
- For instructions on uploading files, refer to this section.
Click the Start Run button.
Follow these instructions to start the run.
Lane Assignment Rules
Tapestri Pipeline auto assigns files to lanes based on our naming convention. If the file names agree with our naming convention, then they are highlighted in green when assigned to lanes. If they do not conform, then they have a red background. This does not mean they are incorrect, however. For example, some people merge runs, which creates a Lane 5.
To determine lane assignment, we use the file name, so the file name should contain _R1_ and _R2_ for the algorithm to work. We parse the file name to determine the prefix. For the file names DNA-test_L001_R1_001.fastq.gz and DNA-test_L001_R2_001.fastq.gz, we consider "DNA-test" the prefix and then assign the files to Lane 1 (L001).
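The naming convention can be illustrated with a regular expression. This is a sketch of the idea only, not the algorithm Tapestri Pipeline uses: the pattern and the names `FASTQ_NAME`, `FastqName`, and `parse_fastq_name` are assumptions based on the example file names in this guide.

```python
import re
from typing import NamedTuple, Optional

# Assumed pattern: <prefix>_L<3-digit lane>_<R1 or R2>_...
FASTQ_NAME = re.compile(r"^(?P<prefix>.+?)_L(?P<lane>\d{3})_(?P<read>R[12])_")


class FastqName(NamedTuple):
    prefix: str
    lane: int
    read: str


def parse_fastq_name(filename: str) -> Optional[FastqName]:
    """Return prefix, lane, and read, or None if the name does not fit."""
    m = FASTQ_NAME.match(filename)
    if m is None:
        return None
    return FastqName(m["prefix"], int(m["lane"]), m["read"])
```

For DNA-test_L001_R1_001.fastq.gz this yields the prefix DNA-test, lane 1, and read R1. A name without _R1_ or _R2_ returns None, which corresponds to a file that cannot be auto-assigned by name.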
Other rules:
- 1 pair of FASTQ files
Assigned to Lane1.
Example: DNA-test_L001_R1_001.fastq.gz, DNA-test_L001_R2_001.fastq.gz
- More than 1 pair of FASTQ files
- Same prefixes and extensions
If you have a single library sequenced on multiple lanes of the flow cell, you may have 2 to 4 pairs of files. These files typically have the same prefixes and same extensions with different lane numbers, like L001 and L002. These files will be assigned to different lanes.
- Different prefixes
If you have technical replicates, i.e., you have sequenced your sample twice to increase coverage, you will have 2 pairs of files with the same or different names. To analyze these samples, add them to 2 lanes on the first page. This scenario is highlighted in red, which you can ignore.
Follows our convention for Lane Assignments
When the file names follow our convention, they have a green highlight. File names with Read Numbers (_R1_ and _R2_) and Lane Numbers (L001, L002, L003, and L004) abide by the rules. For example:
DNA-test_L001_R1_001.fastq.gz, DNA-test_L001_R2_001.fastq.gz
Protein-test_L001_R1_001.fastq.gz, Protein-test_L001_R2_001.fastq.gz
Does not follow the convention
These file names do not follow our convention. You can proceed with some of them but not with others. Each case below results in a red highlight. To rename a file, refer to this section.
- Scenarios that will cause the pipeline to fail
- No R1 or R2 read in the run
DNA-test_L001_R1_001.fastq.gz, DNA-test_L001_R1_001.fastq.gz
- Files from different lanes added to the same lane
DNA-test_L001_R1_001.fastq.gz, DNA-test_L004_R2_001.fastq.gz
- Scenarios that will not cause the pipeline to fail, if you know the files are correct
- A mismatch in the lane number in the lane assignment and the lane number in the file name. In this case, ensure R1 and R2 have the same lanes.
Protein-Test_L005_R1_001.fastq.gz, Protein-test_L005_R2_001.fastq.gz
- A mismatch in the prefixes. Confirm the files are correct before proceeding.
Test_L001_R1_001.fastq.gz, Protein-test_L001_R2_001.fastq.gz
- Missing R1 and R2 in the filename. It is good practice to have these in the file name, but it will not cause the pipeline to fail. Confirm the lane assignment is correct.
Protein-test_L005_001.fastq.gz, Protein-test_L005_001.fastq.gz
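The scenarios in this section can be summarized as a small check on the two files placed in one lane. This is a hypothetical sketch that mirrors the cases listed here; it is not the pipeline's own logic, and the function name `check_lane_pair`, the "ok"/"warn"/"fail" labels, and the exact prefix comparison are assumptions.

```python
import re
from typing import List, Tuple

LANE = re.compile(r"_L(\d{3})_")
READ = re.compile(r"_(R[12])_")


def check_lane_pair(assigned_lane: int, file_a: str, file_b: str) -> Tuple[str, List[str]]:
    """Classify the two files in one lane as 'ok', 'warn', or 'fail'."""
    files = (file_a, file_b)
    lane_hits = [LANE.search(f) for f in files]
    read_hits = [READ.search(f) for f in files]
    lanes = [int(m.group(1)) if m else None for m in lane_hits]
    reads = [m.group(1) if m else None for m in read_hits]
    # Prefix is the text before the lane token (whole name if no lane token).
    prefixes = [f[:m.start()] if m else f for f, m in zip(files, lane_hits)]

    fails, warns = [], []
    # Scenarios described as causing the pipeline to fail.
    if None not in reads and reads[0] == reads[1]:
        fails.append("lane needs one R1 and one R2")
    if None not in lanes and lanes[0] != lanes[1]:
        fails.append("files come from different lanes")
    # Scenarios highlighted in red that can still run if the files are correct.
    if None in reads:
        warns.append("R1/R2 missing from a file name")
    if None not in lanes and lanes[0] == lanes[1] and lanes[0] != assigned_lane:
        warns.append("lane number in file names differs from assigned lane")
    if prefixes[0] != prefixes[1]:
        warns.append("prefixes differ")

    status = "fail" if fails else "warn" if warns else "ok"
    return status, fails + warns
```

A "warn" result corresponds to a red highlight that you can submit after confirming the files are correct, while a "fail" result should be fixed before submitting.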
Run Details Page
Clicking the name of a run displays the Pipeline Run Details page with the Run Report, Output Files, and Input Files.
Note: Canceled and Failed Runs only display partial information.
Run Report
For detailed information on the items contained in the Report, see this article. To download the Run Report, go to the Output Files tab and download the <fileprefix>.report.html file.
The multiplexed run creates two types of reports -
The Merge Samples runs do not generate a Run Report.
Summary
The Summary page displays the following information:
- # cells
- Panel uniformity
- Mean reads/cell/amplicon
- % DNA read pairs assigned to cells - Only V2
- % DNA reads mapped to target - Only V3
- Mean reads/cell/antibody
-
Sample information
- Run ID
- Analyte
- DNA panel name
- DNA panel size
- Reference genome
- Chemistry version - Only V2
- DNA pipeline version
- Protein panel name
- Protein panel size
- Protein pipeline version
- Tapestri Pipeline version - Only V3.4
- Date analyzed
- Protein QC metrics
- Read quality (Q30)
- GC content
- MaxN per read position
- Barcode constant 1 read rate
- Barcode constant 2 read rate
-
Sequencing (Protein)
- # total read pairs
- Read quality (Q30) - Only V2
- % read pairs trimmed - Only V2
- % read pairs after cell barcode processing - Only V2
- % read pairs after antibody barcode processing - Only V2
- # total barcodes - V2 & V3
- # candidate barcodes - Only V2
- % read pairs after candidate barcode filtering
-
Antibody counting
- Mean reads/cell/antibody
- # antibodies detected
- Median antibodies/cell
-
Panel uniformity plot
- Normalized read counts (x-axis) per Amplicon (y-axis)
- DNA QC metrics - Only V3
- Read quality (Q30)
- GC content
- MaxN per read position
- Barcode constant 1 read rate
- Barcode constant 2 read rate
-
Sequencing (DNA)
- # total read pairs
- Read quality (Q30) - Only V2
- % read pairs trimmed
- % read pairs with valid barcodes
- # total barcodes - V2 & V3
-
Mapping
- % reads mapped to genome
- % reads mapped to target
-
Cell calling
- # cells
- Panel uniformity
- Mean reads/cell
- Mean reads/cell/amplicon
- % DNA read pairs assigned to cells
- DNA data completeness
- Algorithm
- % antibody read pairs assigned to cells
-
Variant calling
- # filtered variants
- ADO rate
Advanced
The Advanced page displays the following information:
-
Panel performance
- % amplicons between 0.2*mean and 5*mean reads/cell/amplicon
- % amplicons between 0.5*mean and 2*mean reads/cell/amplicon
- % amplicons > 1x coverage - Only V2
- % amplicons > 5x coverage - Only V2
- % amplicons > 10x coverage
- % amplicons > 20x coverage - Only V2
- % amplicons > 40x coverage - Only V2
- % reads to amplicons above 2*mean reads/cell/amplicon
-
Low-performing amplicons - Only V2
- Amplicon name with mean read
-
Amplicon Performance Table - Only V3
- List of amplicons by name with the number of mean reads per cell and whether the amplicon passed the threshold.
-
Amplicons with > 2* standard deviation Gini score
- List of amplicons by name and Gini Score
-
Amplicons with R1R2 imbalance
- List of amplicons by name and the R1 and R2 read % for each
- ADO calculation summary
-
Amplicon Gini score distribution plot
- Gini score (x-axis) by Number of amplicons (y-axis) with mean and threshold
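The two "% amplicons between ... *mean" items under Panel performance can be read as the share of amplicons whose mean reads per cell fall within a band around the panel-wide mean. The sketch below illustrates that reading only. It assumes inclusive bounds and that "mean" is the average across amplicons; the report's exact definitions may differ, and the function name `panel_bands` is hypothetical.

```python
from statistics import mean
from typing import Dict, Sequence


def panel_bands(reads_per_cell: Sequence[float]) -> Dict[str, float]:
    """Percent of amplicons inside each band around the mean reads/cell/amplicon.

    reads_per_cell holds one value per amplicon: its mean reads per cell.
    """
    if not reads_per_cell:
        raise ValueError("need at least one amplicon")
    m = mean(reads_per_cell)
    n = len(reads_per_cell)

    def pct_between(low: float, high: float) -> float:
        inside = sum(1 for r in reads_per_cell if low * m <= r <= high * m)
        return 100.0 * inside / n

    return {
        "between_0.2x_and_5x_mean": pct_between(0.2, 5),
        "between_0.5x_and_2x_mean": pct_between(0.5, 2),
    }
```

With four amplicons at 1, 6, 10, and 23 mean reads per cell, the panel mean is 10, so three of four amplicons fall in the 0.2x to 5x band and two of four in the 0.5x to 2x band.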
Diagnostics
The Diagnostics page displays the following plots:
-
R1R2 imbalance plot - Only V2
- x-axis: R1/R2 read percents
- y-axis: Amplicons
- Legend: R1 read percent, R2 read percent
-
Balance R1R2 barplot - Only V2
- x-axis: R1/R2 read percents
- y-axis: Amplicons
- Legend: R1 read percent, R2 read percent
-
Cellfinder UMAP plot - Only V3
- x-axis: UMAP-x
- y-axis: UMAP-y
- Legend: valid-cell, invalid-barcode, other
-
Cellfinder correlation coverage plot - Only V3
- x-axis: log10(reads/amplicon)
- y-axis: R2
- Legend: valid-cell, invalid-barcode, other
-
DNA vs Protein reads scatter plot (Protein only) - Only V2
- x-axis: DNA reads
- y-axis: Protein reads
- Legend: Cell, Not cell
-
Antibody cell distribution plot (Protein only)
- x-axis: Antibodies
- y-axis: Number of cells
-
Read log-log plot
- x-axis: Rank-ordered barcodes
- y-axis: Total reads
- Legend: DNA, Protein, Both
-
DNA 1x vs 10x coverage plot
- x-axis: Coverage at 1x
- y-axis: Coverage at 10x
- Legend: Cell, Not cell, Unity line
-
Antibody read distribution plot (Protein only)
- x-axis: Antibodies
- y-axis: log10(1 + number of antibody reads)
Output Files
For more information about the Pipeline output files, refer to this article.
Sort
Sort the table by clicking any column name. It sorts in both ascending and descending order. Click it again to sort the other way.
Download
To download any output file, click the download icon to the left of the File Name.
Note: If the file does not download, check whether an ad or pop-up blocker is running. If so, disable it and download the file again.
Input Files
This page displays all the parameters and files used.
DNA and DNA + Protein
- Genome
- Panel Files
- FASTQ Files
- Lane Assignments
DNA and DNA + Protein with Genotype Demultiplexing
- Genome
- Panel Files
- FASTQ Files
- Sample Variants File
- Lane Assignments
Merge Samples
- Input Runs
Merge Bulk Runs
- Input H5s and sample name