Cell Ranger 6.0, printed on 11/23/2024
This tutorial introduces the cellranger multi pipeline for analyzing 3' Cell Multiplexing data in Cell Ranger 6.0. We recommend completing the other tutorials in this series first.
In this tutorial, you will learn how to:
The dataset in this tutorial consists of two cell lines, Jurkat and Raji, multiplexed at equal proportions with one CMO per cell line, resulting in a pooled sample labeled with two CMOs. Libraries were prepared following the Chromium Next GEM Single Cell 3ʹ Reagent Kits v3.1 (Dual Index) with Feature Barcode technology.
Use wget to download the FASTQ data (about 44 GB):
wget https://cf.10xgenomics.com/samples/cell-exp/6.0.0/SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_Multiplex/SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_Multiplex_fastqs.tar
Download and untar the 2020 reference, if you have not already done so in the count tutorial:
wget https://cf.10xgenomics.com/supp/cell-exp/refdata-gex-GRCh38-2020-A.tar.gz
tar -xf refdata-gex-GRCh38-2020-A.tar.gz
Untar the FASTQ files:
tar -xf SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_Multiplex_fastqs.tar
Navigate to the FASTQ files and observe their filenames. Note there is one directory that contains the FASTQ files for the GEX library and two that contain FASTQ files for the CMO libraries.
.
├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_gex
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_gex_S2_L001_I1_001.fastq.gz
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_gex_S2_L001_I2_001.fastq.gz
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_gex_S2_L001_R1_001.fastq.gz
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_gex_S2_L001_R2_001.fastq.gz
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_gex_S2_L002_I1_001.fastq.gz
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_gex_S2_L002_I2_001.fastq.gz
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_gex_S2_L002_R1_001.fastq.gz
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_gex_S2_L002_R2_001.fastq.gz
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_gex_S2_L003_I1_001.fastq.gz
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_gex_S2_L003_I2_001.fastq.gz
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_gex_S2_L003_R1_001.fastq.gz
│   └── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_gex_S2_L003_R2_001.fastq.gz
├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_multiplexing_capture
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_multiplexing_capture_S1_L001_I1_001.fastq.gz
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_multiplexing_capture_S1_L001_I2_001.fastq.gz
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_multiplexing_capture_S1_L001_R1_001.fastq.gz
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_multiplexing_capture_S1_L001_R2_001.fastq.gz
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_multiplexing_capture_S1_L002_I1_001.fastq.gz
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_multiplexing_capture_S1_L002_I2_001.fastq.gz
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_multiplexing_capture_S1_L002_R1_001.fastq.gz
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_multiplexing_capture_S1_L002_R2_001.fastq.gz
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_multiplexing_capture_S1_L003_I1_001.fastq.gz
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_multiplexing_capture_S1_L003_I2_001.fastq.gz
│   ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_multiplexing_capture_S1_L003_R1_001.fastq.gz
│   └── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_multiplexing_capture_S1_L003_R2_001.fastq.gz
└── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_2_multiplexing_capture
    ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_2_multiplexing_capture_S1_L001_I1_001.fastq.gz
    ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_2_multiplexing_capture_S1_L001_I2_001.fastq.gz
    ├── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_2_multiplexing_capture_S1_L001_R1_001.fastq.gz
    └── SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_2_multiplexing_capture_S1_L001_R2_001.fastq.gz
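The file names above follow the pattern <sample>_S<n>_L<lane>_<read>_001.fastq.gz, and the <sample> part is the value that goes in the fastq_id column of the config CSV later in this tutorial. The sketch below shows one way to pull those parts out of a file name. The function names fastq_fields and fastq_sample_names are hypothetical helpers, not part of Cell Ranger, and the pattern is an assumption based only on the names listed here.

```shell
# Hypothetical helpers; not part of Cell Ranger.
# Assumed name shape: <sample>_S<n>_L<lane>_<read>_001.fastq.gz

# Print "<sample> <sample number> <lane> <read>" for one FASTQ file name.
# Names that do not match the assumed shape are printed unchanged.
fastq_fields() {
    basename "$1" |
        sed -E 's/^(.+)_S([0-9]+)_L([0-9]+)_([IR][0-9])_001\.fastq\.gz$/\1 \2 \3 \4/'
}

# Print the distinct sample names found anywhere under a directory.
fastq_sample_names() {
    find "$1" -name '*.fastq.gz' | while IFS= read -r f; do
        fastq_fields "$f"
    done | cut -d' ' -f1 | sort -u
}
```

Running fastq_sample_names on the untarred FASTQ directory should print one line per library, which you can compare against the fastq_id values in your config CSV.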
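The cellranger multi pipeline has two inputs:
--id: a run ID of your choice, which also becomes the name of the output directory
--csv: the path to the multi config CSV file, which describes the reference, the libraries, and the samples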
In this tutorial, you only need to edit a few lines of a pre-made CSV in a text editor of your choice. This example uses nano:
nano SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K.csv
Copy and paste the code block below into your text editor. Note that there are three sections: gene-expression, libraries, and samples. For a full list of the different sections, fields, and optional parameters, please see the cellranger multi page.
[gene-expression]
ref,/path/to/refdata-gex-GRCh38-2020-A
expect-cells,10000
[libraries]
fastq_id,fastqs,lanes,physical_library_id,feature_types,subsample_rate
SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_gex,/path/to/fastqs/SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K/SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_gex,any,Jurkat_Raji_10K_gex,gene expression,
SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_multiplexing_capture,/path/to/fastqs/SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K/SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_1_multiplexing_capture,any,Jurkat_Raji_10K_multiplexing_capture,Multiplexing Capture,
SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_2_multiplexing_capture,/path/to/fastqs/SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K/SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K_2_multiplexing_capture,any,Jurkat_Raji_10K_multiplexing_capture,Multiplexing Capture,
[samples]
sample_id,cmo_ids,description
Jurkat,CMO301,Jurkat
Raji,CMO302,Raji
Important: replace /path/to/ with the actual paths to the FASTQ files and reference before saving the CSV file.
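A leftover /path/to/ placeholder is an easy mistake to make, so it can help to check the paths before launching the pipeline. The sketch below is one possible check, not a Cell Ranger feature: config_paths and missing_config_paths are hypothetical helper names, and the parsing assumes the simple unquoted, comma-separated layout shown above.

```shell
# Hypothetical helpers; not part of Cell Ranger.
# Assumes the config layout shown above: section markers in square brackets
# and plain comma-separated rows without quoted fields.

# Print the reference path and every path in the "fastqs" column.
config_paths() {
    awk -F',' '
        /^\[/ { section = $0; col = 0; next }      # e.g. [libraries]
        $0 == "" { next }                           # ignore blank lines
        section == "[gene-expression]" && ($1 == "ref" || $1 == "reference") { print $2 }
        section == "[libraries]" {
            if (!col) {                             # first row is the column header
                for (i = 1; i <= NF; i++) {
                    if ($i == "fastqs") col = i
                }
                next
            }
            print $col
        }
    ' "$1"
}

# Print one "missing: <path>" line for each path that is not a directory.
missing_config_paths() {
    config_paths "$1" | while IFS= read -r dir; do
        [ -d "$dir" ] || printf 'missing: %s\n' "$dir"
    done
}
```

If missing_config_paths SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K.csv prints nothing, every listed directory exists. Any line it prints points at a path that still needs editing.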
To run cellranger multi, enter a command such as:
cellranger multi --id=Jurkat_Raji_10K \
    --csv=SC3_v3_NextGem_DI_CellPlex_Jurkat_Raji_10K.csv
Cell Ranger 6.0 should start with a message like this:
Martian Runtime - v4.0.2-15-g22715e4
Serving UI at http://bespin1.fuzzplex.com:...
Running preflight checks (please wait)...
Depending on your computational resources, it may take some time for the pipeline to complete. When it does, it should conclude with a message like this:
2021-02-13 15:24:18 [perform] Serializing pipestance performance data.
Waiting 6 seconds for UI to do final refresh.
Pipestance completed successfully!
2021-02-13 15:24:24 Shutting down.
2021-02-13 15:24:24 [jobmngr] Highest memory usage observed: {
  "rss": 517996544,
  "shared": 24014848,
  "vmem": 4006440960,
  "text": 35233792,
  "stack": 682795008,
  "proc_count": 54
}
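If you saved the terminal output to a file (for example by piping the command through tee), you can check the result without scrolling back. The sketch below assumes such a saved file exists. The helper names are hypothetical, and reading the "rss" figure as bytes is an assumption rather than something stated in this tutorial.

```shell
# Hypothetical helpers that read a saved copy of the terminal output.

# Succeeds (exit status 0) only if the completion message is present.
pipestance_succeeded() {
    grep -q 'Pipestance completed successfully!' "$1"
}

# Print the reported peak "rss" value in GiB.
# Assumption: the logged number is a byte count.
peak_rss_gib() {
    grep -o '"rss": *[0-9]*' "$1" |
        awk -F': *' '{ printf "%.2f\n", $2 / (1024 * 1024 * 1024) }'
}
```

Under that assumption, the rss value in the example message above (517996544) works out to roughly 0.48 GiB.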
Next, examine the output files.
cd Jurkat_Raji_10K/outs
tree
The tree command will list 73 directories with 96 files. If you are used to a cellranger count run, recall that multiplexing two samples doubles the per-sample outputs, and these numbers grow correspondingly as more samples are multiplexed into a single GEM well. Additionally, some output files apply to the entire experiment rather than to a specific CMO.
The first section of the outputs contains the config.csv file, a copy of the input config CSV, and a multi directory that holds a count directory and a multiplexing_analysis directory:
└─ config.csv
└─ multi
    ├── count
    │   ├── feature_reference.csv
    │   ├── raw_cloupe.cloupe
    │   ├── raw_feature_bc_matrix
    │   │   ├── barcodes.tsv.gz
    │   │   ├── features.tsv.gz
    │   │   └── matrix.mtx.gz
    │   ├── raw_feature_bc_matrix.h5
    │   ├── raw_molecule_info.h5
    │   ├── unassigned_alignments.bam
    │   └── unassigned_alignments.bam.bai
    └── multiplexing_analysis
        ├── assignment_confidence_table.csv
        ├── cells_per_tag.json
        ├── tag_calls_per_cell.csv
        └── tag_calls_summary.csv
For more information on these files, see Cell Multiplexing Outputs.
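As a quick sanity check on the demultiplexing, you can count how many cell barcodes were assigned to each CMO. The sketch below assumes that tag_calls_per_cell.csv has a header row containing a feature_call column and that rows contain no quoted commas. Verify both against your own file. The function name cells_per_tag is a hypothetical helper.

```shell
# Hypothetical helper; not part of Cell Ranger.
# Assumption: the CSV has a header row with a column named "feature_call".

# Print "<tag call>,<number of barcodes>" for each distinct call.
cells_per_tag() {
    awk -F',' '
        NR == 1 {
            for (i = 1; i <= NF; i++) {
                if ($i == "feature_call") col = i   # locate the column by name
            }
            next
        }
        col { n[$col]++ }
        END { for (t in n) print t "," n[t] }
    ' "$1" | LC_ALL=C sort
}

# Example, run from the outs directory:
# cells_per_tag multi/multiplexing_analysis/tag_calls_per_cell.csv
```

Because the two cell lines were pooled at equal proportions, the counts for CMO301 and CMO302 would be expected to be of similar size.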
The next part of the outs directory is per_sample_outs, which contains one directory each for Jurkat and Raji. For brevity, only the Jurkat outs are shown here.
In the Jurkat/count/analysis directory, the clustering directory contains CSV files with the results of graph-based clustering and of K-means clustering for K = 2 through 10:
└─ clustering
    ├── graphclust
    │   └── clusters.csv
    ├── kmeans_10_clusters
    │   └── clusters.csv
    ├── kmeans_2_clusters
    │   └── clusters.csv
    ├── kmeans_3_clusters
    │   └── clusters.csv
    ├── kmeans_4_clusters
    │   └── clusters.csv
    ├── kmeans_5_clusters
    │   └── clusters.csv
    ├── kmeans_6_clusters
    │   └── clusters.csv
    ├── kmeans_7_clusters
    │   └── clusters.csv
    ├── kmeans_8_clusters
    │   └── clusters.csv
    └── kmeans_9_clusters
        └── clusters.csv
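To see how two of these clusterings relate, you can cross-tabulate their assignments by barcode. The sketch below assumes each clusters.csv has a header row followed by barcode,cluster rows. The function name cluster_crosstab is a hypothetical helper.

```shell
# Hypothetical helper; not part of Cell Ranger.
# Assumption: each file has a header row, then "<barcode>,<cluster>" rows.

# Print "<cluster in file 1>,<cluster in file 2>,<shared barcodes>".
cluster_crosstab() {
    awk -F',' '
        FNR == 1 { next }                     # skip the header row of each file
        NR == FNR { first[$1] = $2; next }    # first file: barcode -> cluster
        ($1 in first) { pair[first[$1] "," $2]++ }
        END { for (k in pair) print k "," pair[k] }
    ' "$1" "$2" | sort -t',' -k1,1n -k2,2n
}

# Example, run from the clustering directory:
# cluster_crosstab graphclust/clusters.csv kmeans_2_clusters/clusters.csv
```

A table dominated by one large count per row suggests the two methods largely agree on how the cells group.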
The diffexp directory likewise contains CSV files containing the results of differential expression analysis between the clusters reported above:
└─ diffexp
    ├── graphclust
    │   └── differential_expression.csv
    ├── kmeans_10_clusters
    │   └── differential_expression.csv
    ├── kmeans_2_clusters
    │   └── differential_expression.csv
    ├── kmeans_3_clusters
    │   └── differential_expression.csv
    ├── kmeans_4_clusters
    │   └── differential_expression.csv
    ├── kmeans_5_clusters
    │   └── differential_expression.csv
    ├── kmeans_6_clusters
    │   └── differential_expression.csv
    ├── kmeans_7_clusters
    │   └── differential_expression.csv
    ├── kmeans_8_clusters
    │   └── differential_expression.csv
    └── kmeans_9_clusters
        └── differential_expression.csv
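One common use of these files is to list the most strongly enriched genes for a cluster. The sketch below assumes header names of the form "Feature Name", "Cluster N Log2 fold change", and "Cluster N Adjusted p value", and no quoted commas in the rows. Check the header of your own file before relying on it. The function name top_cluster_genes and the 0.05 cutoff are illustrative choices, not Cell Ranger defaults.

```shell
# Hypothetical helper; not part of Cell Ranger.
# Assumed header names: "Feature Name", "Cluster N Log2 fold change",
# and "Cluster N Adjusted p value".

# Usage: top_cluster_genes <file> <cluster number> [how many, default 10]
# Prints "<gene>,<log2 fold change>" for genes with adjusted p < 0.05,
# largest fold change first.
top_cluster_genes() {
    awk -F',' -v c="$2" '
        NR == 1 {
            for (i = 1; i <= NF; i++) {
                if ($i == "Feature Name") name = i
                if ($i == "Cluster " c " Log2 fold change") lfc = i
                if ($i == "Cluster " c " Adjusted p value") padj = i
            }
            next
        }
        name && lfc && padj && ($padj + 0) < 0.05 { print $name "," $lfc }
    ' "$1" | sort -t',' -k2,2nr | head -n "${3:-10}"
}

# Example, run from the diffexp directory:
# top_cluster_genes kmeans_2_clusters/differential_expression.csv 1 5
```

If the expected columns are not found, the function prints nothing, which is a hint that the header names differ from the ones assumed here.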
The pca, tsne and umap directories contain CSV files for dimensionality reduction:
├── pca
│   └── 10_components
│       ├── components.csv
│       ├── dispersion.csv
│       ├── features_selected.csv
│       ├── projection.csv
│       └── variance.csv
├── tsne
│   ├── 2_components
│   │   └── projection.csv
│   └── multiplexing_capture_2_components
│       └── projection.csv
└── umap
    ├── 2_components
    │   └── projection.csv
    └── multiplexing_capture_2_components
        └── projection.csv
The rest of the per_sample_outs are essentially the same as from cellranger count described elsewhere on this support site.
├── cloupe.cloupe
├── feature_reference.csv
├── sample_alignments.bam
├── sample_alignments.bam.bai
├── sample_barcodes.csv
├── sample_feature_bc_matrix
│   ├── barcodes.tsv.gz
│   ├── features.tsv.gz
│   └── matrix.mtx.gz
├── sample_feature_bc_matrix.h5
├── sample_molecule_info.h5
├── metrics_summary.csv
└── web_summary.html
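A simple way to confirm how many cell barcodes ended up in a sample is to count the lines in its barcodes.tsv.gz. The sketch below assumes the file holds one barcode per line with no header. The function name count_barcodes is a hypothetical helper.

```shell
# Hypothetical helper; not part of Cell Ranger.
# Assumption: barcodes.tsv.gz holds one barcode per line, with no header.

# Usage: count_barcodes <path to a feature-barcode matrix directory>
count_barcodes() {
    gzip -dc "$1/barcodes.tsv.gz" | wc -l | tr -d ' '
}

# Example: pass the path to the Jurkat sample_feature_bc_matrix directory.
# count_barcodes /path/to/sample_feature_bc_matrix
```

Running it on both the Jurkat and the Raji sample matrices gives a quick per-sample cell count to compare against the numbers shown in each web_summary.html.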