Cell Ranger 5.0, printed on 12/25/2024
Cell Ranger (versions 4.0 and later) includes targeted-depth, a lightweight program for summarizing a whole transcriptome analysis (WTA) dataset in the context of a hypothetical Targeted Gene Expression experiment. Given an existing WTA dataset and a target panel CSV file, targeted-depth computes the fraction of reads mapped to targeted genes from the panel. This metric can help in two ways when performing a Targeted Gene Expression experiment based on the analyzed whole transcriptome library (or one from a similar sample). In particular, the targeted-depth tool provides depth recommendations designed to take advantage of the efficiency enabled by targeting, while sequencing enough to match the sensitivity of the WTA dataset.
The targeted-depth tool requires two inputs: a molecule info H5 file (passed with --molecule-h5) and a target panel CSV file (passed with --target-panel), taken from the target_panels directory included with Cell Ranger or downloaded from the Panel Selection page. For more details on the use and structure of these files, see Count (Targeted GEX).
The cellranger targeted-depth command can be invoked as follows:
cellranger targeted-depth \
    --molecule-h5 sample345/outs/molecule_info.h5 \
    --target-panel /opt/cellranger-5.0.0/target_panels/pan_cancer_v1.0_GRCh38-2020-A.target_panel.csv
Instructions can also be seen by displaying the help text: cellranger targeted-depth -h.
The top section of output from cellranger targeted-depth contains metrics relating to the WTA input sample:
Whole Transcriptome Analysis (WTA) Input Sample Metrics:
---------------------------------------------------------
Estimated Number of Cells                           7,742
Number of Reads                               502,358,896
Mean Reads per Cell                                64,887
Sequencing Saturation                               85.2%
Fraction of Reads from Targeted Genes               4.79%
Number of Reads from Targeted Genes            24,045,376
Mean Reads per Cell from Targeted Genes             3,105
The first four metrics are Gene Expression metrics that do not involve the target gene panel, but are shown to give context to the other results.
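The targeted-gene figures in the sample report above are related to the read and cell counts by simple arithmetic. The following is a minimal sketch of that arithmetic, not the tool's own code: the function names are hypothetical, and the truncation of the per-cell value is an observation from this one example rather than a documented rule.

```shell
# Hypothetical helpers that reproduce two metrics from the sample report.

# targeted_fraction_pct READS_FROM_TARGETED_GENES TOTAL_READS
# Prints the targeted fraction as a percentage with two decimals.
targeted_fraction_pct() {
    awk -v t="$1" -v n="$2" 'BEGIN { printf "%.2f\n", 100 * t / n }'
}

# targeted_reads_per_cell READS_FROM_TARGETED_GENES ESTIMATED_CELLS
# Shell integer division truncates, which matches the report in this example.
targeted_reads_per_cell() {
    echo $(( $1 / $2 ))
}

targeted_fraction_pct 24045376 502358896    # prints 4.79
targeted_reads_per_cell 24045376 7742       # prints 3105
```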
The next three metrics quantify the sample's target gene content:
Fraction of Reads from Targeted Genes is the fraction of reads mapped to targeted genes from the panel.
Number of Reads from Targeted Genes is the corresponding number of reads.
Mean Reads per Cell from Targeted Genes is Number of Reads from Targeted Genes divided by Estimated Number of Cells. This number is the effective sequencing depth of the targeted portion of the whole transcriptome library.
The final section of the output indicates recommended sequencing depths for a Targeted Gene Expression library enriched from the analyzed whole transcriptome library (or one from a similar sample):
Targeted GEX Recommended Sequencing Depths:
---------------------------------------------------------
WTA Depth        Mean Reads per Cell          Total Reads
---------------------------------------------------------
Original                       6,211           48,090,752
20k rpc                        1,914           14,822,813
50k rpc                        4,786           37,057,032

The recommended Targeted Gene Expression sequencing depth is calculated as 2.0 * WTA Depth * Fraction of Reads from Targeted Genes. The 2.0 depth adjustment factor can help compensate for non-uniform read coverage and reads that cannot be mapped confidently to targeted genes. These are approximate estimates, and final results may vary.
The recommended depths in the two columns above are computed as follows, based on the targeted fraction and depth (Mean Reads per Cell) of the WTA sample:
Recommended Mean Reads per Cell = 2.0 * [WTA Depth] * [WTA Fraction of Reads from Targeted Genes]
Recommended Total Reads = [Recommended Mean Reads per Cell] * [WTA Estimated Number of Cells]
For the Original row, the recommended Mean Reads per Cell is also equal to twice the Mean Reads per Cell from Targeted Genes in the WTA sample (up to rounding). The depth adjustment factor of 2 is used to provide a conservative recommendation. For example, this WTA sample had 64,887 mean reads per cell (rpc) and 4.79% of reads from targeted genes, translating to a recommendation of 6,211 rpc for the Targeted Gene Expression library, or about 48 million reads if the number of recovered cells matches the WTA sample.
Recommendations are also given to approximately match the sensitivity of a whole transcriptome library sequenced to a depth of 20k rpc (the minimum recommended depth for whole transcriptome libraries) or 50k rpc.
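The three rows of the recommendation table can be reproduced from the raw counts in the sample report. The sketch below is illustrative only. The variable and function names are hypothetical, it works from the unrounded read counts rather than the rounded 64,887 and 4.79% shown in the report, and it assumes results are truncated to whole reads, which is inferred from the sample numbers and not stated by the tool.

```shell
# Counts copied from the sample report (hypothetical variable names).
cells=7742
reads_total=502358896
reads_targeted=24045376

# recommend_for_depth WTA_DEPTH
# Prints "mean-reads-per-cell total-reads" for a WTA depth in reads per cell.
# 2.0 * depth * fraction is evaluated in integer arithmetic, multiplying
# before dividing so nothing is rounded until the final division.
# Total reads uses the unrounded per-cell value times the cell count.
recommend_for_depth() {
    echo "$(( 2 * $1 * reads_targeted / reads_total )) $(( 2 * $1 * reads_targeted * cells / reads_total ))"
}

# "Original" row: the WTA depth is reads_total / cells, so reads_total cancels.
recommend_original() {
    echo "$(( 2 * reads_targeted / cells )) $(( 2 * reads_targeted ))"
}

recommend_original           # prints 6211 48090752
recommend_for_depth 20000    # prints 1914 14822813
recommend_for_depth 50000    # prints 4786 37057032
```

Plugging the rounded report values into the formula instead (2.0 * 64,887 * 0.0479) gives roughly 6,216, so small differences from the table are expected when working from the printed percentages.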
Output can be saved to a file instead of being printed to the console by appending > followed by a filename:
cellranger targeted-depth \
    --molecule-h5 sample345/outs/molecule_info.h5 \
    --target-panel /opt/cellranger-5.0.0/target_panels/pan_cancer_v1.0_GRCh38-2020-A.target_panel.csv \
    > sample345_pan_cancer_depth.txt
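Once the report is saved, individual metrics can be pulled out with standard text tools. The helper below is a hypothetical sketch, not part of Cell Ranger. It assumes the saved file has the same "label, whitespace, value" layout as the sample output shown earlier, which may differ between versions.

```shell
# metric_value REPORT_FILE "Exact Metric Label"
# Prints the value for a metric line, with thousands separators removed.
metric_value() {
    awk -v label="$2" '
        {
            line = $0
            sub(/^[ \t]+/, "", line)               # tolerate leading indentation
            if (index(line, label) != 1) next      # line must start with the label
            rest = substr(line, length(label) + 1)
            # Accept only "label<spaces>value", so that "Number of Reads"
            # does not also match "Number of Reads from Targeted Genes".
            if (rest ~ /^[ \t]+[^ \t]+[ \t]*$/) {
                gsub(/[ \t,]/, "", rest)
                print rest
                exit
            }
        }
    ' "$1"
}

# Example call (file name taken from the command above):
# metric_value sample345_pan_cancer_depth.txt "Fraction of Reads from Targeted Genes"
```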
Incompatible reference and target gene panel: The molecule info H5 file must be created using a reference genome compatible with the target gene panel. For the pre-designed gene panels, the required reference version is GRCh38 2020-A, which can be obtained from the Downloads page. An incompatible reference genome generates an error like this:
error: The gene ENSG00000286522 from the target panel csv is not present in the reference transcriptome used by the molecule info h5 file.
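When this error appears, one quick check is whether the reported gene ID occurs in the annotation of the reference that produced the molecule info file. The helper below is a hypothetical sketch. It assumes the reference annotation is an uncompressed GTF whose attributes contain gene_id "..." entries, and the path in the example comment is a placeholder, not a documented location.

```shell
# gene_in_gtf GENE_ID GTF_PATH
# Succeeds (exit status 0) if the GTF contains a gene_id attribute for GENE_ID.
gene_in_gtf() {
    grep -q "gene_id \"$1\"" "$2"
}

# Example call (placeholder path, adjust to the reference actually used):
# gene_in_gtf ENSG00000286522 /path/to/reference/genes/genes.gtf \
#     || echo "gene not found in this reference"
```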