Space Ranger 1.2, printed on 11/13/2024
Space Ranger's pipelines analyze sequencing data produced from Visium Spatial Gene Expression. The analysis involves the following steps:
1. Run spaceranger mkfastq on the Illumina BCL output folder to generate FASTQ files.
2. Run spaceranger count on each Capture Area that was demultiplexed by spaceranger mkfastq. For Targeted Gene Expression libraries, see Targeted Gene Expression Analysis for instructions on how to provide the target gene panel information.
For the following example, assume that the Illumina BCL output is in a folder named /sequencing/140101_D00123_0111_AHAWT7ADXX.
First, follow the instructions on running spaceranger mkfastq to generate FASTQ files. For example, if the flowcell serial number was HAWT7ADXX, then spaceranger mkfastq will output FASTQ files in HAWT7ADXX/outs/fastq_path.
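The relationship between the run folder name and the FASTQ location can be sketched in shell. This is an illustration only: the helper name fastq_path_for_run is hypothetical, and it assumes, as in the example above, that the last underscore-separated field of the run folder is a one-letter flowcell position followed by the flowcell serial number, and that spaceranger mkfastq was run with its output named after that serial number.

```shell
# Hypothetical helper: derive the expected FASTQ folder from a BCL run folder.
# Assumes the final "_"-separated field is <position letter><flowcell serial>,
# e.g. AHAWT7ADXX -> HAWT7ADXX, as in the example in this section.
fastq_path_for_run() {
    local name field
    name=$(basename "$1")     # 140101_D00123_0111_AHAWT7ADXX
    field="${name##*_}"       # AHAWT7ADXX
    echo "${field#?}/outs/fastq_path"
}

# Example:
#   fastq_path_for_run /sequencing/140101_D00123_0111_AHAWT7ADXX
```

Run folders from other instruments may not follow this naming pattern, so treat the result as a starting point and confirm that the folder exists.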
To generate spatial feature counts for a single library using automatic fiducial alignment and tissue detection, run spaceranger count with the following arguments. For a complete listing of the arguments accepted, see the Command Line Argument Reference below, or run spaceranger count --help.
For help on which arguments to use to target a particular set of FASTQs, consult Specifying Input FASTQ Files for 10x Pipelines.
After determining these input arguments, run spaceranger:
    $ cd /home/jdoe/runs
    $ spaceranger count --id=sample345 \
                        --transcriptome=/opt/refdata/GRCh38-3.0.0 \
                        --fastqs=/home/jdoe/runs/HAWT7ADXX/outs/fastq_path \
                        --sample=mysample \
                        --image=/home/jdoe/runs/images/sample345.tif \
                        --slide=V19J01-123 \
                        --area=A1 \
                        --localcores=8 \
                        --localmem=64
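Before launching a long run, it can help to confirm that the paths passed on the command line exist. The following is a minimal sketch and not part of Space Ranger: the function name check_count_inputs is hypothetical, it handles a single --fastqs path rather than a comma-separated list, and it only tests for existence, which is far less than the pipeline's own preflight checks.

```shell
# Hypothetical helper: check that the reference, FASTQ folder, and image exist.
# Prints each missing path to stderr and returns non-zero if any is missing.
check_count_inputs() {
    local transcriptome="$1" fastqs="$2" image="$3"
    local status=0
    [ -d "$transcriptome" ] || { echo "missing transcriptome: $transcriptome" >&2; status=1; }
    [ -d "$fastqs" ]        || { echo "missing FASTQ folder: $fastqs" >&2; status=1; }
    [ -f "$image" ]         || { echo "missing image: $image" >&2; status=1; }
    return "$status"
}

# Example, using the paths from this section:
#   check_count_inputs /opt/refdata/GRCh38-3.0.0 \
#       /home/jdoe/runs/HAWT7ADXX/outs/fastq_path \
#       /home/jdoe/runs/images/sample345.tif && echo "inputs look present"
```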
To generate spatial feature counts for a single library using a fiducial alignment and tissue assignment JSON file generated in Loupe Browser, run spaceranger count with the following arguments.
After determining these input arguments, run spaceranger:
    $ cd /home/jdoe/runs
    $ spaceranger count --id=sample345 \
                        --transcriptome=/opt/refdata/GRCh38-3.0.0 \
                        --fastqs=/home/jdoe/runs/HAWT7ADXX/outs/fastq_path \
                        --sample=mysample \
                        --image=/home/jdoe/runs/images/sample345.tif \
                        --slide=V19J01-123 \
                        --area=A1 \
                        --loupe-alignment=sample345.json \
                        --localcores=8 \
                        --localmem=64
Following a set of preflight checks to validate input arguments, spaceranger count pipeline stages will begin to run:
    Martian Runtime - 4.0.2
    Running preflight checks (please wait)...
    2016-11-10 14:23:52 [runtime] (ready) ID.sample345.SPATIAL_RNA_COUNTER_CS.SPATIAL_RNA_COUNTER_PREP.SETUP_CHUNKS
    2016-11-10 14:23:55 [runtime] (split_complete) ID.sample345.SPATIAL_RNA_COUNTER_CS.SPATIAL_RNA_COUNTER_PREP.SETUP_CHUNKS
    2016-11-10 14:23:55 [runtime] (run:local) ID.sample345.SPATIAL_RNA_COUNTER_CS.SPATIAL_RNA_COUNTER_PREP.SETUP_CHUNKS.fork0.chnk0.main
    ...
By default, spaceranger will use all of the cores available on your system to execute pipeline stages. You can specify a different number of cores to use with the --localcores option; for example, --localcores=16 will limit spaceranger to using up to sixteen cores at once. Similarly, --localmem will restrict the amount of memory (in GB) used by spaceranger.
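On a shared machine you may want to hold back a few cores rather than pick a fixed number. The sketch below shows one way to compute a value for --localcores; the helper name cores_to_use and the idea of reserving cores are assumptions for illustration, not Space Ranger behavior.

```shell
# Hypothetical helper: compute a --localcores value that leaves some cores free.
# $1 = total cores on the machine, $2 = cores to hold back for other work.
cores_to_use() {
    local total="$1" reserve="$2"
    local n=$((total - reserve))
    [ "$n" -ge 1 ] || n=1    # never request fewer than one core
    echo "$n"
}

# Example (getconf reports the number of online cores on most Unix systems):
#   spaceranger count ... --localcores="$(cores_to_use "$(getconf _NPROCESSORS_ONLN)" 2)"
```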
The pipeline will create a new folder named with the sample ID you specified (such as /home/jdoe/runs/sample345) for its output. If this folder already exists, spaceranger will assume it is an existing pipestance and attempt to resume running it.
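Because an existing folder changes what spaceranger does, a quick check before launching can avoid resuming a run by accident. This is a minimal sketch; the helper name run_mode is hypothetical, and it looks only at whether the folder exists, not at whether it is a valid pipestance.

```shell
# Hypothetical helper: report whether a run would start fresh or resume,
# based only on whether <run directory>/<id> already exists.
run_mode() {
    local run_dir="$1" id="$2"
    if [ -d "$run_dir/$id" ]; then
        echo "resume"
    else
        echo "new"
    fi
}

# Example:
#   run_mode /home/jdoe/runs sample345
```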
The spaceranger count pipeline accepts slide serial and capture area arguments, in order to use the most precise fiducial and spot coordinates for an experiment. The easiest way to pass this information to spaceranger count is via the --slide and --area arguments.
When --slide is specified, the pipeline will download the layout file associated with the supplied serial number. If spaceranger is run in an environment without access to the outside Internet, follow the instructions below in order to download a slide file locally.
If you do not know the serial number or capture area associated with the experiment, you can still run spaceranger via the --unknown-slide option. When specified, spaceranger will use a default layout file for spot and fiducial coordinates. The typical per-spot difference between the default layout and a specific slide is under 10 microns.
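The choice between --slide/--area and --unknown-slide can be captured in a small wrapper. The sketch below is illustrative only: the function name slide_args is hypothetical, and the accepted capture areas (A1, B1, C1, D1) are taken from the argument reference in this document.

```shell
# Hypothetical helper: print the slide-related arguments for spaceranger count.
# Falls back to --unknown-slide when the serial number or area is not known.
slide_args() {
    local slide="$1" area="$2"
    if [ -z "$slide" ] || [ -z "$area" ]; then
        echo "--unknown-slide"
        return 0
    fi
    case "$area" in
        A1|B1|C1|D1) echo "--slide=$slide --area=$area" ;;
        *) echo "unrecognized capture area: $area" >&2; return 1 ;;
    esac
}

# Example (left unquoted on purpose so the two arguments split into words):
#   spaceranger count ... $(slide_args V19J01-123 A1)
```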
If spaceranger is to be run in an environment without access to the Internet, the pipeline will require a Visium slide layout file via the --slidefile argument. You can download a layout file for a Visium slide below. Enter the serial number of the slide (such as V19S01-123) and press 'Download'. The layout file will start to download.
A successful spaceranger count run concludes with a message similar to this:
    2016-11-10 16:10:09 [runtime] (join_complete) ID.sample345.SPATIAL_RNA_COUNTER_CS.SPATIAL_RNA_COUNTER_CS.SUMMARIZE_REPORTS

    Outputs:
    - Run summary HTML: /opt/sample345/outs/web_summary.html
    - Outputs of spatial pipeline: /opt/sample345/outs/spatial
    - Run summary CSV: /opt/sample345/outs/metrics_summary.csv
    - BAM: /opt/sample345/outs/possorted_genome_bam.bam
    - BAM index: /opt/sample345/outs/possorted_genome_bam.bam.bai
    - Filtered feature-barcode matrices MEX: /opt/sample345/outs/filtered_feature_bc_matrix
    - Filtered feature-barcode matrices HDF5: /opt/sample345/outs/filtered_feature_bc_matrix.h5
    - Unfiltered feature-barcode matrices MEX: /opt/sample345/outs/raw_feature_bc_matrix
    - Unfiltered feature-barcode matrices HDF5: /opt/sample345/outs/raw_feature_bc_matrix.h5
    - Secondary analysis output CSV: /opt/sample345/outs/analysis
    - Per-molecule read information: /opt/sample345/outs/molecule_info.h5
    - Loupe Browser file: /opt/sample345/outs/cloupe.cloupe

    Pipestance completed successfully!
The output of the pipeline is contained in a folder named with the sample ID you specified (such as sample345). The subfolder named outs contains the main pipeline output files:
File Name | Description |
---|---|
web_summary.html | Run summary metrics and charts in HTML format |
spatial | Directory containing QC images for aligned fiducials and detected tissue in jpg format, scalefactors_json.json, high and low resolution versions of the input image in png format, and tissue_positions_list.csv |
spatial/aligned_fiducials.jpg | Aligned fiducials QC image |
spatial/detected_tissue_image.jpg | Detected tissue QC image |
spatial/tissue_hires_image.png | Full resolution image downsampled to 2,000 pixels on the longest dimension |
spatial/tissue_lowres_image.png | Full resolution image downsampled to 600 pixels on the longest dimension |
spatial/tissue_positions_list.csv | CSV containing the spot barcode, whether the spot was called in (1) or out (0) of tissue, the array position, and the image pixel positions x and y in the full resolution image |
spatial/scalefactors_json.json | Contains the spot diameter estimate in pixels for the full resolution original image, tissue_hires_scalef, which is the spot position multiplier in pixels for the high resolution image, the fiducial spot diameter estimate in pixels for the full resolution original image, and tissue_lowres_scalef, which is the spot position multiplier in pixels for the low resolution image |
metrics_summary.csv | Run summary metrics in CSV format |
possorted_genome_bam.bam | Reads aligned to the genome and transcriptome annotated with barcode information |
possorted_genome_bam.bam.bai | Index for possorted_genome_bam.bam |
filtered_feature_bc_matrix | Filtered feature-barcode matrices containing only spot barcodes in MEX format |
filtered_feature_bc_matrix.h5 | Filtered feature-barcode matrices containing only spot barcodes in HDF5 format |
raw_feature_bc_matrix | Unfiltered feature-barcode matrices containing all barcodes in MEX format |
raw_feature_bc_matrix.h5 | Unfiltered feature-barcode matrices containing all barcodes in HDF5 format |
analysis | Secondary analysis data including dimensionality reduction, spot clustering, and differential expression |
molecule_info.h5 | Molecule-level information used by spaceranger aggr to aggregate samples into larger datasets |
cloupe.cloupe | Loupe Browser visualization and analysis file |
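A quick way to confirm that a run produced its main outputs is to test for a handful of them in the outs folder. The sketch below is an assumption-laden illustration: the function name missing_outputs is hypothetical, the file names follow the closing message of a successful run shown earlier, and outputs that can be switched off (for example the BAM file with --no-bam, or secondary analysis with --nosecondary) are deliberately left out of the list.

```shell
# Hypothetical helper: print the names of core output files that are absent
# from an outs folder. Prints nothing when all listed outputs are present.
missing_outputs() {
    local outs="$1" f
    for f in web_summary.html metrics_summary.csv spatial \
             filtered_feature_bc_matrix.h5 raw_feature_bc_matrix.h5 \
             molecule_info.h5; do
        [ -e "$outs/$f" ] || echo "$f"
    done
}

# Example:
#   missing_outputs /home/jdoe/runs/sample345/outs
```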
Once spaceranger count has successfully completed, you can browse the resulting summary HTML file in any supported web browser, open the .cloupe file in Loupe Browser, or refer to the Understanding Output section to explore the data by hand.
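As a small example of exploring the data by hand, the number of spots called in tissue can be counted directly from tissue_positions_list.csv. This sketch assumes the column order given in the output table above (barcode first, in-tissue flag second) and a comma-separated file; the function name count_in_tissue is hypothetical.

```shell
# Hypothetical helper: count spots whose in-tissue flag (second column) is 1.
# Prints 0 for an empty file.
count_in_tissue() {
    awk -F, '$2 == 1 { n++ } END { print n + 0 }' "$1"
}

# Example:
#   count_in_tissue sample345/outs/spatial/tissue_positions_list.csv
```

If the result looks implausible, compare it against the detected tissue QC image in the same folder.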
Argument | Description |
---|---|
--id | A unique run ID string such as sample345. |
--fastqs | Either the path of the fastq_path folder generated by spaceranger mkfastq, such as /home/jdoe/runs/HAWT7ADXX/outs/fastq_path, which contains a directory hierarchy that spaceranger count will automatically traverse; OR any folder containing FASTQ files, for example if the FASTQ files were generated by a service provider and delivered outside the context of the mkfastq output directory structure. Can take multiple comma-separated paths, which is helpful if the same library was sequenced on multiple flowcells. Doing this will treat all reads from the library, across flowcells, as one sample. |
--sample | Sample name as specified in the sample sheet supplied to spaceranger mkfastq. Can take multiple comma-separated values, which is helpful if the same library was sequenced on multiple flowcells and the sample name used (and therefore the FASTQ file prefix) is not identical between them. Doing this will treat all reads from the library, across flowcells, as one sample. Allowable characters in sample names are letters, numbers, hyphens, and underscores. |
--transcriptome | Path to the Space Ranger compatible transcriptome reference such as /opt/GRCh38-3.0.0. |
--image | Brightfield tissue H&E image in .jpg or .tiff format. |
--darkimage | Multi-channel, dark-background fluorescence image as either a single, multi-layer .tiff file, multiple .tiff or .jpg files (provided by invoking the --darkimage parameter multiple times), or a pre-combined color .tiff or .jpg file. Details on image file constraints, encoding and formats are described in the Input Recommendations section. |
--colorizedimage | A color composite of one or more fluorescence image channels saved as a single-page, single-file color .tiff or .jpg. Please see the Input Recommendations section for information on input image file formats. |
--slide | Visium slide serial number. Required unless --unknown-slide is passed. |
--area | Visium capture area identifier. Required unless --unknown-slide is passed. Options for Visium are A1, B1, C1, D1. |
--slidefile | Slide layout file indicating capture spot and fiducial spot positions. |
--reorient-images | (optional) Use with automatic image alignment to specify that images may not be in canonical orientation with the hourglass in the top left corner of the image. The automatic fiducial alignment will attempt to align any rotation or mirroring of the image. |
--loupe-alignment | Alignment file produced by the manual Loupe alignment step. A --image must be supplied in this case. |
--unknown-slide | Set this if the slide serial number and area identifier are unknown. Setting this will cause Space Ranger to use default spot positions. Not compatible with --slide, --area, or --slidefile. |
--target-panel | Path to a Target Panel CSV file declaring the target panel used, if any. Required for Targeted Gene Expression analysis. See Targeted Gene Expression Analysis for details. |
--rps-limit | (optional) Subsample to at most N mean reads under tissue per spot for targeted gene expression if N > 0, or disable subsampling if N = 0. Modifying this parameter is not recommended. The default value of N is 15,000 reads under tissue per spot for targeted gene expression. See Targeted Algorithms for details. |
--no-bam | (optional) Do not generate a BAM file. |
--nosecondary | (optional) Disable secondary analysis, such as dimensionality reduction, clustering and visualization. |
--r1-length | (optional) Hard-trim the input R1 sequence to this length. Note that the length includes the Barcode and UMI sequences so do not set this below 28. This and --r2-length are useful for determining the optimal read length for sequencing. |
--r2-length | (optional) Hard-trim the input R2 sequence to this length. |
--lanes | (optional) Lanes associated with this sample. |
--localcores | Restricts spaceranger to the specified number of cores when executing pipeline stages. By default, spaceranger will use all of the cores available on your system. |
--localmem | Restricts spaceranger to the specified amount of memory (in GB) when executing pipeline stages. By default, spaceranger will use 90% of the memory available on your system. |
--indices | (Deprecated. Optional. Only used for output from spaceranger demux) Sample indices associated with this sample. Comma-separated list of: |
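The character rule for sample names in the table above (letters, numbers, hyphens, and underscores) is easy to check before a run. The following is a minimal sketch: the function name valid_sample_name is hypothetical, and it checks one name at a time, so a comma-separated --sample value would need to be split first.

```shell
# Hypothetical helper: succeed only if the name is non-empty and contains
# nothing but letters, digits, hyphens, and underscores.
valid_sample_name() {
    case "$1" in
        ''|*[!A-Za-z0-9_-]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Example:
#   valid_sample_name mysample && echo "ok"
```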