Cell Ranger 2.0, printed on 11/23/2024
The cellranger vdj pipeline produces V(D)J annotations on the assembled contigs and on the clonotype consensus sequences in multiple formats.
File type | Description
|-
CSV | High-level annotations with one contig, consensus, or clonotype per row.
JSON | Detailed annotations, including alignment coordinates and amino acid translations.
BED | Germline V(D)J segments as features, for use with a tool like IGV.
File | Description
|-
clonotypes.csv | High-level descriptions of each clonotype.
consensus_annotations.{csv,json} | High-level and detailed annotations of each clonotype consensus sequence.
filtered_contig_annotations.csv | High-level annotations of each high-confidence, cellular contig. This is a subset of all_contig_annotations.csv.
all_contig_annotations.{csv,bed,json} | High-level and detailed annotations of each contig.
Column | Description |
---|---|
clonotype_id | The ID of the clonotype to which this consensus sequence was assigned. |
frequency | The observed number of cell-barcodes with this clonotype. |
proportion | The observed fraction of cell-barcodes with this clonotype. |
cdr3s_aa | A semicolon-delimited list of chain:sequence pairs, where "chain" is e.g., TRA or TRB and "sequence" is the CDR3 amino acid sequence for that chain. |
cdr3s_nt | A semicolon-delimited list of chain:sequence pairs, where "chain" is e.g., TRA or TRB and "sequence" is the CDR3 nucleotide sequence for that chain. |
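The cdr3s_aa and cdr3s_nt values pack several chains into one field, so they usually need splitting before use. The following is a minimal sketch of one way to do that, assuming the field follows the semicolon-delimited chain:sequence layout described above. The helper name parse_cdr3s and the example sequences are illustrative and are not taken from Cell Ranger output.

```python
def parse_cdr3s(field):
    """Split a cdr3s_aa or cdr3s_nt value into (chain, sequence) pairs.

    Assumes the layout described in the table: items separated by ';',
    each item written as chain:sequence.
    """
    pairs = []
    for item in field.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing semicolon or an empty field
        chain, _, sequence = item.partition(":")
        pairs.append((chain, sequence))
    return pairs


# Made-up example value, for illustration only.
example = "TRA:CAVSDF;TRB:CASSLF"
for chain, sequence in parse_cdr3s(example):
    print(chain, sequence)
```

Returning a list of pairs rather than a dictionary keeps both entries when a clonotype carries two sequences for the same chain.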
Column | Description |
---|---|
barcode | Cell-barcode for this contig. |
is_cell | True/False value indicating whether the barcode was called as a cell. |
contig_id | Unique identifier for this contig. |
high_confidence | True/False value indicating whether the contig was called as high-confidence (unlikely to be a chimeric sequence or some other artifact). |
length | The contig sequence length in nucleotides. |
chain | The chain associated with this contig; e.g., "TRA" or "TRB". A value of "Multi" indicates that segments from multiple chains were present. |
v_gene | The highest-scoring V segment, e.g., TRAV1-1. |
d_gene | The highest-scoring D segment, e.g., TRBD1. |
j_gene | The highest-scoring J segment, e.g., TRAJ1-1. |
c_gene | The highest-scoring C segment, e.g., TRAC. |
full_length | The sequence spans the 5′ end of V to the 3′ end of J. |
productive | True/False/None value indicating whether the transcript is predicted to translate to a protein with a CDR3 region. "None" indicates that the contig does not span the 5′ end of a V region to the 3′ end of a J region, and so the productivity of the transcript could not be determined. In addition to being V-J spanning, the sequence must have a detectable CDR3 region, have a start codon in the expected part of the V sequence, and have no stop codons in the V-J region. |
cdr3 | The predicted CDR3 amino acid sequence. |
cdr3_nt | The predicted CDR3 nucleotide sequence. |
reads | The number of reads aligned to this contig. |
umis | The number of distinct UMIs aligned to this contig. |
raw_clonotype_id | The ID of the clonotype to which this cell-barcode was assigned. |
raw_consensus_id | The ID of the consensus sequence to which this contig was assigned. |
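As a rough illustration of how the is_cell, high_confidence, and productive columns can be combined, the sketch below keeps only contigs for which all three are true. It assumes the flags are stored as the text values True, False, and None, as described in the table. The sample rows, the reduced set of columns, and the function names are made up for the example; a real file would be opened with open() instead of io.StringIO.

```python
import csv
import io

# Made-up rows using a subset of the columns described above.
SAMPLE = """barcode,is_cell,contig_id,high_confidence,length,chain,productive,cdr3,umis
AAAC-1,True,AAAC-1_contig_1,True,520,TRA,True,CAVSDF,4
AAAC-1,True,AAAC-1_contig_2,True,610,TRB,None,None,2
TTTG-1,False,TTTG-1_contig_1,False,480,Multi,False,None,1
"""


def is_true(value):
    """Treat only the text 'True' (any letter case) as true; 'False' and 'None' are not."""
    return value.strip().lower() == "true"


def productive_cell_contigs(handle):
    """Return rows that are cellular, high-confidence, and productive."""
    reader = csv.DictReader(handle)
    return [
        row
        for row in reader
        if is_true(row["is_cell"])
        and is_true(row["high_confidence"])
        and is_true(row["productive"])
    ]


rows = productive_cell_contigs(io.StringIO(SAMPLE))
print([row["contig_id"] for row in rows])
```

Comparing against the text "True" explicitly matters because productive has three states, and a contig marked None should not be counted as productive.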
Column | Description |
---|---|
clonotype_id | The ID of the clonotype to which this consensus sequence was assigned. |
consensus_id | The ID of this consensus sequence. |
The remaining columns are the same as those described for the contig annotation CSV files.