10x Genomics
Chromium Single Cell Multiome ATAC + Gene Exp.

Cell Ranger ARC 2.0, printed on 03/03/2025

Specifying Input FASTQ Files for cellranger-arc count

The cellranger-arc count pipeline requires ATAC and GEX FASTQ files as input, which typically come from running cellranger-arc mkfastq, a 10x Genomics-aware convenience wrapper for bcl2fastq. However, it is possible to use FASTQ files from other sources, such as Illumina's bcl2fastq or BCL Convert, a published dataset, or bamtofastq. Input FASTQ files must conform to the naming conventions of bcl2fastq and mkfastq for cellranger-arc count to successfully complete. These files are specified using a libraries CSV file and passed to the cellranger-arc count pipeline using the --libraries argument.
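
As a minimal sketch of what such a libraries CSV contains, the snippet below renders the three columns used throughout this page (fastqs, sample, library_type) from a list of rows. The function name, the paths, and the sample names are hypothetical and only illustrate the layout; they are not part of Cell Ranger ARC.

```python
import csv
import io

# Column names as they appear in the libraries CSV examples on this page.
HEADER = ("fastqs", "sample", "library_type")


def libraries_csv_text(rows):
    """Render (fastqs, sample, library_type) tuples as libraries CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(HEADER)
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical paths and sample names, for illustration only.
example_rows = [
    ("/PATH/TO/GEX_FASTQS", "MySample_GEX", "Gene Expression"),
    ("/PATH/TO/ATAC_FASTQS", "MySample_ATAC", "Chromatin Accessibility"),
]
```

The returned text can be saved to a file (for example with `pathlib.Path("libraries.csv").write_text(...)`, where the file name is an assumption) and that file passed to cellranger-arc count with the --libraries argument.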

The cellranger-arc count pipeline can process data from one Multiome ATAC library and one Multiome GEX library, each of which could be sequenced on multiple flow cells. Multi-library analysis is not possible at this time. cellranger-arc count must not be used to process GEX or ATAC data alone.
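
Because both library types are required, it may be worth checking a libraries CSV before launching a run. The sketch below is a hypothetical pre-flight check, not something Cell Ranger ARC provides; it assumes the CSV has the fastqs, sample, library_type header shown on this page and reports which of the two library types is missing.

```python
import csv
import io

# The two library_type values used in the examples on this page.
REQUIRED_TYPES = {"Gene Expression", "Chromatin Accessibility"}


def missing_library_types(csv_text):
    """Return the required library types that do not appear in the CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    seen = {row["library_type"].strip() for row in reader}
    return REQUIRED_TYPES - seen
```

An empty result means both a GEX and an ATAC entry are present; a non-empty result names the library type that still needs to be added.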

bcl2fastq, BCL Convert, and mkfastq can each be invoked in several ways, which produces a wide range of possible output file names and locations. Because finding the right FASTQ files and the right arguments for processing them can be confusing, some common scenarios are illustrated below.

FASTQ file naming convention

To serve as inputs for Cell Ranger ARC, FASTQ files should conform to the naming conventions of bcl2fastq and mkfastq described below.

GEX FASTQs

[Sample Name]_S1_L00[Lane Number]_[Read Type]_001.fastq.gz

Where Read Type is one of:

I1: Dual index i7 read (optional)
I2: Dual index i5 read (optional)
R1: Read 1
R2: Read 2

ATAC FASTQs

[Sample Name]_S1_L00[Lane Number]_[Read Type]_001.fastq.gz

Where Read Type is one of:

I1: Dual index i7 read (optional)
R1: Read 1
R2: Dual index i5 read
R3: Read 2

Cell Ranger ARC will also accept ATAC FASTQs in this format:

I1: Dual index i7 read (optional)
R1: Read 1
I2: Dual index i5 read
R2: Read 2
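
The naming pattern above can be checked mechanically. The following is a minimal sketch, assuming the sample-order field may be any number (the trees on this page show S0, S1, S2, and so on) and that the read type is one of I1, I2, R1, R2, or R3; the regular expression and function name are illustrative and are not taken from Cell Ranger ARC.

```python
import re

# [Sample Name]_S[order]_L[lane]_[Read Type]_[chunk].fastq.gz
FASTQ_NAME = re.compile(
    r"^(?P<sample>.+)_S(?P<order>\d+)_L(?P<lane>\d{3})"
    r"_(?P<read>I[12]|R[123])_(?P<chunk>\d{3})\.fastq\.gz$"
)


def parse_fastq_name(filename):
    """Split a bcl2fastq-style FASTQ file name into its fields.

    Returns a dict, or None when the name does not follow the convention.
    """
    match = FASTQ_NAME.match(filename)
    if match is None:
        return None
    fields = match.groupdict()
    fields["order"] = int(fields["order"])
    fields["lane"] = int(fields["lane"])
    return fields
```

A return value of None flags a file that would need renaming before it can be used as input.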

Common scenarios for GEX FASTQ files

Where are your GEX FASTQ files?

In an output folder from cellranger-arc mkfastq or bcl2fastq (fastq_path) and:
- In a subdirectory next to Reports and Stats folders, with expected sample name prefixes.
- In the same directory as the Reports and Stats folders.
In a different folder:
- The files are named like "MySample_S1_L001_I1_001.fastq.gz". I don't see Reports or Stats anywhere.

How are your GEX FASTQ files named?

Consistent with bcl2fastq/mkfastq, e.g. "MySample_S1_L001_I1_001.fastq.gz"
Unlike any of the above examples.

Scenario: My GEX FASTQs are in an output folder from mkfastq or bcl2fastq, in a subdirectory next to Reports and Stats folders, with expected sample name prefixes

How did I get here?

By running cellranger-arc mkfastq with a simple CSV layout file or Illumina Experiment Manager samplesheet, or by running bcl2fastq directly (with an IEM samplesheet) on a flow cell.

If you ran mkfastq on the GEX flow cell

Your files will be in a (MKFASTQ_ID)/outs/fastq_path folder, and the file hierarchy may look similar to this:

MKFASTQ_ID
|-- MAKE_FASTQS_CS
`-- outs
    |-- fastq_path
        |-- HFLC5BBXX
            |-- test_sample1
            |   |-- test_sample1_S1_L001_I1_001.fastq.gz
            |   |-- test_sample1_S1_L001_I2_001.fastq.gz
            |   |-- test_sample1_S1_L001_R1_001.fastq.gz
            |   |-- test_sample1_S1_L001_R2_001.fastq.gz
            |   |-- test_sample1_S1_L002_I1_001.fastq.gz
            |   |-- test_sample1_S1_L002_I2_001.fastq.gz
            |   |-- test_sample1_S1_L002_R1_001.fastq.gz
            |   |-- test_sample1_S1_L002_R2_001.fastq.gz
            |   |-- test_sample1_S1_L003_I1_001.fastq.gz
            |   |-- test_sample1_S1_L003_I2_001.fastq.gz
            |   |-- test_sample1_S1_L003_R1_001.fastq.gz
            |   `-- test_sample1_S1_L003_R2_001.fastq.gz
            |-- test_sample2
            |   |-- test_sample2_S2_L001_I1_001.fastq.gz
            |   |-- test_sample2_S2_L001_I2_001.fastq.gz
            |   |-- test_sample2_S2_L001_R1_001.fastq.gz
            |   |-- test_sample2_S2_L001_R2_001.fastq.gz
            |   |-- test_sample2_S2_L002_I1_001.fastq.gz
            |   |-- test_sample2_S2_L002_I2_001.fastq.gz
            |   |-- test_sample2_S2_L002_R1_001.fastq.gz
            |   |-- test_sample2_S2_L002_R2_001.fastq.gz
            |   |-- test_sample2_S2_L003_I1_001.fastq.gz
            |   |-- test_sample2_S2_L003_I2_001.fastq.gz
            |   |-- test_sample2_S2_L003_R1_001.fastq.gz
            |   `-- test_sample2_S2_L003_R2_001.fastq.gz
        |-- Reports
        |-- Stats
        |-- Undetermined_S0_L001_I1_001.fastq.gz
        ...
        `-- Undetermined_S0_L003_R2_001.fastq.gz

If you ran bcl2fastq directly on the GEX flow cell

Your file hierarchy may look similar to this:

BCL2FASTQ_OUTPUT_DIR
|-- HFLC5BBXX
    |-- test_sample1
    |   |-- test_sample1_S1_L001_I1_001.fastq.gz
    |   |-- test_sample1_S1_L001_I2_001.fastq.gz
    |   |-- test_sample1_S1_L001_R1_001.fastq.gz
    |   |-- test_sample1_S1_L001_R2_001.fastq.gz
    |   |-- test_sample1_S1_L002_I1_001.fastq.gz
    |   |-- test_sample1_S1_L002_I2_001.fastq.gz
    |   |-- test_sample1_S1_L002_R1_001.fastq.gz
    |   |-- test_sample1_S1_L002_R2_001.fastq.gz
    |   |-- test_sample1_S1_L003_I1_001.fastq.gz
    |   |-- test_sample1_S1_L003_I2_001.fastq.gz
    |   |-- test_sample1_S1_L003_R1_001.fastq.gz
    |   `-- test_sample1_S1_L003_R2_001.fastq.gz
    |-- test_sample2
    |   |-- test_sample2_S2_L001_I1_001.fastq.gz
    |   |-- test_sample2_S2_L001_I2_001.fastq.gz
    |   |-- test_sample2_S2_L001_R1_001.fastq.gz
    |   |-- test_sample2_S2_L001_R2_001.fastq.gz
    |   |-- test_sample2_S2_L002_I1_001.fastq.gz
    |   |-- test_sample2_S2_L002_I2_001.fastq.gz
    |   |-- test_sample2_S2_L002_R1_001.fastq.gz
    |   |-- test_sample2_S2_L002_R2_001.fastq.gz
    |   |-- test_sample2_S2_L003_I1_001.fastq.gz
    |   |-- test_sample2_S2_L003_I2_001.fastq.gz
    |   |-- test_sample2_S2_L003_R1_001.fastq.gz
    |   `-- test_sample2_S2_L003_R2_001.fastq.gz
...

You will have one set of FASTQ files per sample, prefixed with the name of the sample as it appears in the simple CSV layout file or IEM samplesheet.

For more information on the naming conventions, please visit Illumina's support site or refer to the bcl2fastq User Guide. The scenario where your files do not conform to the naming convention is described in a different section later on this page.

The table below describes the line in the libraries CSV file you would use in the corresponding scenario. Be sure to substitute the capitalized text as appropriate. The "All Samples" entries in this table are provided for technical completeness.

Situation Line in libraries CSV

All samples (mkfastq)

fastqs,sample,library_type
/PATH/TO/MKFASTQ_ID/outs/fastq_path,,Gene Expression
...

All samples (mkfastq), multiple flow cells

fastqs,sample,library_type
/PATH/TO/MKFASTQ_FLOWCELL1/outs/fastq_path,,Gene Expression
/PATH/TO/MKFASTQ_FLOWCELL2/outs/fastq_path,,Gene Expression
...

All samples (bcl2fastq direct)

fastqs,sample,library_type
/PATH/TO/BCL2FASTQ_OUTPUT_DIR,,Gene Expression
...

Process test_sample1 (mkfastq)

fastqs,sample,library_type
/PATH/TO/MKFASTQ_ID/outs/fastq_path,test_sample1,Gene Expression
...

Process test_sample1 and test_sample2 as a single merged sample (mkfastq)

fastqs,sample,library_type
/PATH/TO/MKFASTQ_ID/outs/fastq_path,test_sample1,Gene Expression
/PATH/TO/MKFASTQ_ID/outs/fastq_path,test_sample2,Gene Expression
...
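
The multi-flow-cell and merged-sample situations both amount to writing one line per FASTQ folder and sample name. As a rough sketch, the hypothetical helper below builds those lines as tuples. Pairing every folder with every sample name is an assumption made for illustration; the situations above only show multiple folders with a blank sample, or multiple samples in one folder.

```python
from itertools import product


def library_rows(fastq_dirs, samples=("",), library_type="Gene Expression"):
    """Build one (fastqs, sample, library_type) row per folder and sample.

    A blank sample name corresponds to the "All samples" situations.
    """
    return [
        (fastq_dir, sample, library_type)
        for fastq_dir, sample in product(fastq_dirs, samples)
    ]
```

For example, `library_rows(["/PATH/TO/MKFASTQ_ID/outs/fastq_path"], ["test_sample1", "test_sample2"])` yields the two rows of the merged-sample situation.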

Scenario: My GEX FASTQs are in an output folder from mkfastq or bcl2fastq, in the same directory as the Reports and Stats folders

How did I get here?

An Illumina Experiment Manager-formatted samplesheet was used with either no entry or a blank entry for the Sample_Project column. Your hierarchy may look similar to this:

fastq_path
|-- Reports
|-- Stats
|-- test_sample_S1_L001_I1_001.fastq.gz
|-- test_sample_S1_L001_I2_001.fastq.gz
|-- test_sample_S1_L001_R1_001.fastq.gz
|-- test_sample_S1_L001_R2_001.fastq.gz
|-- test_sample_S1_L002_I1_001.fastq.gz
|-- test_sample_S1_L002_I2_001.fastq.gz
|-- test_sample_S1_L002_R1_001.fastq.gz
|-- test_sample_S1_L002_R2_001.fastq.gz
|-- test_sample_S1_L003_I1_001.fastq.gz
|-- test_sample_S1_L003_I2_001.fastq.gz
|-- test_sample_S1_L003_R1_001.fastq.gz
|-- test_sample_S1_L003_R2_001.fastq.gz
|-- Undetermined_S0_L001_I1_001.fastq.gz
...
`-- Undetermined_S0_L003_R2_001.fastq.gz

This is fine; you would use the same arguments as if the FASTQs were organized into subfolders within the output folder.

Situation Line in libraries CSV

All samples (mkfastq)

fastqs,sample,library_type
/PATH/TO/MKFASTQ_ID/outs/fastq_path,,Gene Expression
...

All samples (bcl2fastq direct)

fastqs,sample,library_type
/PATH/TO/BCL2FASTQ_OUTPUT_DIR,,Gene Expression
...

Process test_sample only (mkfastq)

fastqs,sample,library_type
/PATH/TO/MKFASTQ_ID/outs/fastq_path,test_sample,Gene Expression
...

Scenario: The GEX FASTQs are named like "MySample_S1_L001_I1_001.fastq.gz". I don't see Reports or Stats anywhere

How did I get here?

It is likely that the FASTQ files were transferred from a mkfastq or bcl2fastq run into another folder. They still retain the names assigned by bcl2fastq, which combine sample name, sample order, lane, read type, and chunk. Your file hierarchy may look like this:

PROJECT_FOLDER
|-- MySample_S1_L001_I1_001.fastq.gz
|-- MySample_S1_L001_I2_001.fastq.gz
|-- MySample_S1_L001_R1_001.fastq.gz
|-- MySample_S1_L001_R2_001.fastq.gz
|-- MySample_S1_L002_I1_001.fastq.gz
|-- MySample_S1_L002_I2_001.fastq.gz
|-- MySample_S1_L002_R1_001.fastq.gz
|-- MySample_S1_L002_R2_001.fastq.gz

This is fine; since the files are named according to the bcl2fastq standard, you would use the same arguments as if the FASTQs were organized into a flow cell folder or mkfastq output folder.

Situation Line in libraries CSV

All samples

fastqs,sample,library_type
/PATH/TO/PROJECT_FOLDER,,Gene Expression
...

Process MySample only

fastqs,sample,library_type
/PATH/TO/PROJECT_FOLDER,MySample,Gene Expression
...
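
When files have been copied into a flat folder, it can help to list which sample names, lanes, and read types are actually present before filling in the sample column. The sketch below does this from a list of file names. The regular expression and function name are assumptions for illustration, and files that do not match the bcl2fastq convention are skipped.

```python
import re
from collections import defaultdict

# [Sample Name]_S[order]_L[lane]_[Read Type]_[chunk].fastq.gz
PATTERN = re.compile(r"^(.+)_S\d+_L(\d{3})_(I[12]|R[123])_\d{3}\.fastq\.gz$")


def summarize_samples(filenames):
    """Map each sample name to the lanes and read types found for it."""
    summary = defaultdict(lambda: {"lanes": set(), "reads": set()})
    for name in filenames:
        match = PATTERN.match(name)
        if match is None:
            continue  # not named in the bcl2fastq convention
        sample, lane, read = match.groups()
        summary[sample]["lanes"].add(int(lane))
        summary[sample]["reads"].add(read)
    return dict(summary)
```

The file names could come from something like `(p.name for p in pathlib.Path("/PATH/TO/PROJECT_FOLDER").iterdir())`.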

My GEX FASTQs are not named like any of the above examples.

How did I get here?

It is likely that you received files that were processed through a proprietary LIMS, which employs its own naming conventions.

10x Genomics pipelines require files to be named in the bcl2fastq convention in order to run properly. You will need to determine the corresponding sample and read type for each file, likely by consulting your sequencing core or the individual who demultiplexed your flow cell.

It is highly likely that these files were initially processed with bcl2fastq. Once you have traced the origin of each file, rename the files in the following format:

[Sample Name]_S1_L00[Lane Number]_[Read Type]_001.fastq.gz

Where Read Type is one of:

I1: Dual index i7 read (optional)
I2: Dual index i5 read (optional)
R1: Read 1
R2: Read 2

After the files have been renamed in the specified format, you will use the following arguments:

Situation Line in libraries CSV

All samples

fastqs,sample,library_type
/PATH/TO/PROJECT_FOLDER,,Gene Expression
...

Process SAMPLENAME only

fastqs,sample,library_type
/PATH/TO/PROJECT_FOLDER,SAMPLENAME,Gene Expression
...
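
Once the sample, lane, and read type of each file are known, building the target names is mechanical. The following is a minimal sketch for GEX files, assuming you supply the mapping yourself. The function names and the LIMS-style file names in the usage note are hypothetical, and nothing is renamed on disk.

```python
# Read types listed for GEX FASTQs on this page.
GEX_READ_TYPES = {"I1", "I2", "R1", "R2"}


def bcl2fastq_name(sample, lane, read_type):
    """Build [Sample Name]_S1_L00[Lane Number]_[Read Type]_001.fastq.gz."""
    if read_type not in GEX_READ_TYPES:
        raise ValueError(f"unexpected GEX read type: {read_type}")
    return f"{sample}_S1_L{lane:03d}_{read_type}_001.fastq.gz"


def plan_renames(mapping):
    """Turn {old_name: (sample, lane, read_type)} into (old, new) pairs."""
    return [
        (old_name, bcl2fastq_name(sample, lane, read_type))
        for old_name, (sample, lane, read_type) in mapping.items()
    ]
```

For instance, `plan_renames({"run42_A_read1.fq.gz": ("SAMPLENAME", 1, "R1")})` returns the pair to rename, which can be reviewed before applying it with a tool of your choice.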

Common scenarios for ATAC FASTQ files

Where are your ATAC FASTQ files?

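In an output folder from cellranger-arc mkfastq or bcl2fastq (fastq_path) and:
- In a subdirectory next to Reports and Stats folders, with expected sample name prefixes.
- In a subdirectory next to Reports and Stats folders, but with multiple folders per sample index, like "SI-GA-A1_1" and "SI-GA-A1_2".
- In the same directory as the Reports and Stats folders.
In a different folder:
- The files are named like "MySample_S1_L001_I1_001.fastq.gz". I don't see Reports or Stats anywhere.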

How are your ATAC FASTQ files named?

Consistent with bcl2fastq/mkfastq, e.g. "MySample_S1_L001_I1_001.fastq.gz"
Unlike any of the above examples.

Scenario: My ATAC FASTQs are in an output folder from mkfastq or bcl2fastq, in a subdirectory next to Reports and Stats folders, with expected sample name prefixes

How did I get here?

By running cellranger-arc mkfastq with a simple CSV layout file or Illumina Experiment Manager samplesheet, or by running bcl2fastq directly (with an IEM samplesheet) on a flow cell.

If you ran mkfastq on the ATAC flow cell

Your files will be in a (MKFASTQ_ID)/outs/fastq_path folder, and your file hierarchy may look similar to this:

MKFASTQ_ID
|-- MAKE_FASTQS_CS
`-- outs
    |-- fastq_path
        |-- HFLC5BBXX
            |-- test_sample1
            |   |-- test_sample1_S1_L001_I1_001.fastq.gz
            |   |-- test_sample1_S1_L001_R1_001.fastq.gz
            |   |-- test_sample1_S1_L001_R2_001.fastq.gz
            |   |-- test_sample1_S1_L001_R3_001.fastq.gz
            |   |-- test_sample1_S1_L002_I1_001.fastq.gz
            |   |-- test_sample1_S1_L002_R1_001.fastq.gz
            |   |-- test_sample1_S1_L002_R2_001.fastq.gz
            |   |-- test_sample1_S1_L002_R3_001.fastq.gz
            |   |-- test_sample1_S1_L003_I1_001.fastq.gz
            |   |-- test_sample1_S1_L003_R1_001.fastq.gz
            |   |-- test_sample1_S1_L003_R2_001.fastq.gz
            |   `-- test_sample1_S1_L003_R3_001.fastq.gz
            |-- test_sample2
            |   |-- test_sample2_S2_L001_I1_001.fastq.gz
            |   |-- test_sample2_S2_L001_R1_001.fastq.gz
            |   |-- test_sample2_S2_L001_R2_001.fastq.gz
            |   |-- test_sample2_S2_L001_R3_001.fastq.gz
            |   |-- test_sample2_S2_L002_I1_001.fastq.gz
            |   |-- test_sample2_S2_L002_R1_001.fastq.gz
            |   |-- test_sample2_S2_L002_R2_001.fastq.gz
            |   |-- test_sample2_S2_L002_R3_001.fastq.gz
            |   |-- test_sample2_S2_L003_I1_001.fastq.gz
            |   |-- test_sample2_S2_L003_R1_001.fastq.gz
            |   |-- test_sample2_S2_L003_R2_001.fastq.gz
            |   `-- test_sample2_S2_L003_R3_001.fastq.gz
        |-- Reports
        |-- Stats
        |-- Undetermined_S0_L001_I1_001.fastq.gz
        ...
        `-- Undetermined_S0_L003_R3_001.fastq.gz

If you ran bcl2fastq directly on the ATAC flow cell

Your file hierarchy may look similar to this:

BCL2FASTQ_OUTPUT_DIR
|-- HFLC5BBXX
    |-- test_sample1
    |   |-- test_sample1_S1_L001_I1_001.fastq.gz
    |   |-- test_sample1_S1_L001_R1_001.fastq.gz
    |   |-- test_sample1_S1_L001_R2_001.fastq.gz
    |   |-- test_sample1_S1_L001_R3_001.fastq.gz
    |   |-- test_sample1_S1_L002_I1_001.fastq.gz
    |   |-- test_sample1_S1_L002_R1_001.fastq.gz
    |   |-- test_sample1_S1_L002_R2_001.fastq.gz
    |   |-- test_sample1_S1_L002_R3_001.fastq.gz
    |   |-- test_sample1_S1_L003_I1_001.fastq.gz
    |   |-- test_sample1_S1_L003_R1_001.fastq.gz
    |   |-- test_sample1_S1_L003_R2_001.fastq.gz
    |   `-- test_sample1_S1_L003_R3_001.fastq.gz
    |-- test_sample2
    |   |-- test_sample2_S2_L001_I1_001.fastq.gz
    |   |-- test_sample2_S2_L001_R1_001.fastq.gz
    |   |-- test_sample2_S2_L001_R2_001.fastq.gz
    |   |-- test_sample2_S2_L001_R3_001.fastq.gz
    |   |-- test_sample2_S2_L002_I1_001.fastq.gz
    |   |-- test_sample2_S2_L002_R1_001.fastq.gz
    |   |-- test_sample2_S2_L002_R2_001.fastq.gz
    |   |-- test_sample2_S2_L002_R3_001.fastq.gz
    |   |-- test_sample2_S2_L003_I1_001.fastq.gz
    |   |-- test_sample2_S2_L003_R1_001.fastq.gz
    |   |-- test_sample2_S2_L003_R2_001.fastq.gz
    |   `-- test_sample2_S2_L003_R3_001.fastq.gz
...

You will have one set of FASTQ files per sample, prefixed with the name of the sample as it appears in the simple CSV layout file or IEM samplesheet. Other situations described later on this page deal with the presence of four separate sets of files (four "samples" from bcl2fastq's point of view) per single biological sample/library.

Situation Line in libraries CSV

All samples (mkfastq)

fastqs,sample,library_type
/PATH/TO/MKFASTQ_ID/outs/fastq_path,,Chromatin Accessibility
...

All samples (mkfastq), multiple flow cells

fastqs,sample,library_type
/PATH/TO/MKFASTQ_FLOWCELL1/outs/fastq_path,,Chromatin Accessibility
/PATH/TO/MKFASTQ_FLOWCELL2/outs/fastq_path,,Chromatin Accessibility
...

All samples (bcl2fastq direct)

fastqs,sample,library_type
/PATH/TO/BCL2FASTQ_OUTPUT_DIR,,Chromatin Accessibility
...

Process test_sample1 (mkfastq)

fastqs,sample,library_type
/PATH/TO/MKFASTQ_ID/outs/fastq_path,test_sample1,Chromatin Accessibility
...

Process test_sample1 and test_sample2 as a single merged sample (mkfastq)

fastqs,sample,library_type
/PATH/TO/MKFASTQ_ID/outs/fastq_path,test_sample1,Chromatin Accessibility
/PATH/TO/MKFASTQ_ID/outs/fastq_path,test_sample2,Chromatin Accessibility
...

Scenario: My ATAC FASTQs are in an output folder from mkfastq or bcl2fastq, but there are multiple folders per sample index, like "SI-GA-A1_1" and "SI-GA-A1_2"

How did I get here?

It is likely that the input samplesheet used explicitly separated the four oligos in a 10x Genomics sample index set into four separate sample names. You may see a file hierarchy similar to this:

bcl2fastq_output
|-- HFLC5BBXX
    |-- SI-GA-A1_1
    |   |-- SI-GA-A1_1_S1_L001_I1_001.fastq.gz
    |   |-- SI-GA-A1_1_S1_L001_R1_001.fastq.gz
    |   |-- SI-GA-A1_1_S1_L001_R2_001.fastq.gz
    |   `-- SI-GA-A1_1_S1_L001_R3_001.fastq.gz
    |-- SI-GA-A1_2
    |   |-- SI-GA-A1_2_S2_L001_I1_001.fastq.gz
    |   |-- SI-GA-A1_2_S2_L001_R1_001.fastq.gz
    |   |-- SI-GA-A1_2_S2_L001_R2_001.fastq.gz
    |   `-- SI-GA-A1_2_S2_L001_R3_001.fastq.gz
    |-- SI-GA-A1_3
    |   |-- SI-GA-A1_3_S3_L001_I1_001.fastq.gz
    |   |-- SI-GA-A1_3_S3_L001_R1_001.fastq.gz
    |   |-- SI-GA-A1_3_S3_L001_R2_001.fastq.gz
    |   `-- SI-GA-A1_3_S3_L001_R3_001.fastq.gz
    |-- SI-GA-A1_4
    |   |-- SI-GA-A1_4_S4_L001_I1_001.fastq.gz
    |   |-- SI-GA-A1_4_S4_L001_R1_001.fastq.gz
    |   |-- SI-GA-A1_4_S4_L001_R2_001.fastq.gz
    |   `-- SI-GA-A1_4_S4_L001_R3_001.fastq.gz
|-- Reports
|-- Stats
|-- Undetermined_S0_L001_I1_001.fastq.gz
|-- Undetermined_S0_L001_R1_001.fastq.gz
|-- Undetermined_S0_L001_R2_001.fastq.gz
`-- Undetermined_S0_L001_R3_001.fastq.gz

You will probably want to merge all samples from the SI-GA-A1 index into a single analysis. If you run only one index at a time, you will see fewer reads than expected, which may translate to lower than expected coverage or cell count for the experiment.

Situation Line in libraries CSV

All samples (mkfastq)

fastqs,sample,library_type
/PATH/TO/MKFASTQ_ID/outs/fastq_path,,Chromatin Accessibility
...

Process all SI-GA-A1 reads in a single analysis

fastqs,sample,library_type
/PATH/TO/MKFASTQ_ID/outs/fastq_path,SI-GA-A1_1,Chromatin Accessibility
/PATH/TO/MKFASTQ_ID/outs/fastq_path,SI-GA-A1_2,Chromatin Accessibility
/PATH/TO/MKFASTQ_ID/outs/fastq_path,SI-GA-A1_3,Chromatin Accessibility
/PATH/TO/MKFASTQ_ID/outs/fastq_path,SI-GA-A1_4,Chromatin Accessibility
...

Only process first sample index

fastqs,sample,library_type
/PATH/TO/MKFASTQ_ID/outs/fastq_path,SI-GA-A1_1,Chromatin Accessibility
...
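
The four lines needed to process a whole sample index set differ only in their numeric suffix, so they can be generated. This is a small sketch, assuming the samplesheet named the four oligos with the suffixes _1 to _4 as in the tree above; the helper name is hypothetical.

```python
def index_set_rows(fastq_path, index_set, library_type="Chromatin Accessibility", n_oligos=4):
    """Build one libraries CSV row per oligo of a sample index set."""
    return [
        (fastq_path, f"{index_set}_{number}", library_type)
        for number in range(1, n_oligos + 1)
    ]
```

Calling `index_set_rows("/PATH/TO/MKFASTQ_ID/outs/fastq_path", "SI-GA-A1")` reproduces the four rows of the "Process all SI-GA-A1 reads in a single analysis" situation.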

Scenario: My ATAC FASTQs are in an output folder from mkfastq or bcl2fastq, in the same directory as the Reports and Stats folders

How did I get here?

An Illumina Experiment Manager-formatted samplesheet was used with either no entry or a blank entry for the Sample_Project column. Your hierarchy may look similar to this:

fastq_path
|-- Reports
|-- Stats
|-- test_sample_S1_L001_I1_001.fastq.gz
|-- test_sample_S1_L001_R1_001.fastq.gz
|-- test_sample_S1_L001_R2_001.fastq.gz
|-- test_sample_S1_L001_R3_001.fastq.gz
|-- test_sample_S1_L002_I1_001.fastq.gz
|-- test_sample_S1_L002_R1_001.fastq.gz
|-- test_sample_S1_L002_R2_001.fastq.gz
|-- test_sample_S1_L002_R3_001.fastq.gz
|-- test_sample_S1_L003_I1_001.fastq.gz
|-- test_sample_S1_L003_R1_001.fastq.gz
|-- test_sample_S1_L003_R2_001.fastq.gz
|-- test_sample_S1_L003_R3_001.fastq.gz
|-- Undetermined_S0_L001_I1_001.fastq.gz
...
`-- Undetermined_S0_L003_R3_001.fastq.gz

This is fine; you would use the same arguments as if the FASTQs were organized into subfolders within the output folder.

Situation Line in libraries CSV

All samples (mkfastq)

fastqs,sample,library_type
/PATH/TO/MKFASTQ_ID/outs/fastq_path,,Chromatin Accessibility
...

All samples (bcl2fastq direct)

fastqs,sample,library_type
/PATH/TO/BCL2FASTQ_OUTPUT_DIR,,Chromatin Accessibility
...

Process test_sample only (mkfastq)

fastqs,sample,library_type
/PATH/TO/MKFASTQ_ID/outs/fastq_path,test_sample,Chromatin Accessibility
...

Scenario: The ATAC FASTQs are named like "MySample_S1_L001_I1_001.fastq.gz". I don't see Reports or Stats anywhere

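How did I get here?

It is likely that the FASTQ files were transferred from a mkfastq or bcl2fastq run into another folder. They still retain the names assigned by bcl2fastq, which combine sample name, sample order, lane, read type, and chunk. Your file hierarchy may look like this: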

PROJECT_FOLDER
|-- MySample_S1_L001_I1_001.fastq.gz
|-- MySample_S1_L001_I2_001.fastq.gz
|-- MySample_S1_L001_R1_001.fastq.gz
|-- MySample_S1_L001_R2_001.fastq.gz
|-- MySample_S1_L002_I1_001.fastq.gz
|-- MySample_S1_L002_I2_001.fastq.gz
|-- MySample_S1_L002_R1_001.fastq.gz
|-- MySample_S1_L002_R2_001.fastq.gz

This is fine; since the files are named according to the bcl2fastq standard, you would use the same arguments as if the FASTQs were organized into a flow cell folder or mkfastq output folder.

Situation Line in libraries CSV

All samples

fastqs,sample,library_type
/PATH/TO/PROJECT_FOLDER,,Chromatin Accessibility
...

Process MySample only

fastqs,sample,library_type
/PATH/TO/PROJECT_FOLDER,MySample,Chromatin Accessibility
...

My ATAC FASTQs are not named like any of the above examples

How did I get here?

It is likely that you received files that were processed through a proprietary LIMS, which employs its own naming conventions.

It is highly likely that these files were initially processed with bcl2fastq. Once you have traced the origin of each file, rename the files in one of the following formats:

[Sample Name]_S1_L00[Lane Number]_[Read Type]_001.fastq.gz

Where Read Type is one of:

I1: Dual index i7 read (optional)
R1: Read 1
R2: Dual index i5 read
R3: Read 2

Alternatively, Cell Ranger ARC will also accept ATAC FASTQs in this format:

I1: Dual index i7 read (optional)
R1: Read 1
I2: Dual index i5 read
R2: Read 2

After you have renamed those files into that format, you'll use the following arguments:

Situation Line in libraries CSV

All samples

fastqs,sample,library_type
/PATH/TO/PROJECT_FOLDER,,Chromatin Accessibility
...

Process SAMPLENAME only

fastqs,sample,library_type
/PATH/TO/PROJECT_FOLDER,SAMPLENAME,Chromatin Accessibility
...
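
Since two ATAC layouts are accepted, it can be useful to confirm which one a set of files follows before or after renaming. The sketch below classifies a collection of read types against the two layouts listed above, treating I1 as optional. The layout labels and function name are illustrative, not Cell Ranger ARC terminology.

```python
# The two ATAC read-type layouts described on this page (I1 is optional in both).
ATAC_LAYOUTS = {
    "R1/R2/R3": {"R1": "Read 1", "R2": "Dual index i5 read", "R3": "Read 2"},
    "R1/I2/R2": {"R1": "Read 1", "I2": "Dual index i5 read", "R2": "Read 2"},
}


def classify_atac_layout(read_types):
    """Return the label of the matching ATAC layout, or None if neither fits."""
    present = set(read_types) - {"I1"}  # ignore the optional i7 index read
    for label, roles in ATAC_LAYOUTS.items():
        if present == set(roles):
            return label
    return None
```

A result of None suggests that a read is missing or that the files were named under some other scheme.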
