Cell Ranger3.1, printed on 11/21/2024
Goal: To identify genes that distinguish between the healthy controls and acute myeloid leukemia sample.
First, switch into Categories mode in the sidebar if you have not already done so. Click on the category selector at the top of the sidebar, and select 'AMLStatus'. You can see that the cells are divided into 'Normal' and 'Patient' groups. Note: the AMLStatus was derived from the spreadsheet passed to cellranger aggr, as described here.
Next, let's separate these groups spatially. Find the bottom button on the toolbar. This is the Split View button, for grouping the active t-SNE plot by cluster. Click on it, and select the Current Category (AMLStatus) option. You should see something like this:
To exit Split View, click on the button one more time, and select the Disable Split View option.
Next, let's find out whether there are any genes that uniquely characterize these two groups. This is possible using the options at the bottom of the Categories Mode panel. There are two options, Significant Feature Comparison and Feature Type. The AMLTutorial dataset only has gene features, so we are left to choose between comparison options. They are:
For the Normal vs. Patient category, these are equivalent. However, if you want to compare the two Normal samples against each other within the LibraryID category (excluding the Patient sample), then you will need to choose Locally Distinguishing. On the flipside, if you want to find out what's unique about a single cluster of cells, you'll need to select Globally Distinguishing.
Let's select Locally Distinguishing this time. Click on the calculator button in the bottom right to begin the calculation. Depending on the number of cores on your machine and the speed of your hard drive, this should take between 5 and 20 seconds.
The time to complete a Significant Features analysis depends on the number of cells in the dataset, number of clusters being compared, and the speed of your CPU and hard drive. Loupe Cell Browser 3.1 limits calculating significant features to datasets of 100,000 cells or less, though for larger datasets you are still able to investigate the genes that drive differences between Graph-Based and K-Means clusters. |
Once complete, a new data table and heatmap will appear in the data panel:
This heatmap displays the top upregulating genes per cluster. Each column represents the level of expression of a significant feature, and each row represents a cluster. Grid cells are colored by a gene's log2 fold change in its cluster row, compared to the other clusters. Hovering over the columns will show the names of the features represented in each column.
By clicking on the Feature Table icon to the left of the heatmap, you'll be able to see significant gene information in a tabular view. For each cluster, you can view the most significantly upregulated genes per cluster by p-value, or click the Options button above the Feature Table to filter by more options.
Clicking on a column in the heatmap will show the expression level for that gene across the entire dataset. Within the Feature Table, clicking on a feature name will allow you to add it to a feature list, and the Set As Active Feature option will show the expression level of that feature across the dataset.
The AML cells are more highly expressed in markers associated with the proliferation and differentiaion of red blood cells, such as HBA1 and HBG. The normal cells have higher levels of expression in a variety of T and B cell markers.
These findings are consistent with the results in the Nature Communications publication, "Zheng et al, Massively parallel digital transcriptional profiling of single cells", which used these samples, among others.
Next, we will use the feature expression view to help us quickly identify cell types.