Palatine tonsils are secondary lymphoid organs (SLOs) representing the first line of immunological defense against inhaled or ingested pathogens. We generated an atlas of the human tonsil composed of >556,000 cells profiled across five different data modalities, including single-cell transcriptome, epigenome, proteome, and immune repertoire sequencing, as well as spatial transcriptomics. This census identified 121 cell types and states, defined developmental trajectories, and enabled an understanding of the functional units of the tonsil. For example, we stratified myeloid slan-like subtypes, established a BCL6 enhancer as locally active in follicle-associated T and B cells, and identified SIX5 as a putative transcriptional regulator of plasma cell maturation. Analyses of a validation cohort confirmed the presence, annotation, and markers of tonsillar cell types and provided evidence of age-related compositional shifts. We demonstrate the value of this resource by annotating cells from B cell-derived mantle cell lymphomas, linking transcriptional heterogeneity to normal B cell differentiation states of the human tonsil.
This repository contains the scripts, notebooks, and reports needed to reproduce all analyses from our manuscript "An Atlas of Cells in the Human Tonsil", published in Immunity in 2024. Here, we describe how to access the data, document the most important packages and versions used, and explain how to navigate the directories and files in this repository.
The data has been deposited in five levels of organization, from raw to processed data:
- Level 1: raw data. All fastq files for all data modalities have been deposited at ArrayExpress under accession id E-MTAB-13687.
- Level 2: matrices. All data modalities correspond to different technologies from 10X Genomics. As such, they were mapped with different flavors of CellRanger (CR). The most important files in the "outs" folder of every CR run (including all matrices) have been deposited in Zenodo.
- Level 3: Seurat Objects. All data was analyzed within the Seurat ecosystem. We have archived in Zenodo all Seurat Objects that contain the raw and processed counts, dimensionality reductions (PCA, Harmony, UMAP), and metadata needed to reproduce all figures from this manuscript.
- Level 4: to allow for programmatic and modular access to the whole tonsil atlas dataset, we developed HCATonsilData, available on Bioconductor. HCATonsilData provides a vignette that documents how to navigate and understand the data. It also provides access to the glossary needed to trace back all annotations in the atlas. In addition, we will periodically update the annotations as we refine them with suggestions from the community.
- Level 5: interactive mode. Our tonsil atlas has been included as a reference in Azimuth, which allows interactive exploration of cell type markers on the web.
We refer to the READMEs in the Zenodo repositories for an explanation of how to access the matrices and Seurat objects. We have a separate repository (TonsilAtlasCAP) with scripts and documentation to download and remap all the fastq files from ArrayExpress.
These are the most important packages and versions used in the analyses:
- Seurat 3.2.0 and 4.1.0
- Signac 1.1.0
- harmony 1.0
- lisi 1.0
- scrublet 0.2.1
- UCell 1.99.7
- clusterProfiler 4.3.4
- pySCENIC 0.10.3
- CellPhoneDB 3
- chromVAR 1.1.0
- JASPAR2020 0.99.10
- chromVARmotifs 0.2.0
- Vireo 0.5.0
- Scirpy 0.7.0
- SPOTlight 0.1.7
- Rmagic 2.0.3
- SPATA2 0.1.0
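As a minimal sketch, the Python-based tools in this list can be compared against a local environment with the standard library. The distribution names below are assumptions about how the packages are named on PyPI, and the helper `compare_versions` is hypothetical (it is not part of this repository). The R packages are not covered here.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the list above (Python-based tools only).
# Assumption: these names match the installed distribution names.
PINNED = {
    "scrublet": "0.2.1",
    "pyscenic": "0.10.3",
    "scirpy": "0.7.0",
}


def compare_versions(pinned, lookup=version):
    """Map each package name to 'ok', 'missing', or 'differs (installed X)'."""
    report = {}
    for name, wanted in pinned.items():
        try:
            found = lookup(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        if found == wanted:
            report[name] = "ok"
        else:
            report[name] = f"differs (installed {found})"
    return report


# Print one status line per pinned package.
for name, status in compare_versions(PINNED).items():
    print(f"{name} {PINNED[name]}: {status}")
```

A "differs" status does not necessarily mean the analysis will fail, only that results may not match the reports exactly.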
You can check the versions of other packages in the "Session Information" section of each HTML report. To view one of the HTML reports online, copy and paste the URL of the report directly into the HTML GitHub viewer.
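The viewer link can also be built programmatically. The sketch below assumes that the "HTML GitHub viewer" is the public htmlpreview.github.io service, which takes the report URL after a `?`. The function `preview_url`, the branch name, and the report path are hypothetical and used only for illustration.

```python
from urllib.parse import urlparse

# Assumption: the viewer is htmlpreview.github.io, which renders the page
# whose URL is appended after the "?".
VIEWER_PREFIX = "https://htmlpreview.github.io/?"


def preview_url(report_url):
    """Return a viewer link for an HTML report hosted on GitHub."""
    parsed = urlparse(report_url)
    if parsed.scheme != "https" or parsed.netloc != "github.com":
        raise ValueError("expected an https://github.com/... URL")
    if not parsed.path.endswith(".html"):
        raise ValueError("expected a link to an .html report")
    return VIEWER_PREFIX + report_url


# Hypothetical branch and report path, for illustration only.
example = (
    "https://github.com/Single-Cell-Genomics-Group-CNAG-CRG/TonsilAtlas/"
    "blob/main/path/to/report.html"
)
print(preview_url(example))
```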
Although each technology requires specific analyses, they all share a similar pre-processing pipeline. We have strived to harmonize these pipelines under similar naming schemes so that users can easily navigate this repo. Likewise, we have tried to code in a shared style. These are the most important steps:
- 1-cellranger_mapping: scripts used to run cellranger on our cluster. It also contains QC metrics for the different sequencing runs.
- 2-QC: quality control for the sequencing and mapping of raw data, filtering of poor-quality cells and genes, normalization, doublet detection, and batch effect correction.
- 3-clustering: we followed a top-down clustering approach (see the methods of our manuscript). Thus, the clustering is organized by levels, in which we move from general cellular compartments to granular cell types and states in a hierarchical and recursive fashion. In these notebooks we have also included the annotation, which was established in collaboration with the annotation team.
- 4-revision: we include one folder for each analysis performed to address the major points raised by reviewers during the revision of the manuscript.
Other more focused analyses include:
In addition, the "figures_and_scripts" folder contains the scripts used to generate most of the figures in the manuscript. Finally, the "bin" folder contains functions and utilities used throughout many scripts.
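Because the step folders share a numeric prefix, they can be listed in pipeline order once the repository has been cloned. The following is a minimal sketch: `list_steps` is a hypothetical helper, and the modality folder name in the usage lines is an assumption about the local checkout.

```python
import re
from pathlib import Path

# Step folders are named "<number>-<name>", e.g. "2-QC".
STEP_PATTERN = re.compile(r"^(\d+)-(.+)$")


def list_steps(modality_dir):
    """Return (number, name) pairs for step folders, sorted by step number."""
    steps = []
    for entry in Path(modality_dir).iterdir():
        match = STEP_PATTERN.match(entry.name)
        # Skip plain files and folders without a numeric prefix (e.g. "bin").
        if entry.is_dir() and match:
            steps.append((int(match.group(1)), match.group(2)))
    return sorted(steps)


# Hypothetical location of one modality inside a local clone.
modality = Path("TonsilAtlas") / "scRNA-seq"
if modality.is_dir():
    for number, name in list_steps(modality):
        print(f"{number}: {name}")
```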
You can download a copy of all the files in this repository by cloning the git repository:
git clone https://github.com/Single-Cell-Genomics-Group-CNAG-CRG/TonsilAtlas.git