Immune Repertoire Sequencing Technologies
The ability to sequence immune receptors has revolutionized our understanding of adaptive immunity. This guide covers the major technologies used to characterize T cell receptor (TCR) and B cell receptor (BCR) repertoires, their strengths, limitations, and applications.
Overview
Immune repertoire sequencing (also called Rep-seq, AIRR-seq, or immunosequencing) aims to comprehensively profile the adaptive immune receptors in a sample. The field has evolved rapidly from early PCR-based methods to sophisticated single-cell approaches.
Key Challenges
- Diversity: Theoretical diversity of 10^15-10^18 unique receptors
- Chain pairing: Native α-β (TCR) or heavy-light (BCR) pairs are lost in bulk methods
- Dynamic range: Clones span frequencies from less than 0.001% to greater than 10%
- PCR bias: Multiplex primers can amplify some sequences preferentially
- Sequencing errors: Distinguishing real variants from artifacts
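The dynamic-range challenge can be made concrete with a little arithmetic. The sketch below assumes a simple binomial sampling model (each sampled cell independently belongs to the clone with probability equal to the clone frequency) and ignores library-preparation losses. The function names are illustrative and do not come from any particular tool.

```python
import math

def detection_probability(clone_frequency: float, cells_sampled: int) -> float:
    """Probability that at least one cell of the clone is in the sample
    (binomial model, an assumption of this sketch)."""
    return 1.0 - (1.0 - clone_frequency) ** cells_sampled

def cells_needed(clone_frequency: float, target_probability: float = 0.95) -> int:
    """Smallest number of sampled cells that reaches the target detection probability."""
    return math.ceil(
        math.log(1.0 - target_probability) / math.log(1.0 - clone_frequency)
    )

# A clone at 0.001% frequency (1 in 100,000 cells)
rare_clone = 1e-5
for n_cells in (10_000, 100_000, 1_000_000):
    print(n_cells, round(detection_probability(rare_clone, n_cells), 3))
print("cells for 95% detection:", cells_needed(rare_clone))
```

Under this model, sampling as many cells as the inverse of the clone frequency gives only about a 63% chance of seeing the clone at all, so input amount limits sensitivity as much as sequencing depth does.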
Bulk Sequencing Methods
Multiplex PCR-Based Methods
The most widely used approach for routine repertoire analysis.
Principle:
- Extract RNA or DNA from sample
- Reverse transcribe (if RNA)
- Amplify V(D)J regions using multiplex primers
- Add sequencing adapters
- Sequence on Illumina platform
- Bioinformatic analysis
Primer Strategies:
| Strategy | Advantages | Disadvantages |
|---|---|---|
| V + C primers | Captures full V region | Requires many V primers, bias |
| V + J primers | Shorter amplicon | More primers needed, 5′ truncation |
| 5′ RACE + C | Unbiased V capture | Lower efficiency, more complex |
Commercial Platforms:
- Adaptive Biotechnologies (immunoSEQ) - DNA-based, proprietary primers
- iRepertoire - Multiplex PCR, arm-PCR technology
- BGI Immune Profiling - Multiplex RNA-based
Typical Specifications:
- Input: 100 ng - 1 μg DNA/RNA, or 10^5-10^7 cells
- Depth: 10^5-10^7 reads
- Cost: $100-500 per sample
- Turnaround: 1-4 weeks
5′ RACE-Based Methods
Principle: Uses template-switching to add a universal 5′ adapter, avoiding V gene primer bias.
Advantages:
- Unbiased V gene capture
- Full-length V region sequence
- No multiplex primer design needed
Disadvantages:
- Lower efficiency than multiplex PCR
- Requires RNA (not DNA)
- More technically demanding
UMI-Based Error Correction
Unique Molecular Identifiers (UMIs) are random nucleotide sequences added to each original molecule before amplification.
Benefits:
- Distinguish PCR duplicates from reads that derive from independent original molecules
- Enable accurate quantification
- Correct sequencing errors through consensus building
Implementation:
- Original molecule → Add UMI → PCR amplification → Sequencing
- Multiple copies share the same UMI
- Consensus of those copies = true sequence
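A minimal sketch of the consensus step, assuming reads have already been parsed into `(umi, sequence)` pairs and that reads sharing a UMI are aligned at the same start position. The function name `umi_consensus` and the `min_reads` threshold are illustrative. Production tools also handle errors within the UMI itself and use base quality scores, which this sketch does not.

```python
from collections import Counter, defaultdict

def umi_consensus(reads, min_reads=2):
    """Group reads by UMI and take a per-position majority vote.

    reads: iterable of (umi, sequence) tuples.
    Returns {umi: consensus_sequence} for UMIs with at least min_reads reads.
    """
    groups = defaultdict(list)
    for umi, seq in reads:
        groups[umi].append(seq)

    consensus = {}
    for umi, seqs in groups.items():
        if len(seqs) < min_reads:
            continue  # too few copies to outvote an error
        length = min(len(s) for s in seqs)  # compare only the shared prefix
        consensus[umi] = "".join(
            Counter(s[i] for s in seqs).most_common(1)[0][0]
            for i in range(length)
        )
    return consensus

# Toy data: three copies of one molecule (one with an error) and one singleton
toy_reads = [
    ("AAAA", "ACGT"),
    ("AAAA", "ACGT"),
    ("AAAA", "ACTT"),  # sequencing or PCR error at position 3
    ("CCCC", "GGGG"),  # only one read, dropped when min_reads=2
]
molecules = umi_consensus(toy_reads)
print(molecules)
print("molecule count:", len(molecules))
```

The number of UMIs that survive is the molecule count used for quantification. The read count per UMI reflects amplification only.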
Single-Cell Methods
Droplet-Based (10x Genomics Chromium)
The current gold standard for paired receptor sequencing at scale.
Principle:
- Encapsulate single cells together with barcoded gel beads in oil droplets (GEMs, Gel Beads-in-emulsion)
- Cell lysis and mRNA capture on barcoded beads
- Reverse transcription with cell barcode
- Pool, amplify, sequence
- Demultiplex by cell barcode
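The final demultiplexing step is what recovers chain pairing: contigs that share a cell barcode came from the same droplet. The sketch below illustrates the idea for TCRs. The dictionary keys (`barcode`, `chain`, `cdr3`, `umis`) are hypothetical field names chosen for this example, and keeping only the highest-UMI chain per locus is a simplifying assumption, since some T cells genuinely express two α chains.

```python
from collections import defaultdict

def pair_chains(contigs):
    """Return {barcode: (alpha_cdr3, beta_cdr3)} for cells with both chains.

    contigs: iterable of dicts with hypothetical keys
             'barcode', 'chain' ('TRA' or 'TRB'), 'cdr3', 'umis'.
    """
    by_cell = defaultdict(lambda: defaultdict(list))
    for contig in contigs:
        by_cell[contig["barcode"]][contig["chain"]].append(contig)

    pairs = {}
    for barcode, chains in by_cell.items():
        if "TRA" in chains and "TRB" in chains:
            # Simplification: keep the best-supported contig per locus
            best_alpha = max(chains["TRA"], key=lambda c: c["umis"])
            best_beta = max(chains["TRB"], key=lambda c: c["umis"])
            pairs[barcode] = (best_alpha["cdr3"], best_beta["cdr3"])
    return pairs
```

Cells with only one detected chain are dropped here. Whether to keep them depends on the downstream question.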
Specifications:
- Throughput: 1,000-10,000+ cells per sample
- Pairing accuracy: greater than 97%
- Cost: $800-1,500 per sample
- Requires fresh/viably frozen cells
Products:
- 10x Genomics V(D)J + Gene Expression
- 10x Genomics V(D)J + Feature Barcoding (CITE-seq compatible)
Plate-Based Methods
Principle: Sort single cells into wells, perform RT-PCR with well-specific barcodes.
Examples:
- Smart-seq-based protocols
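- MAD-HYPE (Multicell Analytical Deconvolution for High Yield Paired-chain Evaluation), which sorts several cells per well and infers pairs statistically rather than from true single cells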
Advantages:
- Full-length sequences
- Can combine with index sorting (pre-sort phenotyping)
- Lower per-cell cost at small scale
Disadvantages:
- Lower throughput (384-1,536 cells typical)
- More hands-on time
- Requires cell sorter
Emulsion-Based (Droplet PCR)
Principle: Perform RT-PCR in emulsion droplets to link chains physically before bulk amplification.
Approaches:
- Paired-chain amplification in droplets
- Gel bead barcoding
Microwell-Based (BD Rhapsody)
Principle: Capture single cells in microwells with barcoded beads.
Features:
- ~10,000 cells per cartridge
- Combined with transcriptome analysis
- Lower capture efficiency than droplet methods
Computational Approaches to Chain Pairing
When single-cell methods are too expensive or cell numbers are limiting, computational approaches can infer pairing from bulk data.
Statistical Co-occurrence
Principle: If α and β chains are from the same clone, they should co-occur across samples/wells.
Method:
- Distribute cells across multiple samples (e.g., replicates, wells)
- Perform bulk sequencing on each
- Look for chains that consistently appear together
- Statistical tests identify significant pairings
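One way to implement the statistical test in the last step is a one-sided hypergeometric (Fisher-type) test on well occupancy. This is a choice made for this sketch and not necessarily what any published pairing tool uses. The function name and arguments are illustrative.

```python
from math import comb

def cooccurrence_pvalue(wells_a: set, wells_b: set, n_wells: int) -> float:
    """P(overlap >= observed) if chain B's wells were placed at random.

    wells_a, wells_b: sets of well indices where each chain was detected.
    n_wells: total number of wells sequenced.
    """
    a, b = len(wells_a), len(wells_b)
    observed = len(wells_a & wells_b)
    total = comb(n_wells, b)  # ways to place chain B in b of the wells
    tail = sum(
        comb(a, i) * comb(n_wells - a, b - i)
        for i in range(observed, min(a, b) + 1)
    )
    return tail / total

# Two chains seen in exactly the same 5 of 96 wells: strong evidence of pairing
same_wells = {3, 17, 42, 60, 88}
print(cooccurrence_pvalue(same_wells, same_wells, 96))

# Two chains that never share a well: no evidence
print(cooccurrence_pvalue({1, 2, 3}, {10, 11, 12}, 96))
```

Because every candidate α-β combination is tested, the resulting p-values need multiple-testing correction before pairs are called.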
Limiting Dilution + Sequencing
Principle: Dilute cells to near-single-cell concentrations, sequence in bulk, infer pairs from co-occurrence.
Advantages:
- Much lower cost than single-cell
- Works with limited cell numbers
- Scalable
Considerations:
- Requires careful experimental design
- Statistical confidence depends on dilution scheme
- May miss very rare clones
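The dependence on the dilution scheme can be explored with a Poisson loading model, a common assumption for cells distributed at random into wells. The sketch below computes how many wells are empty, hold one cell, or hold several cells for a given mean number of cells per well. The function names are illustrative.

```python
import math

def poisson_pmf(k: int, mean_cells: float) -> float:
    """Probability that a well receives exactly k cells (Poisson model)."""
    return math.exp(-mean_cells) * mean_cells ** k / math.factorial(k)

def singlet_fraction(mean_cells: float) -> float:
    """Among occupied wells, the fraction that holds exactly one cell."""
    occupied = 1.0 - poisson_pmf(0, mean_cells)
    return poisson_pmf(1, mean_cells) / occupied

for mean_cells in (0.1, 0.5, 1.0, 3.0):
    print(
        mean_cells,
        "empty:", round(poisson_pmf(0, mean_cells), 3),
        "singlets among occupied:", round(singlet_fraction(mean_cells), 3),
    )
```

Lower loading gives cleaner wells but wastes more of them, while higher loading relies more heavily on co-occurrence statistics to separate the chains.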
Sequencing Platforms
Illumina (Short-Read)
- Most common platform for immune repertoire studies
- MiSeq: Lower throughput, longer reads (2×300 bp)
- NovaSeq: High throughput, shorter reads (2×150 bp)
- Well-established bioinformatics pipelines
PacBio/ONT (Long-Read)
Advantages:
- Full-length V(D)J in single reads
- Phase haplotypes
- Detect structural variants
Challenges:
- Historically higher per-read error rate than Illumina (improving; PacBio HiFi consensus reads are now highly accurate)
- Lower throughput
- More expensive per read
Applications:
- Full-length BCR sequencing
- Resolving complex rearrangements
- Phasing somatic mutations
Bioinformatics Overview
Core Analysis Steps
- Quality control: Trim adapters, filter low-quality reads
- UMI processing: Group by UMI, build consensus
- V(D)J annotation: Align to germline references (IMGT, OGRDB)
- CDR3 extraction: Identify junction sequence
- Clonotype calling: Group sequences into clones
- Repertoire metrics: Diversity, clonality, V gene usage
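The last two steps can be sketched in a few lines. The example assumes annotated records are dictionaries with `v_call`, `j_call`, and `junction_aa` keys, and defines a clonotype as identical V gene, J gene, and CDR3 amino acid sequence. That is one common definition among several (BCR analyses usually cluster similar junctions instead). Clonality is computed here as 1 minus normalized Shannon entropy.

```python
import math
from collections import Counter

def call_clonotypes(records):
    """Count records per (V, CDR3 aa, J) clonotype; an exact-match definition."""
    return Counter((r["v_call"], r["junction_aa"], r["j_call"]) for r in records)

def shannon_entropy(counts):
    """Shannon entropy (natural log) of a list of clone counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def clonality(counts):
    """1 - normalized entropy: 0 = perfectly even, 1 = a single clone."""
    n_clones = sum(1 for c in counts if c > 0)
    if n_clones <= 1:
        return 1.0
    return 1.0 - shannon_entropy(counts) / math.log(n_clones)

# Toy repertoire with made-up sequences
records = [
    {"v_call": "TRBV5-1", "j_call": "TRBJ2-7", "junction_aa": "CASSLGQGYEQYF"},
    {"v_call": "TRBV5-1", "j_call": "TRBJ2-7", "junction_aa": "CASSLGQGYEQYF"},
    {"v_call": "TRBV7-9", "j_call": "TRBJ1-1", "junction_aa": "CASSPTGNTEAFF"},
]
clones = call_clonotypes(records)
print(clones.most_common())
print("clonality:", round(clonality(list(clones.values())), 3))
```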
Key Tools
| Tool | Function | Link |
|---|---|---|
| MiXCR | Full pipeline, V(D)J alignment | mixcr.com |
| IMGT/HighV-QUEST | Reference V(D)J annotation | imgt.org |
| IgBLAST | NCBI’s V(D)J aligner | ncbi.nlm.nih.gov/igblast |
| Immcantation | BCR analysis suite | immcantation.readthedocs.io |
| scRepertoire | Single-cell repertoire analysis | github.com/ncborcherding/scRepertoire |
| immunarch | R package for analysis | immunarch.com |
| VDJtools | Repertoire manipulation | github.com/mikessh/vdjtools |
Data Standards
AIRR Community Standards define:
- Required fields for repertoire data
- File formats (AIRR TSV)
- Minimal reporting requirements
- Germline database standards
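Because AIRR TSV is a plain tab-separated file, it can be read with a standard library CSV parser. The sketch below uses a few field names from the AIRR Rearrangement schema (`sequence_id`, `productive`, `v_call`, `j_call`, `junction_aa`) and the schema's `T`/`F` encoding for booleans. The sample rows are made up, and the helper names are illustrative. The AIRR Community also maintains dedicated reader libraries, which are preferable for validation.

```python
import csv
import io

def read_productive(handle):
    """Yield rows (as dicts) whose 'productive' field is true ('T')."""
    for row in csv.DictReader(handle, delimiter="\t"):
        if row.get("productive") == "T":
            yield row

def gene_name(call: str) -> str:
    """Reduce an allele call such as 'TRBV5-1*01,TRBV5-1*02' to its first gene."""
    return call.split(",")[0].split("*")[0]

# Made-up example content standing in for a real file
SAMPLE_TSV = "\n".join([
    "\t".join(["sequence_id", "productive", "v_call", "j_call", "junction_aa"]),
    "\t".join(["seq1", "T", "TRBV5-1*01", "TRBJ2-7*01", "CASSLGQGYEQYF"]),
    "\t".join(["seq2", "F", "TRBV7-9*01", "TRBJ1-1*01", ""]),
    "\t".join(["seq3", "T", "TRBV5-1*01,TRBV5-1*02", "TRBJ2-7*01", "CASSLGQGYEQYF"]),
])

for row in read_productive(io.StringIO(SAMPLE_TSV)):
    print(row["sequence_id"], gene_name(row["v_call"]), row["junction_aa"])
```

For a file on disk, pass `open(path, newline="")` in place of the `io.StringIO` object.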
Choosing the Right Method
| Application | Recommended Method | Rationale |
|---|---|---|
| MRD monitoring | Bulk deep sequencing | Need high sensitivity, known clone |
| Vaccine response | Bulk + UMIs | Population-level changes |
| Autoimmune disease | Single-cell | Need paired chains for specificity |
| TCR-T development | Single-cell | Must have correct pairing |
| Transplant monitoring | Bulk or paired | Depends on clinical question |
| Large cohort studies | Bulk | Cost-effective at scale |
| Rare cell analysis | Single-cell | Preserve chain pairing |
Emerging Technologies
Spatial Transcriptomics + Repertoire
- Map clonotypes to tissue locations
- Understand immune architecture
- Technologies: 10x Visium, NanoString CosMx
Long-Read Single-Cell
- PacBio Kinnex + single-cell barcoding
- Full-length isoform resolution
Direct RNA Sequencing
- Oxford Nanopore
- No reverse transcription bias
- Detect RNA modifications
Further Reading
- Chain Pairing Problem - Why pairing matters
- V(D)J Recombination - Receptor generation
- TCR Structure - T cell receptor architecture
- BCR Structure - B cell receptor architecture
- Resources - Databases and tools