Immune Repertoire Sequencing Technologies

The ability to sequence immune receptors has revolutionized our understanding of adaptive immunity. This guide covers the major technologies used to characterize T cell receptor (TCR) and B cell receptor (BCR) repertoires, their strengths, limitations, and applications.

Overview

Immune repertoire sequencing (also called Rep-seq, AIRR-seq, or immunosequencing) aims to comprehensively profile the adaptive immune receptors in a sample. The field has evolved rapidly from early PCR-based methods to sophisticated single-cell approaches.

Key Challenges

  1. Diversity: Theoretical diversity of 10^15-10^18 unique receptors
  2. Chain pairing: Native α-β (TCR) or heavy-light (BCR) pairs are lost in bulk methods
  3. Dynamic range: Clones span frequencies from less than 0.001% to greater than 10%
  4. PCR bias: Multiplex primers can amplify some sequences preferentially
  5. Sequencing errors: Distinguishing real variants from artifacts
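
The dynamic-range challenge can be made concrete with a little arithmetic. The sketch below assumes cells are sampled independently and that every sampled cell yields a usable receptor sequence, which real experiments only approximate. The function names are illustrative and do not come from any library.

```python
import math


def detection_probability(clone_frequency: float, cells_sampled: int) -> float:
    """Chance that at least one cell of a clone lands in the sample.

    Assumes independent sampling: P(miss) = (1 - f)^n.
    """
    if not 0.0 < clone_frequency < 1.0:
        raise ValueError("clone_frequency must be strictly between 0 and 1")
    # log1p keeps the calculation stable for very small frequencies.
    return 1.0 - math.exp(cells_sampled * math.log1p(-clone_frequency))


def cells_needed(clone_frequency: float, confidence: float = 0.95) -> int:
    """Smallest sample size that detects the clone with the given confidence."""
    if not 0.0 < clone_frequency < 1.0:
        raise ValueError("clone_frequency must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log1p(-clone_frequency))


# A clone at 0.001% frequency (f = 1e-5) in a sample of 100,000 cells:
p_detect = detection_probability(1e-5, 100_000)
n_required = cells_needed(1e-5, confidence=0.95)
```

Under these assumptions, a sample of 100,000 cells catches a 0.001% clone only about 63% of the time, and roughly 300,000 cells are needed for 95% confidence.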

Bulk Sequencing Methods

Multiplex PCR-Based Methods

Multiplex PCR is the most widely used approach for routine repertoire analysis.

Principle:

  1. Extract RNA or DNA from sample
  2. Reverse transcribe (if RNA)
  3. Amplify V(D)J regions using multiplex primers
  4. Add sequencing adapters
  5. Sequence on Illumina platform
  6. Bioinformatic analysis

Primer Strategies:

Strategy      | Advantages             | Disadvantages
V + C primers | Captures full V region | Requires many V primers, bias
V + J primers | Shorter amplicon       | More primers needed, 5’ truncation
5’ RACE + C   | Unbiased V capture     | Lower efficiency, more complex

Commercial Platforms:

  • Adaptive Biotechnologies (immunoSEQ) - DNA-based, proprietary primers
  • iRepertoire - Multiplex PCR, arm-PCR technology
  • BGI Immune Profiling - Multiplex RNA-based

Typical Specifications:

  • Input: 100 ng - 1 μg DNA/RNA, or 10^5-10^7 cells
  • Depth: 10^5-10^7 reads
  • Cost: $100-500 per sample
  • Turnaround: 1-4 weeks

5’ RACE-Based Methods

Principle: Uses template-switching to add a universal 5’ adapter, avoiding V gene primer bias.

Advantages:

  • Unbiased V gene capture
  • Full-length V region sequence
  • No multiplex primer design needed

Disadvantages:

  • Lower efficiency than multiplex PCR
  • Requires RNA (not DNA)
  • More technically demanding

UMI-Based Error Correction

Unique Molecular Identifiers (UMIs) are random nucleotide sequences added to each original molecule before amplification.

Benefits:

  • Distinguish PCR duplicates from independently captured molecules
  • Enable accurate quantification
  • Correct sequencing errors through consensus building

Implementation:

Original molecule → Add UMI → PCR amplification → Sequencing
                    ↓
                    Multiple copies share same UMI
                    ↓
                    Consensus sequence = true sequence
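
The consensus step can be sketched in a few lines of Python. This is a minimal illustration, not the algorithm of any particular pipeline. It assumes reads arrive as hypothetical (UMI, sequence) tuples, that UMIs themselves are error-free, and that reads sharing a UMI are already aligned, so a per-position majority vote is meaningful.

```python
from collections import Counter, defaultdict


def umi_consensus(reads):
    """Collapse (umi, sequence) pairs into one consensus sequence per UMI."""
    groups = defaultdict(list)
    for umi, seq in reads:
        groups[umi].append(seq)

    consensus = {}
    for umi, seqs in groups.items():
        # Keep only reads of the most common length so columns line up.
        length = Counter(len(s) for s in seqs).most_common(1)[0][0]
        seqs = [s for s in seqs if len(s) == length]
        # Majority vote at each position across the reads of this UMI.
        consensus[umi] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


# Toy input: three reads from one molecule (one carries an error), one from another.
toy_reads = [
    ("AAAT", "TGTGCC"),
    ("AAAT", "TGTGCC"),
    ("AAAT", "TGTACC"),
    ("GGCA", "TGTGCA"),
]
molecules = umi_consensus(toy_reads)
molecule_count = len(molecules)  # number of distinct original molecules
```

Counting UMIs rather than reads gives the molecule-level quantification described above. Production tools additionally correct errors within the UMI itself and discard UMIs supported by too few reads.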

Single-Cell Methods

Droplet-Based (10x Genomics Chromium)

The current gold standard for paired receptor sequencing at scale.

Principle:

  1. Co-encapsulate single cells with barcoded gel beads in oil droplets (GEMs, Gel Beads-in-emulsion)
  2. Lyse cells inside each droplet; the gel bead supplies the barcoded oligos
  3. Reverse transcription with cell barcode
  4. Pool, amplify, sequence
  5. Demultiplex by cell barcode
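
After demultiplexing, pairing reduces to grouping assembled contigs by cell barcode. The sketch below shows the idea for TCR α/β chains. The record fields (barcode, chain, cdr3, umis) are assumptions chosen for illustration, and keeping the chain with the highest UMI count is only one simple rule for cells that yield more than one contig per chain.

```python
from collections import defaultdict


def pair_chains(contigs):
    """Return {barcode: (alpha_cdr3, beta_cdr3)} for cells with both chains.

    Each contig is a dict with the illustrative keys
    'barcode', 'chain' ('TRA' or 'TRB'), 'cdr3', and 'umis'.
    """
    best = defaultdict(dict)
    for contig in contigs:
        current = best[contig["barcode"]].get(contig["chain"])
        # Keep the best-supported contig per chain within each cell.
        if current is None or contig["umis"] > current["umis"]:
            best[contig["barcode"]][contig["chain"]] = contig

    pairs = {}
    for barcode, chains in best.items():
        if "TRA" in chains and "TRB" in chains:
            pairs[barcode] = (chains["TRA"]["cdr3"], chains["TRB"]["cdr3"])
    return pairs


# Toy contigs: cell1 has two alpha candidates, cell2 lacks a beta chain.
toy_contigs = [
    {"barcode": "cell1", "chain": "TRA", "cdr3": "CAVSDF", "umis": 2},
    {"barcode": "cell1", "chain": "TRA", "cdr3": "CAVRGF", "umis": 9},
    {"barcode": "cell1", "chain": "TRB", "cdr3": "CASSLF", "umis": 14},
    {"barcode": "cell2", "chain": "TRA", "cdr3": "CALSEF", "umis": 5},
]
paired = pair_chains(toy_contigs)
```

Cells with only one recovered chain are dropped here. An analysis may prefer to keep them and flag the pairing as incomplete, and T cells that express two productive α chains would need a less strict rule.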

Specifications:

  • Throughput: 1,000-10,000+ cells per sample
  • Pairing accuracy: greater than 97%
  • Cost: $800-1,500 per sample
  • Requires fresh/viably frozen cells

Products:

  • 10x Genomics V(D)J + Gene Expression
  • 10x Genomics V(D)J + Feature Barcoding (CITE-seq compatible)

Plate-Based Methods

Principle: Sort single cells into wells, perform RT-PCR with well-specific barcodes.

Examples:

  • Smart-seq-based protocols
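  • MAD-HYPE (Multicell Analytical Deconvolution for High Yield Paired-chain Evaluation), which infers pairs statistically from wells holding multiple cells rather than from strictly single-cell wells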

Advantages:

  • Full-length sequences
  • Can combine with index sorting (pre-sort phenotyping)
  • Lower per-cell cost at small scale

Disadvantages:

  • Lower throughput (384-1,536 cells typical)
  • More hands-on time
  • Requires cell sorter

Emulsion-Based (Droplet PCR)

Principle: Perform RT-PCR in emulsion droplets to link chains physically before bulk amplification.

Approaches:

  • Paired-chain amplification in droplets
  • Gel bead barcoding

Microwell-Based (BD Rhapsody)

Principle: Capture single cells in microwells with barcoded beads.

Features:

  • ~10,000 cells per cartridge
  • Combined with transcriptome analysis
  • Lower capture efficiency than droplet methods

Computational Approaches to Chain Pairing

When single-cell methods are too expensive or cell numbers are limiting, computational approaches can infer pairing from bulk data.

Statistical Co-occurrence

Principle: If α and β chains are from the same clone, they should co-occur across samples/wells.

Method:

  1. Distribute cells across multiple samples (e.g., replicates, wells)
  2. Perform bulk sequencing on each
  3. Look for chains that consistently appear together
  4. Statistical tests identify significant pairings
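
One simple choice for the statistical test in step 4 is a one-sided hypergeometric (Fisher-type) test on well occupancy. The sketch below is an illustration under that assumption, not the method of any specific published tool. The function name and arguments are hypothetical.

```python
from math import comb


def cooccurrence_pvalue(total_wells, wells_a, wells_b, wells_both):
    """P(overlap >= wells_both) if chains A and B were distributed independently.

    total_wells: number of wells sequenced
    wells_a:     wells in which chain A was observed
    wells_b:     wells in which chain B was observed
    wells_both:  wells in which both were observed
    """
    upper = min(wells_a, wells_b)
    # Hypergeometric tail: choose wells_b wells, count how many overlap with A's wells.
    tail = sum(
        comb(wells_a, k) * comb(total_wells - wells_a, wells_b - k)
        for k in range(wells_both, upper + 1)
    )
    return tail / comb(total_wells, wells_b)


# An alpha and a beta chain each seen in 10 of 96 wells, sharing 9 of them:
p_value = cooccurrence_pvalue(96, 10, 10, 9)
```

A very small p-value marks the two chains as a candidate pair. Because every α chain is compared against every β chain, the p-values need multiple-testing correction before pairs are called.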

Limiting Dilution + Sequencing

Principle: Dilute cells to near-single-cell concentrations, sequence in bulk, infer pairs from co-occurrence.

Advantages:

  • Much lower cost than single-cell
  • Works with limited cell numbers
  • Scalable

Considerations:

  • Requires careful experimental design
  • Statistical confidence depends on dilution scheme
  • May miss very rare clones
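
How confidence depends on the dilution scheme can be explored with a Poisson approximation. The sketch below assumes each well receives an independent random draw of cells, so the number of cells from a clone at frequency f in a well of n cells is roughly Poisson with mean n·f. All names are illustrative.

```python
import math


def well_occupancy(clone_frequency, cells_per_well):
    """Probability that a well contains at least one cell of the clone."""
    return 1.0 - math.exp(-clone_frequency * cells_per_well)


def expected_positive_wells(clone_frequency, cells_per_well, n_wells):
    """Expected number of wells in which the clone is present."""
    return n_wells * well_occupancy(clone_frequency, cells_per_well)


def expected_chance_overlap(freq_a, freq_b, cells_per_well, n_wells):
    """Expected wells shared by two UNRELATED clones purely by chance."""
    return (
        n_wells
        * well_occupancy(freq_a, cells_per_well)
        * well_occupancy(freq_b, cells_per_well)
    )


# 96 wells of 1,000 cells each, clones at 0.1% frequency:
positive = expected_positive_wells(0.001, 1_000, 96)
by_chance = expected_chance_overlap(0.001, 0.001, 1_000, 96)
```

Under these assumptions, co-occurrence is most informative when occupancy is well away from both 0 and 1. A clone present in nearly every well, or in almost none, gives the test little to work with, which is why very abundant and very rare clones are the hardest to pair.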

Sequencing Platforms

Illumina (Short-Read)

  • Most common platform for immune repertoire studies
  • MiSeq: Lower throughput, longer reads (2×300 bp)
  • NovaSeq: High throughput, shorter reads (2×150 bp)
  • Well-established bioinformatics pipelines

PacBio/ONT (Long-Read)

Advantages:

  • Full-length V(D)J in single reads
  • Phase haplotypes
  • Detect structural variants

Challenges:

  • Higher raw error rate than Illumina (improving; PacBio HiFi consensus reads narrow the gap)
  • Lower throughput
  • More expensive per read

Applications:

  • Full-length BCR sequencing
  • Resolving complex rearrangements
  • Phasing somatic mutations

Bioinformatics Overview

Core Analysis Steps

  1. Quality control: Trim adapters, filter low-quality reads
  2. UMI processing: Group by UMI, build consensus
  3. V(D)J annotation: Align to germline references (IMGT, OGRDB)
  4. CDR3 extraction: Identify junction sequence
  5. Clonotype calling: Group sequences into clones
  6. Repertoire metrics: Diversity, clonality, V gene usage
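
Steps 5 and 6 can be sketched as follows. The example assumes annotated records carrying v_call, j_call, and junction_aa values and uses an exact-match clonotype definition (same V gene, J gene, and CDR3 amino acid sequence). That is a common convention for TCR data, whereas BCR data usually needs similarity-based clustering to account for somatic hypermutation. Clonality is computed here as 1 minus normalized Shannon entropy, one of several definitions in use.

```python
import math
from collections import Counter


def call_clonotypes(records):
    """Count records per (V gene, J gene, CDR3 amino acid) combination."""
    return Counter((r["v_call"], r["j_call"], r["junction_aa"]) for r in records)


def shannon_entropy(counts):
    """Shannon entropy (natural log) of a list of clonotype counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def clonality(counts):
    """1 - normalized entropy: 0 = perfectly even, 1 = a single clonotype."""
    counts = [c for c in counts if c > 0]
    if len(counts) < 2:
        return 1.0
    return 1.0 - shannon_entropy(counts) / math.log(len(counts))


# Toy records with made-up annotations:
toy_records = [
    {"v_call": "TRBV7-9", "j_call": "TRBJ2-7", "junction_aa": "CASSLGQGYEQYF"},
    {"v_call": "TRBV7-9", "j_call": "TRBJ2-7", "junction_aa": "CASSLGQGYEQYF"},
    {"v_call": "TRBV7-9", "j_call": "TRBJ2-7", "junction_aa": "CASSLGQGYEQYF"},
    {"v_call": "TRBV20-1", "j_call": "TRBJ1-1", "junction_aa": "CSARDRTEAFF"},
]
clonotypes = call_clonotypes(toy_records)
sample_clonality = clonality(list(clonotypes.values()))
v_gene_usage = Counter(r["v_call"] for r in toy_records)
```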

Key Tools

Tool             | Function                        | Link
MiXCR            | Full pipeline, V(D)J alignment  | mixcr.com
IMGT/HighV-QUEST | Reference V(D)J annotation      | imgt.org
IgBLAST          | NCBI’s V(D)J aligner            | ncbi.nlm.nih.gov/igblast
Immcantation     | BCR analysis suite              | immcantation.readthedocs.io
scRepertoire     | Single-cell repertoire analysis | github.com/ncborcherding/scRepertoire
immunarch        | R package for analysis          | immunarch.com
VDJtools         | Repertoire manipulation         | github.com/mikessh/vdjtools

Data Standards

AIRR Community Standards define:

  • Required fields for repertoire data
  • File formats (AIRR TSV)
  • Minimal reporting requirements
  • Germline database standards
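
Because AIRR TSV is plain tab-separated text, it can be read with the Python standard library alone. The sketch below uses a small made-up table with a handful of column names from the AIRR Rearrangement schema (sequence_id, v_call, j_call, junction_aa, productive) and relies on the AIRR convention of encoding booleans as T and F. Real files carry many more columns, and the dedicated airr library is a better choice for validation.

```python
import csv
import io

# Made-up example content; real AIRR TSV files contain many more columns.
EXAMPLE_TSV = (
    "sequence_id\tv_call\tj_call\tjunction_aa\tproductive\n"
    "seq1\tTRBV7-9*01\tTRBJ2-7*01\tCASSLGQGYEQYF\tT\n"
    "seq2\tTRBV20-1*01\tTRBJ1-1*01\tCSARDRTEAFF\tT\n"
    "seq3\tTRBV5-1*01\tTRBJ2-1*01\t\tF\n"
)


def read_productive(handle):
    """Return the rows of an AIRR-style TSV that are flagged as productive."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [row for row in reader if row.get("productive") == "T"]


productive_rows = read_productive(io.StringIO(EXAMPLE_TSV))
junctions = [row["junction_aa"] for row in productive_rows]
```

To read from disk, pass an open file handle instead of the StringIO object, for example one opened with newline="" as the csv module recommends.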

Choosing the Right Method

Application           | Recommended Method   | Rationale
MRD monitoring        | Bulk deep sequencing | Need high sensitivity, known clone
Vaccine response      | Bulk + UMIs          | Population-level changes
Autoimmune disease    | Single-cell          | Need paired chains for specificity
TCR-T development     | Single-cell          | Must have correct pairing
Transplant monitoring | Bulk or paired       | Depends on clinical question
Large cohort studies  | Bulk                 | Cost-effective at scale
Rare cell analysis    | Single-cell          | Preserve chain pairing

Emerging Technologies

Spatial Transcriptomics + Repertoire

  • Map clonotypes to tissue locations
  • Understand immune architecture
  • Technologies: 10x Visium, NanoString CosMx

Long-Read Single-Cell

  • PacBio Kinnex + single-cell barcoding
  • Full-length isoform resolution

Direct RNA Sequencing

  • Oxford Nanopore
  • No reverse transcription bias
  • Detect RNA modifications

Further Reading