Component: NGS of full length HLA genes: Preliminary results of the Pilot Study

Lisa E Creary1, Steven J Mack2 and Marcelo Fernandez-Vina1
1Department of Pathology, Stanford Blood Center
2Children’s Hospital Oakland Research Institute

Overview

The ultimate goal of the 17th International HLA and Immunogenetics Workshop (IHIW) is to advance the fields of Histocompatibility and Immunogenetics (H & I) research through the application of Next-Generation Sequencing (NGS) technologies for HLA and KIR genotyping, and to advance the development of NGS technologies tailored to meet the needs of the H & I community.

In 2014, we initiated an international multi-center pilot study in order to assess the performance of various NGS protocols, platforms, and software for full gene typing of classical class I (HLA-A, -B, -C) and class II (HLA-DPA1, -DPB1, -DQA1, -DQB1, -DRB1, -DRB3, -DRB4, -DRB5) genes.

The specific aims of this study were four-fold:

  • Evaluate the performance of different NGS protocols and platforms, and identify the limitations/nuances specific to each method.
  • Evaluate software programs for analysis of sequence data and assignment of HLA genotypes.
  • Inform the design of optimized methods for the exchange and storage of NGS data. A goal of the workshop is to store HLA genotyping data in a format for reanalysis. This format should allow for simple and systematic examination and comparison of genotypes obtained by different protocols and platforms.
  • Clone and sequence all class I and class II alleles in a quality control (QC) reference panel. These results will constitute an unambiguous reference for the evaluation of NGS reagent/platform combinations. In addition, these reference data will contribute to the necessary completion of full-gene sequences for common alleles and aid in identifying novel alleles resulting in unambiguous HLA genotypes.

 

Methods

We initially surveyed twenty-five interested laboratories, using a questionnaire to gather information about the number and type of HLA genes that participants were able to sequence, the NGS protocols and instrumentation used, the software analysis packages used (e.g., commercial or in-house), and the types of output file formats. Fifty blinded QC genomic DNA samples derived from cell lines, collected in previous IHIWs and supplied by the Fred Hutchinson Cancer Research Center, Seattle, WA (http://www.ihwg.org), were distributed to seventeen laboratories. The samples were genotyped by fifteen laboratories located worldwide, where groups within the same laboratory that applied different platforms and/or reagents were counted separately (Table 1). The QC panel was selected to represent a wide range of HLA allele families and to include common and well-documented (CWD) alleles, rare alleles, null alleles, and samples homozygous for at least one locus. The QC cell lines had previously been typed for some, but not all, HLA genes by Sanger sequence-based typing (SBT), sequence-specific primers (SSP), sequence-specific oligonucleotide (SSO) probes, and serological and cellular methods. One of the fifteen laboratories also cloned and determined the nucleotide sequences of the majority of the HLA alleles in the QC panel. Genotype results were collated over a period of 10 months. Primary sequencing data (e.g., FASTQ files) were also collected from five laboratories.

 

Table 1. NGS HLA Pilot Study participating laboratories
Laboratory
Anthony Nolan, London, UK
BFR, Beijing, China
GenDx, Utrecht, Netherlands
Georgetown University, Washington DC, USA
H&I Laboratory, Nantes, France
Royal Perth Hospital, Perth, Australia
Stanford Blood Center, Group 1, CA, USA
Stanford Blood Center, Group 2, CA, USA
Stanford Blood Center, Group 3, CA, USA
Transplantation and Immunology, Tuebingen, Germany
Transplantation Immunology, Ulm, Germany
UCLA, CA, USA
UNC-Chapel Hill, NC, USA
Uppsala University, Uppsala, Sweden
University of Vienna, Vienna, Austria

 

Table 2 shows the sequencing platforms that the participating laboratories used to perform NGS-based HLA typing.

Table 2. NGS platforms used by participating laboratories
Platform                                    Number of laboratories
Roche GS Junior                             1
Illumina MiSeq                              8
Ion Torrent Personal Genome Machine (PGM)   4
Pacific Biosciences                         2

 

Results

Genotyping

Genotyping results are shown in Tables 3 and 4. All 15 laboratories performed full-length gene sequencing of HLA-A, -B and -C alleles. For HLA-A, 32 different alleles were typed; two of these alleles were reported as novel intronic variants. HLA-B typing identified 54 alleles, including one allele with a novel exon variant and eight with novel intronic variants. HLA-C typing identified 31 alleles, of which two included novel exon variants. Only four laboratories typed DPA1; collectively, they identified 11 different DPA1 alleles. DPB1 was genotyped by 10 laboratories, and 19 unique alleles were sequenced. Thirteen laboratories typed DQB1 and six laboratories typed DQA1, identifying 19 and 23 unique alleles, respectively. For DRB1, a total of 13 laboratories used different primer pairs that generated various ranges of gene coverage (full gene; exons 2 and 3 only; or exons 2, 3, and 4 only), and identified a total of 39 unique alleles. Five laboratories typed DRB3 (number of alleles identified, n = 4), DRB4 (n = 4), and DRB5 (n = 3).

 

Table 3. Total number of heterozygous and homozygous alleles in the QC panel
HLA locus   Heterozygous   Homozygous   Total
A           72             28           100
B           78             22           100
C           72             28           100
DPA1        42             58           100
DPB1        54             46           100
DQA1        74             26           100
DQB1        68             32           100
DRB1        80             20           100
DRB3        10             48           58
DRB4        10             36           46
DRB5        0              12           12

 

Table 4. The number of unique and novel alleles identified by participating laboratories
Locus   Alleles   Novel exon variants   Novel intron variants
A       32        0                     2
B       54        1                     8
C       31        2                     0
DPA1    11        0                     0
DPB1    19        0                     0
DQA1    23        0                     0
DQB1    19        0                     0
DRB1    39        0                     0
DRB3    4         0                     0
DRB4    4         0                     0
DRB5    3         0                     0

 

Cloning

Cloning experiments to generate unambiguous full-length gene sequences were conducted at the Stanford Blood Center. Class I loci in all 50 QC samples were cloned and sequenced at full genomic length using Illumina NGS. Cloning and sequencing extended or confirmed allele sequence diversity at all HLA loci. Novel alleles, or alleles with extended genomic sequences, were identified for 12 HLA-A, 14 HLA-B, and 2 HLA-C alleles, and for 7 DPA1 and 15 DPB1 alleles. Twenty DQA1 alleles and 9 DQB1 alleles were successfully cloned and sequenced at full genomic length. For DRB1, 100 alleles were cloned and sequenced from exons 1 to 2 and from exons 2 through 6; forty-one of these DRB1 alleles were identified as novel/extended. For DRB3, DRB4, and DRB5, 14, 6, and 6 alleles, respectively, were found to be novel.

 

Concordance

Because the reference genotypes for the QC panel were incomplete and of low resolution, concordance rates were calculated by comparing the NGS genotypes from each individual laboratory with consensus genotypes derived across all laboratories (including the cloned data). Testing laboratories were coded A through O (Table 5). Concordance was determined at 2-field resolution. Concordance rates were high for most laboratories and for most HLA loci. Only one laboratory (E) had a low concordance rate, and only for one locus (DQA1, 28.0%).
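
The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in the study: the report does not say how the consensus genotypes were derived, so the sketch assumes a simple majority vote across laboratories, counts concordance per allele, and ignores expression suffixes such as N or L when truncating to 2 fields. The function names (two_field, normalize, consensus, concordance) and the example calls are hypothetical. Cloned data could be entered as one more "laboratory" in the input.

```python
from collections import Counter


def two_field(allele):
    """Truncate an allele name to its first two fields, e.g. 'A*01:01:01:01' -> 'A*01:01'.

    Simplification: expression suffixes on 3- and 4-field names are dropped.
    """
    return ":".join(allele.split(":")[:2])


def normalize(genotype):
    """Return a sorted tuple of 2-field alleles so that allele order does not matter."""
    return tuple(sorted(two_field(a) for a in genotype))


def consensus(calls):
    """Assumed rule: the most frequent genotype wins; a tie (or no calls) yields None."""
    top = Counter(calls).most_common(2)
    if not top:
        return None
    if len(top) == 2 and top[0][1] == top[1][1]:
        return None  # no consensus, analogous to an 'NC' entry
    return top[0][0]


def concordance(lab_calls):
    """lab_calls maps lab -> {sample: (allele1, allele2)} for a single locus.

    Returns lab -> percentage of alleles matching the consensus genotype,
    or None if no sample typed by that lab has a consensus.
    """
    samples = {s for calls in lab_calls.values() for s in calls}
    cons = {
        s: consensus([normalize(c[s]) for c in lab_calls.values() if s in c])
        for s in samples
    }
    rates = {}
    for lab, calls in lab_calls.items():
        matched = total = 0
        for sample, genotype in calls.items():
            if cons[sample] is None:
                continue  # skip samples without a consensus genotype
            # Multiset intersection counts how many alleles agree.
            shared = Counter(normalize(genotype)) & Counter(cons[sample])
            matched += sum(shared.values())
            total += len(cons[sample])
        rates[lab] = round(100.0 * matched / total, 1) if total else None
    return rates


# Hypothetical HLA-A calls from three laboratories for two samples.
example = {
    "A": {"s1": ("A*01:01:01:01", "A*02:01:01"), "s2": ("A*03:01", "A*03:01")},
    "B": {"s1": ("A*02:01", "A*01:01"), "s2": ("A*03:01:01", "A*03:01")},
    "C": {"s1": ("A*01:01", "A*02:05"), "s2": ("A*03:01", "A*03:01")},
}
print(concordance(example))  # {'A': 100.0, 'B': 100.0, 'C': 75.0}
```

In this toy input, laboratory C differs from the majority at one of four alleles, which gives 75.0%. Counting per allele rather than per sample is a design choice made here because Table 3 reports 100 alleles per locus for the 50 samples; the study's actual denominators may differ.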

 

Table 5. Concordance rates (%) of participants' NGS genotypes compared with consensus genotypes
Group  HLA-A     HLA-B     HLA-C     HLA-DPA1  HLA-DPB1  HLA-DQA1  HLA-DQB1  HLA-DRB1  HLA-DRB3  HLA-DRB4  HLA-DRB5
A      100       100       100       98.0      100       100       100       100       100       NC        100
B      100       100       100       99.0      100       98.0      99.0      98.0      100       NC        100
C      98.0      97.0      100       99.0      100       99.0      100       97.0      100       NC        83.3
D      99.0      100       100       100       100       98.0      100       98.0      100       NC        100
E      98.0      98.0      94.0      NT        100       28.0      100       100       100       NC        100
F      93.2      98.9      100       NT        98.9      98.9      97.8      93.3      NT        NT        NT
G      100       100       100       NT        100       NT        100       100       NT        NT        NT
H      100       100       100       NT        NT        NT        100       100       NT        NT        NT
I      100       100       100       NT        99.0      NT        98.0      98.0      NT        NT        NT
J      96.6      93.0      95.5      NT        95.9      NT        94.4      90.9      NT        NT        NT
K      100       100       100       NT        NT        NT        100       92.7      NT        NT        NT
L      100       94.0      100       NT        NT        NT        95.0      94.0      NT        NT        NT
M      97.6      96.3      95.0      NT        100       NT        96.8      96.9      NT        NT        NT
N      100       100       100       NT        NT        NT        NT        NT        NT        NT        NT
O      98.0      96.0      98.0      NT        NT        NT        NT        NT        NT        NT        NT

 

NT = not tested

NC = not calculated (unable to generate a consensus genotype)

 

The results of this study allow us to conclude that HLA typing by various NGS methods is feasible and that accurate results can be obtained with a variety of library preparation protocols, platforms, and software. We have applied the information gained in this effort to design the 17th IHIW database, and to develop strategies for collecting HLA genotype data and storing primary data so that re-analyses can be performed as a workshop activity.

We thank the Fred Hutchinson Cancer Research Center for supplying the QC samples, and all laboratories that participated in the study.
