4.3.2. Targeted long-read sequencing using a Nanopore sequencer with adaptive sampling conducted at our institution
We have also used a nanopore sequencer in in-house experiments during the preclinical workup of PGT-M. These cases involved chromosomal structural abnormalities whose breakpoints were difficult to detect using conventional cytogenetic testing, so we attempted structural analysis using long-read sequencing. In particular, we reported a case involving complex structural abnormalities of chromosome X (
Figure 1) [
24]. We performed targeted sequencing on chromosome X using the adaptive sampling method [
49,
50] implemented on the GridION, a desktop nanopore sequencer (
https://nanoporetech.com/gridion). Adaptive sampling is a nanopore-specific technology based on Readfish software [
50] that selects target sequences in real time while a DNA library is being sequenced. Specifying the targets in a FASTA file makes it possible to perform targeted sequencing without specialized library preparation and to obtain sufficient sequencing depth for genomic structural analysis. Adaptive sampling has already proven useful in human genome analysis, having been applied to the diagnosis of Mendelian diseases in which the pathogenic variants had not been identified [
44,
51,
52] and to targeted sequencing for hereditary tumors [
53,
54]. Furthermore, its applicability extends to non-human genomes including metagenome analysis [
55], showcasing a broad range of applications.
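The real-time selection described above can be illustrated with a minimal sketch. It is a conceptual toy, not the Readfish API or the configuration used in our experiments: the `Target` class, the `decide` function, the action labels, and the coordinates are all hypothetical and serve only to show the accept-or-eject logic applied to the first chunk of each read.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Target:
    """A half-open target interval [start, end) on one contig."""
    contig: str
    start: int
    end: int


def decide(mapping: Optional[Tuple[str, int]], targets: List[Target]) -> str:
    """Choose an action for a read whose first chunk has been mapped.

    `mapping` is (contig, position) for the initial chunk, or None when the
    chunk could not be placed yet. The action labels are illustrative.
    """
    if mapping is None:
        return "proceed"  # not enough information yet: keep sequencing and retry
    contig, pos = mapping
    for target in targets:
        if target.contig == contig and target.start <= pos < target.end:
            return "stop_receiving"  # on target: sequence the molecule to the end
    return "unblock"  # off target: eject the molecule and free the pore


# Hypothetical target region (coordinates are placeholders, not real loci).
targets = [Target("chrX", 1_000_000, 1_200_000)]

print(decide(("chrX", 1_050_000), targets))
print(decide(("chr1", 1_050_000), targets))
print(decide(None, targets))
```

Because off-target molecules are ejected after only a short stretch has been read, pore time is spent mainly on the target region, which is how depth is enriched without any change to the library.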
The case shown in
Figure 1 involves Pelizaeus–Merzbacher disease (PMD) caused by a duplication of the
PLP1 gene via complex genomic rearrangement, and it is the first case in which a long-read sequencer was used in the setup of PGT-M at our institute. PMD frequently exhibits complex genomic rearrangements, such as a DUP-TRP/INV-DUP structure, formed through flanking segmental duplications [
56,
57]. As the positions of the junctions vary widely, directly confirming the correct junction location requires a combination of cytogenetic analysis methods and is time-consuming. Remarkably, despite the complexity of the chromosomal rearrangement, we directly detected the breakpoints and junctions in a single nanopore sequencing run (
Figure 1A) [
24]. The reads covering the folded rearrangement at Junction1 were challenging to map because of segmental duplications with over 99.9% similarity. However, the genomic structure could be readily inferred from the sequence data (
Figure 1B). Using the sequence of the reads spanning Junction2, we designed a specific PCR assay that directly detected the pathogenic variant only in the carrier mother and the previous child (
Figure 1C). Additionally, following previous reports, we attempted to detect structural variations or pathogenic SNVs and, at the same time, the surrounding SNPs needed for haplotyping. The previously reported methods detected pathogenic variants and surrounding SNPs simultaneously, and thereby enabled haplotyping, by using high-depth sequencing with PromethION, SMRT sequencing, or targeted amplicon sequencing [
19,
20,
21,
22,
23]. However, because of the high sequencing cost of high-throughput long-read sequencers, routine clinical use of this approach for PGT-M is challenging. It remains unclear whether adaptive sampling on the relatively cost-effective GridION sequencer can be used for the preclinical workup of PGT-M.
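The way a junction-spanning read reveals a breakpoint, as with the reads spanning Junction2 above, can be sketched as follows. This is a simplified illustration and not the analysis pipeline used for the reported case: it assumes each read has already been split by an aligner into pieces with known read and reference coordinates, and the `Segment` class, both functions, and all coordinates are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

# (contig, position, strand) on each side of a junction
Junction = Tuple[str, int, str, str, int, str]


@dataclass(frozen=True)
class Segment:
    """One aligned piece of a read (0-based, half-open coordinates)."""
    read_start: int
    read_end: int
    contig: str
    ref_start: int
    ref_end: int
    strand: str  # "+" or "-"


def find_junctions(segments: List[Segment]) -> List[Junction]:
    """Pair consecutive pieces of one read and report where they join."""
    ordered = sorted(segments, key=lambda s: s.read_start)
    junctions = []
    for left, right in zip(ordered, ordered[1:]):
        # Reference position at which the read leaves the left piece.
        left_pos = left.ref_end if left.strand == "+" else left.ref_start
        # Reference position at which the read enters the right piece.
        right_pos = right.ref_start if right.strand == "+" else right.ref_end
        junctions.append((left.contig, left_pos, left.strand,
                          right.contig, right_pos, right.strand))
    return junctions


def classify(junction: Junction) -> str:
    """Give a coarse label based on contig, orientation, and direction of the jump."""
    c1, p1, s1, c2, p2, s2 = junction
    if c1 != c2:
        return "translocation-like"
    if s1 != s2:
        return "inversion-like"
    # A minus-strand read traverses the reference from right to left.
    gap = p2 - p1 if s1 == "+" else p1 - p2
    return "deletion-like" if gap > 0 else "duplication-like"


# Hypothetical read whose two pieces map 200 kb apart on the same strand.
deletion_read = [
    Segment(5000, 9000, "chrX", 305_000, 309_000, "+"),
    Segment(0, 5000, "chrX", 100_000, 105_000, "+"),
]
# Hypothetical read whose second piece maps in the opposite orientation.
inversion_read = [
    Segment(0, 4000, "chrX", 500_000, 504_000, "+"),
    Segment(4000, 8000, "chrX", 520_000, 524_000, "-"),
]

for read in (deletion_read, inversion_read):
    for junction in find_junctions(read):
        print(junction, classify(junction))
```

The two reference positions reported for a junction are the coordinates around which junction-specific PCR primers would be placed. In highly similar segmental duplications the pieces may map ambiguously, so labels of this kind should be treated as candidates that need confirmation.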
We therefore carried out long-read sequencing using GridION with adaptive sampling in several cases to detect pathogenic variants and the surrounding SNPs for haplotyping at the same time. The first case involved ornithine transcarbamylase (OTC) deficiency (
Figure 2). The proband's wife carried a de novo pathogenic missense variant (OTC:c.643C>T) in the
OTC gene. In this case, we set the target FASTA for adaptive sampling to include the
OTC gene and the regions within 100 kbp upstream and downstream of it. OTC deficiency is an X-linked genetic disorder affecting the urea cycle, leading to the accumulation of ammonia and causing neurological deficits [
58,
59], and the couple wished to undergo PGT-M. Because the proband had no affected offspring, a preclinical workup for haplotyping was difficult to implement. For de novo cases, we previously haplotyped pathogenic variants for PGT-M using STR markers, via trio genetic analysis of the couple and their embryos. However, with this method, precise haplotyping is difficult when a parent has post-zygotic de novo germline mosaicism [
12]. We reasoned that long-read sequencing, even when performed with adaptive sampling, could address these challenges associated with de novo cases. Informative reads were obtained around the pathogenic variant (
Figure 2A). SNP candidates distinguishing the pathogenic and benign alleles for use in PGT-M were identified both upstream and downstream of the variant. Sanger sequencing confirmed each SNP, and primers were designed in preparation for embryo testing (
Figure 2B). The result of the proband's SNP haplotyping is presented in
Figure 2C. This adaptive-sampling-based long-read sequencing method was expected to improve the specificity and reliability of the test compared with the STR haplotyping we traditionally employed, since it allows a more precise search for SNP haplotyping markers in the vicinity of the variant. We also conducted a preclinical workup using adaptive sampling for Duchenne muscular dystrophy (DMD), with one case shown in
Figure 3. In this case, we set the target FASTA for adaptive sampling to include the
DMD gene and the regions within 5 kbp upstream and downstream of it. Although a deletion of exons 2-44 had been confirmed via MLPA in this case, the exact genomic position of the deletion was unclear, making direct detection of the pathogenic variant in PGT-M challenging. Multiple attempts were made using long-range PCR, but the large intronic regions made it difficult to identify the junction. The identified deletion and surrounding genomic regions are shown in
Figure 3A, where a decrease in read depth across the deleted region is evident in the IGV view. The observation of discordant reads (
Figure 3B,C) made it easy to identify the junction spanning the deletion as well as the surrounding SNPs, allowing for simultaneous haplotyping (
Figure 3D) and structural analysis. We believe that a preclinical workup for PGT-M is feasible using adaptive sampling with GridION, just as it is using high-throughput sequencing with PromethION or a Sequel system.
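The read-based haplotyping used in the OTC and DMD cases above rests on a simple idea: a long read that covers both the pathogenic variant and a nearby SNP shows which SNP allele lies on the pathogenic chromosome. The sketch below illustrates that idea only. It is not the software used in our workup; the `phase_snps` function, the thresholds, and the toy reads are hypothetical, and any base other than the pathogenic one at the variant position is treated as the benign allele, which is a simplification.

```python
from collections import Counter, defaultdict
from typing import Dict, List

Read = Dict[int, str]  # reference position -> base observed on this read


def phase_snps(reads: List[Read], var_pos: int, alt: str,
               min_reads: int = 3, min_purity: float = 0.9) -> Dict[int, Dict[str, str]]:
    """Report, for each informative SNP, the base linked to each allele."""
    # counts[pos]["alt" or "ref"] holds the bases seen on reads of that haplotype.
    counts = defaultdict(lambda: {"alt": Counter(), "ref": Counter()})
    for read in reads:
        if var_pos not in read:
            continue  # a read that misses the variant cannot link anything to it
        hap = "alt" if read[var_pos] == alt else "ref"
        for pos, base in read.items():
            if pos != var_pos:
                counts[pos][hap][base] += 1

    phased = {}
    for pos, by_hap in counts.items():
        if not by_hap["alt"] or not by_hap["ref"]:
            continue  # the site must be observed on both haplotypes
        alt_base, alt_n = by_hap["alt"].most_common(1)[0]
        ref_base, ref_n = by_hap["ref"].most_common(1)[0]
        alt_total = sum(by_hap["alt"].values())
        ref_total = sum(by_hap["ref"].values())
        informative = (
            alt_base != ref_base  # a homozygous site cannot separate the alleles
            and alt_total >= min_reads and ref_total >= min_reads
            and alt_n / alt_total >= min_purity  # tolerate a few sequencing errors
            and ref_n / ref_total >= min_purity
        )
        if informative:
            phased[pos] = {"pathogenic": alt_base, "benign": ref_base}
    return phased


# Toy reads: position 1000 is the variant (pathogenic base "T"); 900 and 1100 are SNP sites.
reads = (
    [{1000: "T", 900: "A", 1100: "G"}] * 3    # reads from the pathogenic allele
    + [{1000: "C", 900: "G", 1100: "G"}] * 3  # reads from the benign allele
    + [{900: "A"}]                            # read that does not reach the variant
)
result = phase_snps(reads, var_pos=1000, alt="T")
print(result)
```

In this toy input only position 900 is reported, because position 1100 carries the same base on both alleles. The thresholds `min_reads` and `min_purity` are arbitrary placeholders, and candidate SNPs chosen in this way would still need orthogonal confirmation, such as the Sanger validation described above.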
In the implementation of PGT-M in Japan, both the direct detection of pathogenic variants and the haplotyping of pathogenic variants using informative STRs or SNPs are required for each case. This is because only a few cells, typically 5-10, are obtained via biopsy, so whole-genome amplification is necessary [
5,
7,
12]. Whole-genome amplification is prone to allele dropout and amplification bias, so relying on a single marker can lead to misdiagnosis. Therefore, a combination of variant detection methods is employed to ensure the accuracy of PGT-M testing. However, as indicated in the guidelines issued by the ESHRE [
12], many countries and facilities also permit embryo diagnosis based solely on haplotyping, subject to some additional conditions. In our in-house experiments, we encountered a case with complex chromosomal rearrangements for which junction detection was difficult, and even haplotyping using STR markers was challenging. In this case, an intrachromosomal insertion had occurred, and the duplicated region involving
MECP2 was inserted 45 Mb proximal to the original position [
24]. If PGT-M had been performed only via STR haplotyping at the original
MECP2 site in this case, meiotic recombination between that site and the distant insertion could have led to misdiagnosis. A preclinical workup of PGT-M using long-read sequencing may help minimize the risk of such a misdiagnosis.