2.1. DDX41
RNA helicases are a family of enzymes that remodel RNA-RNA or RNA-protein interactions in an NTP-dependent manner. Humans have more than 70 helicases, which are classified into superfamily (SF) 1 and SF2 based on differences in sequence motifs within the helicase core domain [18,19]. SF1 includes the Upf1-like RNA helicases, while SF2 includes the DEAD-box, DEAH-box/RNA helicase A-like, Ski2-like, and RIG-I-like families, of which the DEAD-box family is the largest. DEAH-box RNA helicases are thought to translocate along the substrate RNA to remodel it, whereas DEAD-box RNA helicases unwind substrate RNA locally. Although their mechanisms of action thus differ, both play roles in virtually all processes that require RNA conformational changes, such as RNA transport, translation, RNA degradation, RNA splicing, and ribosome synthesis. Because a single RNA helicase often acts in multiple cellular processes, it remains difficult to fully elucidate the pathogenesis of diseases caused by abnormalities in RNA helicases.
In myeloid neoplasms, pathogenic variants in the gene encoding DDX41, a DEAD-box RNA helicase, are found in about 5% of cases [
20]. It was recently shown that up to 13% of myeloid neoplasms have a genetic background [
21], of which
DDX41 variants account for about 80% of cases. MDS and AML develop when individuals carrying a heterozygous germline frameshift variant, or a missense variant within the DEAD-box domain of
DDX41, later acquire a somatic variant in the other allele, typically p.R525H (or, in a few cases, p.G530D or others) within the helicase domain [
20,
22,
23] (
Figure 1A). While many myeloid neoplasms with a genetic background develop at younger ages than those without a known genetic background, myeloid neoplasms with
DDX41 variants are characterized by a late disease onset (mean age, 65 years) [
24,
25], which may have hindered the identification of this gene as one responsible for genetic predisposition to myeloid leukemogenesis. In addition, disease associated with a
DDX41 variant is characterized by male predominance, fewer proliferating tumor cells, hypoplastic bone marrow, and a distinct pattern of co-occurring gene mutations compared with other myeloid neoplasms [
26,
27]; in many cases,
DDX41 variants are the only variants identified [
20], suggesting a unique disease pathogenesis of myeloid neoplasms with
DDX41 variants. In contrast, the disease phenotype may differ between cases with a single
DDX41 variant and biallelic variants [
28], and one report suggests that there is no clear difference in disease phenotype between cases with known pathogenic
DDX41 variants and variants of unknown significance (VUS) [
29]. Consequently, it is necessary to establish a validation system and database that allow the significance of individual variants to be interpreted accurately.
Figure 1.
Involvement of DDX41 variants in myeloid leukemogenesis.
The prognosis of myeloid neoplasms with
DDX41 variants is not necessarily worse than that of myeloid neoplasms without a known genetic background, despite their tendency to be categorized as high-risk. However, the development of disease at advanced ages often makes intensive treatment difficult. Several cases of donor-derived secondary leukemia in patients who received allogeneic hematopoietic stem cell transplantation (HSCT) have been reported [
30,
31,
32,
33]; treatment decisions therefore require careful consideration of genetic background. Recent reports describe the development of acute lymphocytic leukemia and solid cancers in individuals with
DDX41 variants [
34,
35], but the extent to which
DDX41 variants are involved in such diseases remains controversial [
23].
DDX41 has been shown to be essential for hematopoiesis, with homozygous
Ddx41 knockout mice being embryonic lethal, although heterozygous mice show no remarkable abnormalities [
36,
37]. Several mechanisms have been proposed for the actions of
DDX41 variants in the development of myeloid neoplasms. It has been reported that R-loops, nucleic acid structures on the genome consisting of a DNA:RNA hybrid and single-stranded DNA, aberrantly accumulate in MDS with RNA splicing abnormalities, regardless of which gene is responsible [
38,
39,
40,
41], and that R-loop accumulation causes DNA replication stress, DNA damage, and abnormal mitosis. Recently,
DDX41 has also been shown to be involved in R-loop regulation [
42,
43,
44], and it is suggested that R-loop accumulation due to dysfunction or decreased expression of
DDX41 is involved in impaired hematopoiesis and aberrant innate immune responses (
Figure 1B). One of the major functions of
DDX41 is RNA splicing [
45]. However, considering that
DDX41 variants give rise to de novo AML in addition to MDS,
DDX41 is thought to play different roles from those of typical RNA splicing factors associated with MDS development. Indeed, while SRSF2, SF3B1, and U2AF1 are all involved in the recognition of pre-mRNA 3’ splice sites with U2 snRNP [
46],
DDX41 has been shown to be incorporated into the spliceosome at the C complex stage, a late complex of the activated spliceosome [
44,
47]. Regarding the relationship between
DDX41 and R-loops, there are reports showing that
DDX41 can unwind R-loops on its own [
43,
48], while it has also been suggested that impaired
DDX41 function leads to reduced efficiency of RNA splicing, thus resulting in conditions that facilitate R-loop formation [
44]. The accumulation of R-loops has been shown to provoke an excessive innate immune response mediated by the cGAS-STING signaling pathway, consequently increasing hematopoietic stem/progenitor cells [
42]. However, the mechanisms by which R-loops activate the cGAS-STING pathway remain unclear. Recently, it was reported that DNA:RNA hybrids derived from R-loops are transported to the cytoplasm, where they trigger an innate immune response [
49]. The relevance of this observation to impaired hematopoiesis caused by
DDX41 variants is of interest.
DDX41 is also reported to promote the processing of small nucleolar RNA (snoRNA) from introns [
37]. Some snoRNAs are encoded within introns of ribosomal protein genes and mature after being processed from these introns [
50,
51]. snoRNAs are classified into box C/D and box H/ACA types according to their sequences; the former guide 2'-O-methylation and the latter guide pseudouridylation of uridine residues in ribosomal RNA, thereby promoting ribosome biogenesis. Thus, loss of function or expression of
DDX41 impairs ribosomal biogenesis [
27,
52]. Although other research groups have also reported the involvement of DDX41 in ribosomal biogenesis, the step at which DDX41 acts in those reports may differ from snoRNA processing.
Recently, myeloid neoplasms with germline
DDX41 variants were shown to have a higher proportion of somatic
CUX1 variants compared with those without a known germline background [
20]. CUX1 is a transcription factor [
53] that has also been shown to be directly involved in DNA damage repair by recruiting histone-modifying enzymes to damaged DNA regions [
54]. Given that cells lacking sufficient CUX1 function can enter mitosis without completing DNA damage repair, the likelihood that loss of
DDX41 function or expression causes DNA replication stress would be further increased. However, further studies are clearly needed to fully elucidate the mechanisms by which
DDX41 variants lead to myeloid neoplasms.
- A.
A combination of germline and somatic DDX41 variants confers myeloid disease development.
Hematopoietic cells with a germline
DDX41 variant acquire a somatic
DDX41 variant at an advanced age. Myeloid neoplasms are thought to develop shortly after biallelic
DDX41 variant acquisition, with or without the addition of a limited number of somatic mutations in DNA repair-related genes, including
CUX1 and
TP53. It is also suggested that minor clones with biallelic
DDX41 variants affect hematopoiesis by interfering with other cells [
37].
- B.
R-loop formation and its consequence.
R-loop accumulation due to impaired RNA splicing or other causes increases DNA replication stress and the innate immune response, resulting in deficient hematopoiesis and leukemogenesis.
2.2. TP53
TP53 is one of the most frequently mutated genes, especially in adult-onset cancers. Genome sequencing of various human cancer cells has revealed that 42% of cases carry
TP53 variants [
55]. The p53 protein is a transcription factor that can activate the expression of multiple target genes, plays an important role in the regulation of the cell cycle, apoptosis, and genomic stability, and is widely known as “the guardian of the genome” [
56,
57]. The evidence accumulated to date suggests that p53 also regulates cell metabolism, ferroptosis, the tumor microenvironment, and autophagy, each of which contributes to tumor suppression [
57]. Genomic instability caused by deletions and variants in
TP53 may lead to the accumulation of additional oncogenic alterations and promote tumorigenesis, tumor growth, metastasis, and drug resistance [
58]. p53 variants confer metabolic plasticity to cancer cells, promoting adaptation to metabolic stress and increasing the possibility of proliferation and metastasis [
59].
The major type of
TP53 variant is a missense variant producing a single amino acid substitution, with the DNA-binding domain (DBD) being the most mutated region [
60]. Variants that perturb the protein structure can reduce the thermostability of p53, resulting in misfolding at physiological temperatures and loss of its ability to bind DNA [
61]. The resulting mutant proteins not only bind wild-type p53 and exert dominant-negative (DN) effects but may also acquire oncogenic activity through gain of function (GOF) [
62,
63]. p53 is mutated and inactivated in most malignancies, making it a very attractive target for the development of new anti-cancer drugs [
64]. Until recently, however, p53 was considered an undruggable target, and the progress made in p53-targeted therapeutics has been limited.
Li-Fraumeni syndrome (LFS) is caused by a germline variant in the
TP53 gene and is characterized by an increased risk of developing various solid tumors and hematologic malignancies at a young age [
65,
66]. LFS is inherited in an autosomal dominant manner, although de novo instances occur in 7–20% of cases. The tumor spectrum includes soft-tissue sarcomas, premenopausal breast cancer, central nervous system tumors, adrenocortical carcinomas, and pancreatic tumors, as well as MDS and lymphoid and myeloid malignancies. Germline
TP53 variants are found in approximately 50% of pediatric patients with hypodiploid acute lymphoblastic leukemia (ALL) and are associated with poor outcomes [
67,
68]. In Li-Fraumeni families, leukemia is relatively uncommon, with only approximately 4% of children and adolescents presenting with hypodiploid ALL or with treatment-related or de novo MDS/AML [
69].
Figure 2.
Role of p53 variants in cancer. p53 variants produce drug resistance, dominant-negative effects on wild-type p53, proteasome repression, and loss of function of wild-type p53. Gain-of-function (GOF) variants promote various cellular responses such as carcinogenesis, cancer cell proliferation, invasion, metastasis, tumor microenvironment establishment, genomic instability, and metabolic reprogramming.
2.3. CEBPA
The
CEBPA gene is located on chromosome 19q13.1, and its variants are a common genetic alteration in AML. Patients present with de novo AML [French-American-British (FAB) classification: AML M1, M2, and M4 subtypes] and a group of differentiation abnormalities [
70]. Germline variants are generally frameshift or nonsense variants near the amino terminus of the encoded protein; a somatic variant in
CEBPA often arises in the other allele, resulting in biallelic variants in
CEBPA. This triggers the development of AML [
71].
CEBPA-associated familial AML is defined as the presence of heterozygous germline
CEBPA pathogenic variants in AML patients and/or in families with one or more AML patients. In contrast, sporadic
CEBPA-associated AML is defined as AML in which the
CEBPA pathogenic variant is identified in leukemic cells and not in non-leukemic cells [
72]. AML with germline
CEBPA variants generally shows autosomal dominant inheritance and arises without preceding abnormal blood cell counts or myelodysplasia [
73]. Approximately 10% of
CEBPA-associated AMLs have been shown to carry germline
CEBPA variants [
2]. In contrast to the incomplete penetrance observed in other HHMS, germline
CEBPA variants cause AML with almost complete penetrance (lifetime risk estimated to be >80%) [
74]. In the majority of
CEBPA-associated familial AML, the age of onset appears to be earlier than in sporadic
CEBPA-associated AML [
72]. Onset usually occurs when patients are in their 20s or 30s, and many patients develop AML before 50 years of age; the median age of onset for AML is 25 years [
75]. The prognosis of
CEBPA-associated familial AML appears to be better than that of sporadic
CEBPA-associated AML [
76,
77]. Patients with
CEBPA-associated familial AML whose initial disease has been cured are at high risk of developing additional independent leukemic episodes, in addition to the risk of relapse from a pre-existing clone; this pattern of progression likely explains the clinical observation that AML patients with
CEBPA variants are more likely to develop a secondary leukemia despite their favorable prognosis [
78]. Lifelong surveillance is recommended in patients with familial AML because of the high risk of late leukemia relapse [
16]. Allogeneic or consanguineous donors should not be used for HSCT without first testing the donor for the germline
CEBPA pathogenic variant [
79].
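The distinction drawn above between familial and sporadic CEBPA-associated AML depends only on where the pathogenic variant is detected, so it can be written as a small decision rule. The following is a minimal illustrative sketch, not a clinical tool: the `CebpaFinding` record, its field names, and the returned labels are hypothetical and were introduced here only to restate the definitions in this section.

```python
from dataclasses import dataclass


@dataclass
class CebpaFinding:
    """Hypothetical record of where a pathogenic CEBPA variant was detected."""
    in_leukemic_cells: bool      # variant detected in leukemic cells
    in_non_leukemic_cells: bool  # variant detected in non-leukemic (germline) tissue


def classify_cebpa_aml(finding: CebpaFinding) -> str:
    """Restate the section's definitions as a decision rule (illustrative only)."""
    # A variant present in non-leukemic cells is germline: familial AML.
    if finding.in_non_leukemic_cells:
        return "familial"
    # A variant confined to leukemic cells is somatic only: sporadic AML.
    if finding.in_leukemic_cells:
        return "sporadic"
    # No pathogenic CEBPA variant detected in either compartment.
    return "not CEBPA-associated"


if __name__ == "__main__":
    print(classify_cebpa_aml(CebpaFinding(True, True)))   # germline carrier
    print(classify_cebpa_aml(CebpaFinding(True, False)))  # somatic only
```

In this sketch, a potential related HSCT donor who carries the familial variant would also be reported as "familial", which mirrors the reason for testing donors before transplantation.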