Introduction
Amyloid fibrils are insoluble protein aggregates with a characteristic appearance under the electron microscope [1]. These fibrils are best known as the hallmark of several human diseases, which are collectively known as amyloidosis [2,3]. Amyloid fibrils are most often homopolymers of a single precursor protein, held together in a cross-β structure by intermolecular hydrogen bonding between backbone amides. The precursor proteins for most human amyloid diseases are secreted from their cell of origin into circulation before aggregating in distal tissues. Amyloid precursors are often covalently modified after translation, and these post-translational modifications (PTMs) may be incorporated into the structure of the amyloid fibril [4,5]. Commonly observed modifications in secreted proteins include disulfide bonds, proteolytic cleavage and N-glycosylation [6]. The effects of such modifications on the propensity of precursor proteins to form amyloid fibrils are not well understood. Here, we examine the role of N-glycosylation, which is the covalent attachment of sugar molecules (glycans) to the amide side chains of asparagine residues, in amyloid light chain (AL) amyloidosis.
AL amyloidosis is caused by aggregation of antibody light chain (LC) proteins, or their fragments, as amyloid fibrils in multiple tissues [7,8,9]. It is progressive and fatal if untreated. The precursor LCs are secreted by an aberrant clonal population of B cells, most commonly bone marrow plasma cells. These cells secrete a single LC sequence, referred to as monoclonal to distinguish it from the polyclonal immune repertoire. The LCs are subunits of mature antibodies, which have undergone V(D)J recombination and somatic hypermutation for antigen selection, so that each individual with AL amyloidosis has an essentially unique amyloid precursor protein [8]. LCs may circulate as a complete antibody or as a free LC, without a heavy chain partner [10]. Only a minority of clonal LCs cause clinically relevant amyloid deposition, whereas most are efficiently removed from circulation. In multiple myeloma (MM), monoclonal LCs circulate at elevated concentrations but generally do not cause symptomatic amyloid deposition [11,12]. Therefore, the unique sequence of each LC, including its PTMs, appears to determine whether it will form amyloid fibrils and lead to AL amyloidosis.
Humans have two LC isotypes, kappa (κ) and lambda (λ), one of which is selected within each B cell clone. The IGK and IGL genomic loci, which encode κ and λ LCs, respectively, comprise multiple variable (IGVL), joining (IGJL) and constant (IGCL) precursor germline gene fragments that are recombined to produce a functional gene. LCs derived from a subset of precursor genes are observed in patients with AL amyloidosis more frequently than in the polyclonal repertoire or MM [13,14]. Over 60% of AL clones express a LC derived from the IGKV1-33, IGLV1-44, IGLV2-14, IGLV3-1 and IGLV6-57 genes [14,15]. The precursor gene fragments that are used account for much of the sequence variation between LCs and can be thought of as reference sequences that are then further modified by somatic hypermutation. The protein sequence defines the 3-dimensional structure of the LC and determines which residues are co- or post-translationally N-glycosylated.
Antibody structures are formed from arrays of immunoglobulin (Ig) domains, each comprising two β-sheets linked by a disulfide bond. LCs have an N-terminal variable (VL) domain, involved in antigen recognition, and a C-terminal constant (CL) domain that forms a tight interface with the heavy chain, including an inter-chain disulfide bond. VL-domains are encoded by germline IGVL-IGJL gene rearrangements. Here, we describe each LC as being derived from a specific IGVL gene, either IGKV or IGLV. Within VL-domains, three complementarity determining regions (CDR 1, 2 and 3) with highly diverse sequences form loops that define the antigen binding site of most antibodies. These loops are scaffolded by framework regions (FR 1, 2, 3 and 4), which are more conserved between antibodies. In a prototypic antibody, two heavy chains and two LCs assemble into a heterotetramer with two antigen binding arms (known as Fabs for “fragment antigen binding”) and an effector region (Fc, for “fragment crystallizable”). Each Fab comprises the heavy chain variable and constant 1 domains (VH and CH1) and the full-length LC. Free LCs may circulate as monomers or form homodimers, which can be stabilized by a disulfide bond between CL-domains.
Multiple studies have shown that AL amyloid fibrils are formed from LC-derived peptides in non-native cross-β conformations that have no structural homology with the native, β-sandwich immunoglobulin fold [4,5,16,17,18,19,20]. Therefore, unfolding of the native state after secretion from plasma cells appears to be required for aggregation. LC sequence features may drive amyloid formation by disfavoring the native LC structure, relative to that of the fibril and unstructured intermediate states [21]. Several examples of destabilizing residue changes in amyloidogenic LCs, relative to non-amyloidogenic LCs, have been described [22,23,24].
Although a role for destabilizing mutations in LC amyloidogenesis is well established, the effects of PTMs are less clear. Multiple modifications have been observed in LCs from patients, both within amyloid fibrils and in the circulating precursor LCs. Proteolytic cleavage of parts of the CL-domain, which has a stabilizing effect on LCs [25,26], is almost ubiquitous, although it is not clear whether this occurs before or after aggregation [25,27,28,29,30]. The internal disulfide bond that stabilizes all Ig domains is retained within the structure of the fibrils, although the orientation of the peptide chains is inverted around the disulfide [17]. Cyclization of N-terminal glutamine residues to form pyroglutamate [4] and oxidative modifications of residues including cysteine and methionine [31,32,33] are also observed. N-glycosylation of asparagine residues, which has been observed in many AL-associated LCs [32,34,35,36,37,38], has gained a new prominence recently due to several studies showing that κ LCs from individuals with AL amyloidosis are more likely to be N-glycosylated than κ LCs associated with MM [39,40,41,42,43]. Notably, two high-resolution structures of λ AL amyloid fibrils are N-glycosylated, with electron density for the glycan visible in the maps [4,5]. These observations have led to the hypothesis that N-glycosylation is a biomarker for potentially amyloidogenic LCs, which could aid early diagnosis. However, the reasons for the association of AL amyloidosis with glycosylated LCs are unknown.
All human antibodies are N-glycosylated in their Fc regions [44]. In addition, around 10% of antibodies are N-glycosylated within their antigen-binding regions, apparently for functional reasons [45,46]. N-glycosylation is carried out co- or post-translationally by the oligosaccharyl transferase complex in the endoplasmic reticulum (ER) [10,47]. Most secreted or cell-surface proteins are N-glycosylated at asparagine residues within a short sequence motif or “sequon”, NxS/T, where N is asparagine, x is any residue other than proline and S/T is serine or threonine. Most sequons appear to be introduced into IGVL-IGJL genes by somatic hypermutation [40,46].
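The sequon definition above can be written as a short pattern match. The following is a minimal sketch, not the pipeline used in this study: the function name `find_sequons` and the `allow_cys` option (for the NxC motifs listed in Table 2) are illustrative assumptions, and positions are plain 0-based indices into the input string rather than IMGT positions.

```python
import re


def find_sequons(protein_seq, allow_cys=False):
    """Return (0-based index of Asn, motif) for every NxS/T sequon.

    x may be any residue except proline. If allow_cys is True, the
    NxC motif is reported as well (an illustrative option only).
    """
    third = "STC" if allow_cys else "ST"
    # A lookahead is used so that overlapping motifs such as "NNTS"
    # (NNT and NTS) are both reported.
    pattern = re.compile(rf"(?=(N[^P][{third}]))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(protein_seq.upper())]


# Hypothetical peptide: NGS is a sequon, NPT is not (proline at x),
# and NNTS contains two overlapping sequons.
hits = find_sequons("AANGSAANPTNNTS")
```

A sequon identified in this way is only a potential glycosylation site; whether it is occupied in vivo cannot be determined from sequence alone.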
Attachment of glycans alters the structure and function of glycoproteins [48]. Sugars are hydrophilic and resistant to burial within the hydrophobic cores of proteins, so glycans generally decorate the surface of glycoproteins. Glycans that have been observed on amyloid fibril structures are also exposed to the solvent [4,5,49]. Glycans prevent access to regions of the surface by some potential partners, while providing interaction sites for carbohydrate-binding proteins such as lectins [6,44]. Glycosylation may stabilize or destabilize the protein structure against unfolding, depending on the interactions that the glycan makes with surface residues [50,51,52]. Glycans may also increase protein solubility, by preventing self-association of hydrophobic surfaces [53]. Notably, natural N-glycosylation of antibody Fabs has been observed to increase thermal stability [54]. Based on these features, it seems reasonable to hypothesize that N-glycosylation generally reduces the propensity of proteins to aggregate.
Why, then, is LC N-glycosylation associated with AL amyloidosis? To address this question, we compare the fraction of LCs containing N-glycosylation sequons, and the distribution of sequons within LC sequences, between AL amyloidosis, MM and the polyclonal repertoire. Monoclonal, disease-associated LCs are taken from AL-Base, our database of LC sequences associated with AL amyloidosis and other plasma cell dyscrasias (PCDs), which was recently updated [15]. As an additional control, we examine polyclonal sequences from the Observed Antibody Space (OAS) resource [55,56]. This large number of sequences allows the association between AL amyloidosis and N-glycosylated LCs to be explored in more detail than has previously been possible.
Discussion
Our comparative analysis of monoclonal AL LCs, monoclonal MM LCs and polyclonal OAS LCs is consistent with previous observations that N-glycosylation sequons are significantly enriched among AL κ LCs, compared to non-AL κ LCs [40,41] (Figure 1). For λ LCs, sequons occur at similar rates between AL and non-AL LCs. LCs harboring sequons have more mutations than other LCs (Figure 2). Sequons are significantly more frequent (FDR ≤ 0.05) in AL LCs than in OAS LCs for five precursor genes: IGKV1-16, IGKV1-33, IGKV1-39, IGLV1-51 and IGLV2-14 (Figure 3). The locations of sequons within LC sequences are similar within LCs of the same isotype, predominantly occurring at progenitor N-glycosylation sites where only a single nucleotide change is required to create an NxS/T sequon [46] (Figure 4 and Figure 5). The residues around AL and MM sequons are similar, although AL LCs more frequently have threonine in the +2 position, consistent with efficient N-glycosylation [70] (Figure 6). Among AL LCs, sequons are invariably located at residues that would be exposed to the solvent in an isolated variable domain, but the majority of λ sequons occur in residues that are close to the interface with the LC partner in homodimers and the heavy chain in antibody Fab complexes (Figure 7 and Figure 8). No sequon position is located on the surface in all seven known λ amyloid fibril structures (Figure 9 and Figure 10).
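The idea of a progenitor site, where a single nucleotide change creates an NxS/T sequon, can be illustrated with a brute-force check over a three-codon window. This is a sketch under stated assumptions rather than the method of reference [46]: the names `translate`, `is_sequon` and `is_progenitor` are hypothetical, the standard genetic code is assumed, and the example windows are invented rather than taken from germline genes.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code with codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(nt):
    """Translate an in-frame nucleotide string (length divisible by 3)."""
    return "".join(CODON_TABLE[nt[i:i + 3]] for i in range(0, len(nt), 3))


def is_sequon(tripeptide):
    """True for N-x-S/T where x is neither proline nor a stop codon."""
    n, x, third = tripeptide
    return n == "N" and x not in "P*" and third in "ST"


def is_progenitor(nt9):
    """True if the 9-nt window encodes no sequon, but at least one
    single-nucleotide substitution would create one."""
    nt9 = nt9.upper()
    if len(nt9) != 9:
        raise ValueError("expected a window of exactly three codons")
    if is_sequon(translate(nt9)):
        return False  # already a sequon, so not a progenitor
    for i, original in enumerate(nt9):
        for base in BASES:
            if base == original:
                continue
            mutant = nt9[:i] + base + nt9[i + 1:]
            if is_sequon(translate(mutant)):
                return True
    return False


# Invented examples: K-G-S becomes N-G-S via AAA -> AAT, whereas G-G-S
# needs at least two changes to place asparagine at the first position.
kgs_is_progenitor = is_progenitor("AAAGGTAGT")
ggs_is_progenitor = is_progenitor("GGGGGTAGT")
```

Sliding such a window along an aligned germline sequence would flag candidate positions, although mapping them to IMGT numbering requires an alignment step that is not shown here.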
Glycosylation has been suggested as a biomarker for amyloidosis risk [39,41,42]. It is possible to identify glycosylated monoclonal free LCs from blood or urine without determining their sequence, using biochemical analysis or mass spectrometry [43,84]. Such a test could be implemented using existing technology. However, although sequons are highly enriched among AL κ LCs, most AL amyloidosis clones express λ LCs (Table 1). κ LCs with a sequon account for only 67/684 (9.8%) of all AL LCs and 62/969 (6.4%) of all MM LCs. Because AL amyloidosis is much less prevalent than MM [7], the presence of a glycan or sequon cannot, in itself, provide sufficient information to contribute to diagnosis. Further work is needed to understand the contexts in which N-glycosylation promotes amyloidosis.
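The counts in Table 1 can be turned into odds ratios as a rough check on the κ and λ comparisons. The sketch below computes the sample odds ratio with a Wald 95% confidence interval. The function name `odds_ratio_wald` is hypothetical, and because this section does not state which statistical test underlies the published ORs, the values need not match Figure 1 exactly.

```python
import math


def odds_ratio_wald(a, b, c, d, z=1.96):
    """Sample odds ratio and Wald confidence interval for a 2x2 table.

    a, b: counts with / without a sequon in group 1
    c, d: counts with / without a sequon in group 2
    z:    normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Counts from Table 1 (AL vs. MM), as sequon / no sequon
kappa_or, kappa_lo, kappa_hi = odds_ratio_wald(67, 93, 62, 533)
lambda_or, lambda_lo, lambda_hi = odds_ratio_wald(39, 485, 38, 336)
```

With these counts, the κ interval lies entirely above 1, whereas the λ interval spans 1, in line with the enrichment of sequons being restricted to κ LCs.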
The context of N-glycosylation may be one factor that determines its effect on LCs. Sequons are more frequent in AL LCs derived from a subset of precursor genes (Figure 3 and Figure 5). Therefore, the enrichment of sequons among AL κ LCs is not simply due to the over-representation of precursor genes which are frequently glycosylated. Of note, AL LCs derived from IGLV6-57, which is the gene most strongly associated with AL amyloidosis, are significantly less likely to harbor sequons than IGLV6-57-derived LCs from the polyclonal repertoire (Figure 3C). Nevone and coworkers identified glycosylation within FR3 as a predictor of amyloidosis risk [40]. Our data are consistent with this hypothesis. However, FR3 sequons are enriched in genes that are observed more frequently in AL amyloidosis than in MM (Figure 5). Therefore, the over-representation of FR3 sequons is related to the over-representation of LCs which are frequently glycosylated at these positions. It is not clear whether frequent N-glycosylation in FR3 is a cause or a consequence of the association between IGKV1 genes and AL amyloidosis.
N-glycosylation can increase protein stability and solubility, and burial of hydrophilic glycan moieties within the hydrophobic cores of proteins is thermodynamically unfavorable [48]. The locations of sequons within LCs are not consistent with substantial disruption of the variable domain structure that would lead to increased unfolding-linked aggregation (Figure 7 and Figure 8). Although a sequon could arise by mutation to asparagine of a residue that would otherwise be buried, the low frequency of these events is consistent with the hypothesis that such structures would not fold sufficiently well for surface expression on B cells and subsequent clonal selection. Glycosylation of CDR3 residues may disrupt the interface between the LC and its heavy chain partner sufficiently that less antibody, and more free light chain, is released into circulation. Moreover, CDR3 glycans may also disrupt the homodimerization of free LCs, since two glycans in close proximity may be energetically unfavorable; that is, the reduction in the available space for each glycan carries an entropic cost that could destabilize the homodimer. Dimerization of free LCs can be protective against aggregation, an effect that is exploited by small molecules that stabilize LCs against unfolding and proteolysis [85,86]. Therefore, disruption of such dimers could favor misfolding. We hypothesize that this disruption of LC homodimers by CDR3-linked glycans is a factor in amyloidosis of N-glycosylated λ LCs.
Misfolded glycoproteins are triaged by the calnexin/calreticulin chaperone system within the ER and subject to ER-associated degradation [10]. N-glycosylation may allow unstable LCs that would otherwise misfold within the cell to be exported and go on to aggregate elsewhere. This is analogous to the situation in ATTR amyloidosis, where highly destabilized transthyretin variants are not exported from the liver due to stringent quality control within hepatocytes [87]. Secretion of these LCs would require additional cellular resources, both to synthesize the glycans and to allow their refolding by ATP-dependent chaperones within the ER and, when necessary, proteasomal degradation [10]. Such resources may not be compatible with the more rapid proliferation of MM plasma cells, but may be tolerated by slow-growing AL plasma cells. This mechanism implies that glycosylated LCs are less stable than other LCs, such that they are exported less efficiently without their glycan, which could be tested experimentally. Glycosylated LCs may also impose more load on cells’ proteostasis machinery, potentially making them more sensitive to proteasome inhibition, which is part of the standard of care for AL amyloidosis treatment [7].
Assuming that glycans must be solvated and displayed on the surface of fibrils, each sequon position is compatible with only a subset of known AL fibril structures (Figure 9 and Figure 10). We anticipate that when structures of fibrils formed by glycosylated κ LCs are solved, they will reveal new amyloid folds. The presence of glycans on fibrils may promote or prevent interactions with other factors in ways that enhance the stability of fibrils. For example, glycans may prevent access to fibrils by proteases, or serve as signals that deter phagocytic cells. It is possible that glycosylation could prevent binding of therapeutic antibodies that are undergoing clinical trials as amyloid-depleting therapies [88], so fibrils’ glycosylation status could be an important marker for the use of such therapies.
The major limitation of this study is the relatively small number of PCD LC sequences that contain sequons, which leads to wide confidence intervals on the calculated odds ratios, reducing confidence in the analysis. This limitation is compounded by uncertainty about which sequons are actually N-glycosylated in vivo [40,89]. As in other studies of LCs, the classification of clinical cases as amyloidosis may underestimate the true frequency of amyloid deposition [15]. A fraction of MM cases involves amyloid that may not be considered clinically significant [11,12]. Similarly, OAS LCs may be amyloidogenic in the context of a PCD. However, because AL amyloidosis is rare, it is reasonable to suppose that most LCs are resistant to amyloidogenesis, even as monoclonal proteins. Another limitation is that we only consider the VL-domains of LCs, although N-glycosylation may also occur within the CL-domain. Finally, our structural analysis relies on the correspondence of IMGT numbering with VL-domain structure, which may not hold for all LCs.
In summary, our analysis has identified several previously unreported features of glycosylation among AL LCs. Although there is a strong association between glycosylation and AL for a subset of precursor genes, these LCs represent only a minority of AL amyloidosis cases and the presence of glycosylation should not be considered diagnostic for amyloidosis. Glycosylation is clearly neither necessary nor sufficient for amyloidosis. We suggest several testable hypotheses that might further explain how glycosylated LCs can lead to amyloid disease. We anticipate that characterization of more LCs, both as soluble proteins and amyloid fibrils, will allow these hypotheses to be tested in the near future.
Figure 1.
The fraction of sequences with NxS/T N-glycosylation sequons differs between AL amyloidosis, multiple myeloma (MM) and the polyclonal repertoire, represented by sequences from Observed Antibody Space (OAS). The proportion of LCs associated with AL or MM, or from the OAS repertoire, with or without an NxS/T sequon is shown in dark and light colors, respectively; κ LCs are blue and λ LCs are orange. Odds ratios (OR) for selected comparisons are shown. FDR, false discovery rate. A) All LCs, with ORs for the AL vs. MM and AL vs. OAS comparisons. B) LCs separated by isotype, with ORs for comparisons between κ groups.
Figure 2.
Glycosylation is associated with an increased number of somatic mutations. The number of amino acid residue substitutions, insertions and deletions in each LC VL-domain is shown as a percentage of its length. The box and whisker plots show median (central bars), inter-quartile range (boxes), distance to the non-outlier data (whiskers) and outlying points (circles). Blue and orange denote κ and λ LCs, respectively. Significance values, corrected for multiple testing, are shown for each comparison.
Figure 3.
LCs derived from a subset of genes are significantly enriched in sequons. Numbers of LCs derived from each germline gene that harbor an NxS/T sequon. Data are shown for genes from which at least 10 AL LCs are derived. Counts are shown as a fraction of the total number of LCs derived from that gene. AL, MM and OAS LCs are shown in red, purple and green, respectively. A) The number of LCs associated with each gene for AL, MM and OAS LCs is shown as a fraction of the total number of sequences derived from that gene. B) Fractions of LCs derived from each germline gene that harbor an NxS/T sequon. C) Odds ratios and 95% confidence intervals for the relative frequency of sequons among LCs from different origins where a significant difference was observed (FDR ≤ 0.05). Blue and orange symbols show κ and λ LCs, respectively.
Figure 4.
Positions of sequons within VL-domains. The number of sequons observed at each IMGT position is shown for κ and λ LCs associated with AL and MM, and from the polyclonal OAS repertoire. Orange and black bars show positions where a sequon progenitor is present and absent, respectively, in the assigned germline gene. Yellow and blue lines along the x-axes represent FR and CDR positions, respectively. Positions of sequons in AL and MM LCs are highlighted.
Figure 5.
Location of sequons within LCs among IMGT-defined structural elements. AL, MM and OAS LCs are shown in red, purple and green, respectively. Yellow and blue lines along the x-axes represent FR and CDR positions, respectively.
Figure 6.
Residues around the sequon asparagine are similar between AL and MM LCs. Sequence logos showing the proportion of each residue observed around each sequon. Numbering is relative to the asparagine residue. Colors show the chemical properties of each residue.
Figure 7.
Most sequon positions are compatible with glycosylation. The number of sequons observed at each position is shown by bars, as for Figure 4. Solvent exposure is shown by the color of the bars and by squares under the plots. IMGT positions corresponding to residues on the surface of VL-domains, buried in the VL-domain core, and at the interface with heavy chains are shown in grey, red and blue, respectively. Solvent exposure was determined based on the structures of the isolated germline VL-domains for IGKV1-33 (PDB entry 2Q20) and IGLV2-14 (PDB entry 6MS1), which were used to represent κ and λ LCs, respectively. Hollow bars and gaps represent IMGT positions that are not represented in the structures. Yellow and blue lines along the x-axes represent FR and CDR positions, respectively. Arrows and cylinders below the axes show the positions of β-sheets and α-helices defined in the structures.
Figure 8.
AL sequon positions mapped to LC structures in VL-domain homodimers (A, B) and antibody Fab complexes (C, D). Complexes are oriented so that the LC CDR3 residues (indicated on the figures as CDR3) are at the top and center, and the C-terminal residues, which connect to the CL-domain, are at the bottom left. One LC is shown as a backbone trace, with the Cα positions where sequons are observed shown as spheres. Residues are colored according to solvent exposure: grey, surface-exposed; red, buried in the core; blue, interacting with the partner chain. The second LC of each dimer and the heavy chain of each Fab are shown in surface representation.
Figure 9.
Positions of sequons observed in κ AL LCs, mapped onto the published structures of AL λ amyloid fibrils. The positions of the glycans in the fibril structures of 7NSL and 9FAA are shown with triangles. All structures are oriented so that cysteine 23 is on the upper side of the disulfide and cysteine 104 is on the lower side. Dashed lines indicate missing density from the structures. Residue positions where sequon asparagine residues are observed in AL κ LCs are shown as spheres, colored to indicate the number of sequons at each position (data from Figure 4 and Figure 7). Note that not all IMGT positions are occupied in each structure, so not all the sequon positions can be shown.
Figure 10.
Positions of sequons observed in λ AL LCs, mapped onto the published structures of AL λ amyloid fibrils. The positions of the glycans in the fibril structures of 7NSL and 9FAA are shown with triangles. All structures are oriented so that cysteine 23 is on the upper side of the disulfide and cysteine 104 is on the lower side. Dashed lines indicate missing density from the structures. Residue positions where sequon asparagine residues are observed in AL λ LCs are shown as spheres, colored to indicate the number of sequons at each position (data from Figure 4 and Figure 7). Note that not all IMGT positions are occupied in each structure, so not all the sequon positions can be shown.
Table 1.
Numbers of LC sequences analyzed. Monoclonal LCs are from the AL-Base AL and MM subcategories [15].
Sequence origin | IGKV total | IGKV sequon | IGKV no sequon | IGLV total | IGLV sequon | IGLV no sequon | Total | Total sequon | Total no sequon
AL subcategory | 160 | 67 (41.9%) | 93 (58.1%) | 524 | 39 (7.4%) | 485 (92.6%) | 684 | 106 (15.5%) | 578 (84.5%)
MM subcategory | 595 | 62 (10.4%) | 533 (89.6%) | 374 | 38 (10.2%) | 336 (89.8%) | 969 | 100 (10.3%) | 869 (89.7%)
OAS repertoire | 4,278,425 | 360,785 (8.4%) | 3,917,640 (91.6%) | 3,769,322 | 271,185 (7.2%) | 3,498,137 (92.8%) | 8,047,747 | 631,970 (7.9%) | 7,415,777 (92.1%)
Table 2.
Cysteine sequons (NxC) in AL and MM LCs.
AL-Base subcategory | IGVL gene | Region | Asn position (IMGT) | Sequence | Number of sequences
AL | IGLV2-23 | CDR3 | 114 | NTC | 1
AL | IGLV3-1 | CDR1 | 38 | NAC | 2
AL | IGLV3-1 | CDR1 | 38 | NVC | 2
MM | IGKV1-39 | CDR1 | 36 | NTC | 1