1. Introduction
Systemic lupus erythematosus (lupus) is a debilitating autoimmune disease with widely varying clinical manifestations, affecting an estimated 3.41 million people [
1,
2]. The pathology of lupus is perpetrated in part by autoreactive B-cells, which produce auto-antibodies against self-antigens such as DNA, nuclear proteins, and other damage-associated molecular patterns (DAMPs) that remain when cells are damaged or undergo apoptosis [
3]. Lupus has a complex diagnostic process, and many patients wait years for a firm diagnosis. Lupus also has a substantial impact on the daily lives of patients and their families due to severe symptoms (kidney failure, severe rash, depression, chronic fatigue, chronic pain, etc.) and limited treatment options, some of which carry undesirable side effects. Hydroxychloroquine, glucocorticoids, and immunosuppressive drugs are effective at attenuating lupus symptoms and have been the standard of treatment for decades, with the recent addition of biologics such as belimumab. Hydroxychloroquine is still considered the cornerstone of most lupus treatment, despite the long-term risk of retinal toxicity, which can contribute to blindness [
4]. Glucocorticoids are highly effective at treating lupus symptoms in the short term, but can cause permanent damage to multiple organ systems, including cataracts, osteoporosis, and coronary artery disease, after long-term use or at higher-than-standard dosing [
5,
6,
7].
Lymphomas are estimated to make up 5% of malignancies worldwide [
8], with ~450,000 new cases of Non-Hodgkin’s Lymphoma annually, leading to 240,000 deaths per year [
9]. In addition, up to 90% of Non-Hodgkin’s lymphomas originate from B-cells [
10]. Many B-cell Non-Hodgkin’s lymphomas (lymphomas), including diffuse large B-cell lymphoma, are aggressive and difficult to treat without severe side effects [
8]. The slower-moving indolent lymphomas are typically considered incurable, and while patients receive milder treatments such as rituximab-bendamustine (R-Benda), they run the long-term risk of cumulative toxicity from repeated chemotherapies and of tumor progression into an aggressive lymphoma subtype [
8]. The first-line therapy for fast-moving lymphomas is aggressive combination chemotherapy, such as R-CHOP (rituximab, cyclophosphamide, doxorubicin, vincristine, and prednisone). In the case of relapse, the first-line treatments are followed by anti-CD19 chimeric antigen receptor (CAR) T-cell therapy or by more intense chemotherapy regimens, all of which require in-patient hospital stays, such as R-ICE (rituximab, ifosfamide, carboplatin, etoposide) or high-dose chemotherapy with autologous stem cell rescue (HD-SCT) regimens such as the Nordic protocol (administration of maxi R–CHOP alternating with rituximab and high-dose cytarabine) [
8]. Though some sources estimate lymphoma patient survival to be as high as 72%, survival comes at a heavy cost to the body and quality of life of patients, who must contend with toxicity from chemotherapies or with B-cell aplasia and immune compromise from currently approved anti-CD19 CAR T-cell therapies [
8,
11].
Patients with lupus are seven times more likely to develop B-cell lymphomas than healthy individuals, suggesting a possible mechanistic relationship between the two diseases [
12]. An important aspect of both lupus and lymphoma progression and of their relationship is immune involvement. The immune system is generally responsible for tolerating self-tissue, for removing non-self threats, including cancers, bacteria, and viruses, and for healing damaged self-tissue upon clearance of non-self threats. Healthy immune function begins and ends with a non-inflamed homeostasis which could be termed ‘immune balance’ [
13,
14]. Immune activation against threats is primarily accomplished via inflammatory signaling (including but not limited to eicosanoids including B- and E-prostaglandins and leukotrienes, vasoactive amines, complement cascade, kinins, and depending on the circumstances, cytokines such as TNFA, IFNG, IL1, IL2, IL6, IL12, IL17, IL18, and IL23) [
15], while tolerance of self-tissue and wound healing/cleanup are accomplished via anti-inflammatory/pro-resolving signaling (including but not limited to nitric oxide, adenosine, cortisol, steroids, eicosanoids including lipoxins, resolvins, and D-prostaglandins, protectins, IL10, and TGFB) [
16]. Conditions such as cancer or autoimmune diseases develop when this immune balance is disrupted by a failure of immune activation (cancer) or a failure of immune tolerance and resolution (autoimmune disease). Intense inflammation associated with immune activation from several causes (infection, allergy, trauma, etc.) can set the stage for autoimmune development [
17,
18]. Failure to remove inflammation-inducing, damage-associated molecular patterns (DAMPs) and/or pathogen-associated molecular patterns (PAMPs) can lead to the loss of self-tolerance [
19]. In particular, pathogen molecular mimicry and the creation of cross-reactive and/or DAMP-targeted antibodies (e.g., anti-DNA) are known mechanisms of autoimmune disease development [
17,
18]. The failure of immune tolerance is integral to autoimmune development. In contrast, the inability of the immune system to effectively identify tumor cells can result in tolerance that enables the growth and progression of cancer [
20]. Cancer develops as tumor cells create subclones and strive to avoid immune surveillance via immune-silencing strategies such as PD1/PDL1 checkpoint signaling [
21,
22]. Cancer development is dependent upon the high immune tolerance of the tumor, which develops via natural selection as non-immune-avoidant cancer cells are killed and successful immune-evading cancer subclones proliferate [
20,
22]. The failure of sustained immune activation is integral to cancer development. Despite this apparent contrast, both lupus and lymphoma depend on immune function going pathologically awry. This similarity may contribute to the co-occurrence of lupus and lymphoma.
Despite the higher-than-expected rate of lupus and lymphoma co-occurrence, to our knowledge, few studies have compared and contrasted lupus and lymphoma gene expression using RNA sequencing. Additionally, there are no datasets in the NCBI Gene Expression Omnibus (GEO) database that include bulk or single-cell RNA sequencing (RNA-seq) samples from patients with both conditions in the same study [
23].
Given the lack of an available dataset containing lupus, lymphoma, and healthy B-cell RNA-seq data, the current study aimed to survey existing gene expression data from publicly available datasets to better understand the mechanistic transcriptional similarities and differences between these two conditions. We then used these RNA-seq results as a use case for our novel Immune Imbalance Transcriptomics (IIT) algorithm. This algorithm aims to provide a reproducible workflow to determine the top gene candidates that best represent the immune imbalance for a consistent tissue type (e.g., B-cells). We anticipate that identifying the shared and contrasting mechanisms underlying these two B-cell-related diseases will reveal the mechanisms that contribute to immune imbalance and thereby support the identification of candidate therapeutics that can be developed or repurposed to target these diseases.
3. Results
We began by identifying the appropriate FASTQ files from relevant studies (e.g., cancer and autoimmune disease samples each compared to healthy controls) that we would process with our IIT algorithm to predict targetable gene products in the relevant tissue (
Figure 1, Supplementary File S3). To do so, we searched the GEO database for all eligible lupus and lymphoma samples, healthy B-cell control samples originating from the lupus and lymphoma studies, and additionally gleaned healthy B-cell controls from other human RNA-sequencing studies (
Table 1, Supplementary File S1). During the screening, we excluded non-human samples, non-primary samples, non-B-cell samples, and samples meeting other criteria that would bias or otherwise affect our results. Our final dataset included 361 human patient samples, with 136 lymphoma samples, 138 lupus samples, and 88 healthy B-cell samples, from 16 studies [
25,
26,
27,
28,
29,
30,
31,
32,
33,
34,
35,
36,
37,
38,
39].
After identifying the FASTQ files for the relevant human samples, the sequencing reads were trimmed, mapped, and quantified against the human reference transcriptome. Transcript-level read counts were aggregated to calculate gene read counts. We applied a z-score normalization to the gene read count table (by gene) to avoid amplifying the impact of genes with high read counts and then ran a principal component analysis (PCA) to observe the relationship between samples and evaluate their similarity (
Figure 2). We found that samples with the same disease status (healthy, lupus, or lymphoma) generally clustered together (except for one lymphoma sample), with some overlap between the phenotypic groups (
Figure 2A). We next colored samples by study of origin and observed that samples from multiple sequencing projects were interspersed within each disease/control group, suggesting a degree of similarity among samples from the same phenotypic groups, with relatively few batch effects among studies or phenotypes (
Figure 2B-D,
Supplementary Figure S1). These files were then assessed for differential gene expression using the Automated Reproducible Modular workflow for preprocessing and differential analysis of RNA-seq data (ARMOR) [
44,
45], making two comparisons: lupus vs. healthy and lymphoma vs. healthy. We found ~12,000 genes that displayed significant differential expression in the lupus vs. healthy analysis (Supplementary File S5), and ~13,000 significant genes in the lymphoma vs. healthy analysis (Supplementary File S6).
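The aggregation and per-gene z-score normalization described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function names, the dictionary-based count tables, and the choice of the population standard deviation are assumptions made for illustration, and the PCA step itself is omitted.

```python
from statistics import mean, pstdev


def aggregate_to_genes(transcript_counts, tx_to_gene):
    """Sum transcript-level read counts into gene-level counts, sample by sample.

    transcript_counts: {transcript_id: [count_sample_1, count_sample_2, ...]}
    tx_to_gene:        {transcript_id: gene_id}  (hypothetical mapping)
    """
    gene_counts = {}
    for tx, counts in transcript_counts.items():
        gene = tx_to_gene[tx]
        running = gene_counts.get(gene, [0.0] * len(counts))
        gene_counts[gene] = [a + b for a, b in zip(running, counts)]
    return gene_counts


def zscore_by_gene(gene_counts):
    """Z-score each gene across samples so highly expressed genes do not dominate."""
    normalized = {}
    for gene, counts in gene_counts.items():
        mu = mean(counts)
        sd = pstdev(counts)  # assumption: population standard deviation
        # A gene with zero variance carries no signal; map it to zeros.
        normalized[gene] = [(c - mu) / sd if sd > 0 else 0.0 for c in counts]
    return normalized


# Toy example with three samples and made-up identifiers.
toy_transcripts = {"tx1": [10, 20, 30], "tx2": [0, 10, 20], "tx3": [5, 5, 5]}
toy_map = {"tx1": "GENE_A", "tx2": "GENE_A", "tx3": "GENE_B"}
toy_z = zscore_by_gene(aggregate_to_genes(toy_transcripts, toy_map))
```

After this step every gene has mean 0 and unit variance across samples, so the principal components reflect relative differences between samples rather than raw expression magnitude.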
After computing the DEGs for the two comparisons (lupus vs. healthy and lymphoma vs. healthy), we then applied the IIT algorithm to find genes that have statistically significant differential expression in both comparisons. Briefly, the IIT algorithm includes the following steps. First, for each of the input DEG sets, the log2 fold-change values (log2FCs) are multiplied by their corresponding -log10 FDR-adjusted p-values to weight the expression of each gene by its statistical significance. Next, the weighted scores for each comparison are z-score normalized prior to taking the square root of the squared normalized value, which is equivalent to its absolute value (to eliminate negative values and reduce outliers). These intermediate scores, which still include genes with opposite directions of expression, are then summed and normalized using the smaller ratio of either the intermediate score of the lupus vs. healthy comparison or the intermediate score of the lymphoma vs. healthy comparison to generate a final immune imbalance score. To determine the significance of our results and minimize potential batch effects, we calculated a null distribution for these final immune imbalance scores by performing 1,000 permutations of the original DEG analysis (Supplementary Files S7-S10), randomly assigning samples to disease groups in each permutation and calculating IIT values for all genes, resulting in 18,468,000 permuted IIT final scores (Supplementary File S11). This null distribution helps control for batch effects by assigning statistical significance only to IIT scores whose signal is strong enough to rise above the statistical noise. We then z-score normalized the IIT null distribution to assign significant p-values (p < 0.05) to the highest IIT scores on the upper tail of the null distribution, where the most intensely immune-imbalanced results should be located. We then performed multiple hypothesis testing correction using the stringent Bonferroni method. Following the assignment of IIT score p-values based on the null distribution, we conducted one additional test to remove noise. Using the ~18 million data points in the null distribution, we tallied how many times each gene was considered IIT significant across the 1,000 data permutations in order to remove genes with abundant frequency counts from the IIT-significant portion of the null distribution.
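The scoring steps above can be sketched in Python as follows. This is an illustrative reading of the description, not the published IIT implementation. In particular, the final normalization is only described briefly in the text, so the sketch assumes one plausible interpretation: the summed score is multiplied by the ratio of the smaller to the larger intermediate score, so that genes strongly perturbed in both comparisons rank highest. The function names and the FDR floor are also assumptions.

```python
import math
from statistics import mean, pstdev

FDR_FLOOR = 1e-300  # assumption: guards against log10(0) for underflowed FDR values


def weighted_scores(degs):
    """degs maps gene -> (log2FC, FDR); returns log2FC * -log10(FDR) per gene."""
    return {
        gene: lfc * -math.log10(max(fdr, FDR_FLOOR))
        for gene, (lfc, fdr) in degs.items()
    }


def abs_zscores(scores):
    """Z-score the weighted scores, then take sqrt(z**2), which equals |z|."""
    values = list(scores.values())
    mu, sd = mean(values), pstdev(values)
    return {
        gene: math.sqrt(((s - mu) / sd) ** 2) if sd > 0 else 0.0
        for gene, s in scores.items()
    }


def iit_scores(degs_autoimmune, degs_cancer):
    """Combine two DEG sets into one score per gene shared by both comparisons."""
    a = abs_zscores(weighted_scores(degs_autoimmune))
    c = abs_zscores(weighted_scores(degs_cancer))
    final = {}
    for gene in a.keys() & c.keys():
        low, high = sorted((a[gene], c[gene]))
        # ASSUMED normalization: smaller/larger ratio of the intermediate scores.
        ratio = low / high if high > 0 else 0.0
        final[gene] = (a[gene] + c[gene]) * ratio
    return final


# Toy input: gene -> (log2FC, FDR-adjusted p-value); values are made up.
toy_lupus = {"G1": (2.0, 1e-4), "G2": (-1.0, 1e-2), "G3": (0.5, 0.1)}
toy_lymphoma = {"G1": (3.0, 1e-6), "G2": (1.0, 1e-2), "G3": (-0.2, 0.5)}
toy_iit = iit_scores(toy_lupus, toy_lymphoma)
```

Because the absolute value is taken before summing, a gene that moves in opposite directions in the two comparisons (such as G2 in the toy input) still receives a positive score, which is what allows both shared and contrasting genes to be ranked together.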
When visualizing frequency counts for all genes in the IIT-significant portion of the null distribution, we found that the average gene frequency count was 205, meaning that the average gene garnered a significant IIT score in 205 permutations (
Figure 3A). We performed z-score normalization on the gene frequency counts, then calculated the p-value indicating which genes occurred at significantly abundant frequency counts and therefore had significantly high noise. We removed the genes above this frequency gate from further analysis to prevent noise from batch effects.
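The significance testing and frequency gate can be sketched as below. This is a minimal illustration under stated assumptions rather than the study's code: it assumes the z-scored null distribution is converted to a one-sided p-value with a normal approximation, and that the frequency gate flags genes whose permutation hit count lies in the upper tail at the same threshold. All function names are hypothetical.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def normal_upper_tail(z):
    """One-sided P(Z >= z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def upper_tail_p(value, null_values):
    """P-value of an observed IIT score against the permuted null scores."""
    mu, sd = mean(null_values), pstdev(null_values)
    return normal_upper_tail((value - mu) / sd)


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_value * n_tests)


def frequency_gate(significant_hits, alpha=0.05):
    """Return genes that are IIT-significant in unusually many permutations.

    significant_hits: one gene name per (gene, permutation) pair in which the
    gene landed in the significant portion of the null distribution.
    """
    counts = Counter(significant_hits)
    values = list(counts.values())
    mu, sd = mean(values), pstdev(values)
    if sd == 0:
        return set()  # every gene appears equally often; nothing stands out
    return {
        gene for gene, n in counts.items()
        if normal_upper_tail((n - mu) / sd) < alpha
    }


# Toy example: one gene is significant in 50 permutations, twenty genes in one each.
toy_hits = ["NOISY"] * 50 + [f"G{i}" for i in range(20)]
toy_removed = frequency_gate(toy_hits)
```

In the toy example only the frequently recurring gene is flagged for removal, mirroring the idea that a gene reaching significance in many random permutations is more likely to reflect noise than biology.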
We then assessed the DEGs from the lupus vs. healthy and lymphoma vs. healthy differential expression analyses using the Immune Imbalance Transcriptomics (IIT) algorithm. Out of the initial overlapping significantly differentially expressed pool of 8,562 genes, we observed that 5,507 had statistically significant immune imbalance (Bonferroni p-value < 0.05; Supplementary File S12). Following application of the frequency gate as described above, 5,335 significant genes were left for consideration (Supplementary File S13). When we visualized overlapping gene results from the lupus and lymphoma differential gene comparisons using log2FCs as the axis scales and one scatter-point per gene, we observed that the overlapping genes fall into four quadrants, with the significantly immune-imbalanced genes found by IIT localized to the outer edges (representing higher log2FCs) and non-significant IIT genes near the center (representing lower log2FCs), with genes filtered out by the frequency gate scattered throughout (
Figure 3B).
Functionally, these gene profiles can be categorized into four quadrants. For clarity, we label these quadrants as “cancer” (lymphoma in this use-case) and “autoimmune disease” (lupus in this use-case), knowing that this algorithm can be applied to different cancer-autoimmune pairings. The four quadrants are as follows: upregulated in both lymphoma (Cancer) and lupus (Autoimmune disease; Quadrant I, C+A+), downregulated in both lymphoma and lupus (Quadrant III, C-A-), upregulated in lymphoma but downregulated in lupus (Quadrant II, C+A-), and downregulated in lymphoma but upregulated in lupus as compared to healthy (Quadrant IV, C-A+). The top four immune-imbalanced genes from each quadrant are represented in
Table 2. Among the top four IIT genes per quadrant are several gene products that have previously been shown in other studies to be involved in lupus and/or lymphoma, including C1QB (C+A+ IIT quadrant) [
24,
68,
69,
70], IFI27 (C+A+ IIT quadrant) [
71,
72,
73,
74], and TSC22D3 (C-A+ IIT quadrant) [
75].
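The quadrant labels defined above follow directly from the signs of the two log2FCs. A minimal Python sketch is given here; the function name and the handling of a zero fold-change are assumptions made for illustration.

```python
def assign_quadrant(log2fc_cancer, log2fc_autoimmune):
    """Label a gene by its direction of change in cancer (C) and autoimmune disease (A).

    Quadrant numbering follows the text: I = C+A+, II = C+A-, III = C-A-, IV = C-A+.
    Returns None when either fold-change is exactly zero (assumption: such genes
    are not differentially expressed and fall on an axis).
    """
    if log2fc_cancer == 0 or log2fc_autoimmune == 0:
        return None
    c = "C+" if log2fc_cancer > 0 else "C-"
    a = "A+" if log2fc_autoimmune > 0 else "A-"
    return c + a


# Toy example with made-up fold-changes (lymphoma first, lupus second).
toy_labels = {
    "up_in_both": assign_quadrant(2.1, 1.4),        # Quadrant I
    "up_in_lymphoma": assign_quadrant(1.8, -0.9),   # Quadrant II
    "down_in_both": assign_quadrant(-1.2, -2.0),    # Quadrant III
    "up_in_lupus": assign_quadrant(-0.7, 1.1),      # Quadrant IV
}
```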
To identify the relevant functional terms and signaling cascades in the significant IIT results, we ran separate Gene Ontology (GO) and intracellular pathway hypergeometric enrichment analyses on the significant IIT genes using the EnrichR package [
54]. We retrieved all GO terms individually for significant IIT genes, compiled them for easy reference (Supplementary File S14), and utilized a text search to determine how many GO term descriptions mentioned “immune” or “immunity” per gene. We ran GO term enrichments on the genes in each quadrant (
Figure 3A-C, Supplementary File S15), and an additional hypergeometric enrichment on the entire significant IIT gene set (
Table 3, Supplementary File S15). We were somewhat surprised to observe relatively few overlapping terms between quadrants in these GO enrichment results (
Figure 3A-C), suggesting that genes in different quadrants potentially play unique functional roles. Interestingly, we also found many quadrant-specific results overlapping with the GO enrichment terms obtained from enriching all significant IIT genes together (
Figure 3D). Most quadrant-specific GO enrichments contained more results than the total IIT gene set analysis, potentially suggesting stronger signal and lower noise in quadrant-specific genes (
Figure 3D).
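For readers unfamiliar with hypergeometric enrichment, the underlying test can be written in a few lines of Python. This sketch shows the general statistic only; it does not reproduce EnrichR's implementation or its multiple-testing adjustments, and the function and parameter names are illustrative.

```python
from math import comb


def hypergeom_upper_tail(overlap, term_size, query_size, background_size):
    """P(X >= overlap) when drawing query_size genes from a background that
    contains term_size genes annotated to the term of interest."""
    total = comb(background_size, query_size)
    max_overlap = min(term_size, query_size)
    favorable = sum(
        comb(term_size, i) * comb(background_size - term_size, query_size - i)
        for i in range(overlap, max_overlap + 1)
    )
    return favorable / total


# Toy example: 3 of 3 query genes hit a 3-gene term in a 10-gene background.
toy_p = hypergeom_upper_tail(3, 3, 3, 10)  # equals 1 / comb(10, 3)
```

A small p-value indicates that the query gene list overlaps the term more than expected by chance given the background size, which is the sense in which a term is called significantly enriched.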
We found 659 significantly enriched signaling pathways from the four databases used in the analysis (
Table 4, Supplementary File S16,
Supplementary Figure S2). We noted four pathways with immune-related function among the top 10 significantly enriched pathways: “Neutrophil Degranulation,” “Immune System,” “Adaptive Immune System,” and “Innate Immune System.” The presence of these immune-related terms among our top enriched pathways supports our hypothesis that the IIT algorithm can isolate genes relevant to immune function. It also introduces the possibility that other significant IIT genes with no currently known involvement in immune effector function or immune signaling may play uncharacterized roles in the immune function of B-cells.
We also observed multiple pathways relating to cell cycle regulation and cell division in our top ten enriched pathway results, including control of the cell cycle by Rho GTPases, such as “Signaling By Rho GTPases” and “Signaling By Rho GTPases, Miro GTPases And RHOBTB3”, as well as pathways involved in increasing transcription and translation, such as “Eukaryotic Translation Elongation” and “Peptide Chain Elongation”, suggesting heightened activity that could be attributed to cell division. These findings increase our confidence in these results, given the well-known high rate of cell division in lupus auto-reactive B-cells and the aggressive division in B-cell lymphoma. This serves as an internal control indicating that the IIT algorithm finds genes that are biologically relevant to B-cell function in the context of lupus and lymphoma pathology.
Following the pathway enrichment analysis, we analyzed the list of significant immune-imbalanced genes to identify therapeutic targets for potential drug repurposing. Using the OpenTargets database [
67] and an adaptation of the Pathway2Targets algorithm [
76], we predicted targets that could be utilized against lupus and lymphoma. A promising outcome of our target prediction analysis is that we found 389 genes (out of 5,335 significant immune-imbalanced genes) with known drugs that are either approved or in development for at least one indication (
Table 5, Supplementary Files S17-S18). Lupus currently has 172 targets with known drugs, 40 of which are significantly immune-imbalanced in this study. Lymphoma currently has 437 targets with drugs in testing, and 151 of these targets appear on our significantly immune-imbalanced list. The presence of these known lupus and lymphoma drug targets among our significant immune-imbalanced results lends further support to the biological relevance of the IIT gene results to lupus and lymphoma treatment. Additionally, we found 349 novel targets for lupus and 238 novel targets for lymphoma using IIT, which may be useful in the future.
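The novel-target counts are consistent with simple set arithmetic: the IIT genes with known drugs, minus those already targeted in the disease of interest. The Python sketch below illustrates that reading with made-up gene names; it is an assumption about how the counts relate, not a description of the Pathway2Targets adaptation.

```python
def split_targets(iit_genes_with_drugs, known_disease_targets):
    """Separate druggable IIT genes into known and novel targets for one disease."""
    known = iit_genes_with_drugs & known_disease_targets
    novel = iit_genes_with_drugs - known_disease_targets
    return known, novel


# Toy example with placeholder gene names.
toy_iit = {"GENE1", "GENE2", "GENE3", "GENE4"}
toy_known_lupus_targets = {"GENE2", "GENE9"}
toy_known, toy_novel = split_targets(toy_iit, toy_known_lupus_targets)

# Counts reported in the text: 389 druggable IIT genes in total.
novel_lupus = 389 - 40      # 40 are existing lupus targets
novel_lymphoma = 389 - 151  # 151 are existing lymphoma targets
```

Under this reading, the reported figures of 349 and 238 novel targets follow directly from the 389 druggable IIT genes.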
4. Discussion
In this work, we identified ~13,000 genes differentially expressed between lymphoma and healthy B-cells and ~12,000 genes between lupus B-cells and healthy B-cells, and the top results are biologically relevant to the pathology of these B-cell diseases. We used the DEG lists from these two comparisons as a use case for our novel Immune Imbalance Transcriptomics (IIT) algorithm, demonstrating that the 5,335 significant gene products identified through its null distribution yield robust and biologically relevant mechanisms in lupus and lymphoma, as well as biologically relevant enrichment results and potential therapeutic targets. Previous bioinformatic approaches have investigated the shared transcriptomic mechanisms of disease between an autoimmune disease and a cancer from the same tissue type [
77], but have not accounted for potential contrasting mechanisms, a novel attribute of IIT.
More and more evidence shows that cancer and autoimmune diseases can have a causal or correlative relationship, with the development of one creating a greater risk for the development of the other in the same patient [
19,
77,
78,
79,
80,
81,
82,
83,
84,
85]. The rise in co-occurrence risk suggests that the two diseases likely share some underlying causes and common mechanisms, such as chronic inflammation [
85]. Though inflammation is classically activated by infection or injury, other causes exist, including underlying tissue malfunction, which induces para-inflammation as the body attempts to return to homeostasis [
86]. Para-inflammation has previously been implicated as the cause of chronic inflammatory conditions, including cancer and autoimmune disease [
86]. The rising occurrence of chronic inflammatory diseases and their co-incidence justifies a new research approach: finding, characterizing, and modifying the shared underlying disease pathologic mechanisms.
Among novel disease target candidates for antagonists in both lupus and lymphoma patient treatments are Complement C1qB Chain (C1QB, C+A+ quadrant), Interferon Alpha Inducible Protein 27 (IFI27, C+A+ quadrant), Insulin Like Growth Factor Binding Protein 2 (IGFBP2, C+A+ quadrant), and Transcobalamin 2 (TCN2, C+A+ quadrant). Each is a good candidate for knock-down treatment in pathogenic lupus and lymphoma cells due to its known involvement in lupus and lymphoma progression via overexpression, although care should be taken to minimize adverse events with these potential targets. Increased C1QB expression decreases the effectiveness of combined ibrutinib and venetoclax in patients with mantle cell lymphoma (a B-cell lymphoma) and is associated with a worse prognosis [
68]. High IFI27 expression has been shown to drive cell proliferation and to spark B-cell lymphomagenesis [
72] and is a biomarker of B-cell cancer progression [
73]. IFI27 is part of the Interferon-I signature assessment used to determine SLE subtype during the process of lupus diagnosis [
87] and is a known SLE biomarker [
74]. Additionally, targeting and neutralizing IGFBP2 (C+A+ quadrant), a lupus nephritis biomarker, shows potential as a new option for treating patients with lupus nephritis, an advanced complication of SLE that affects the kidneys and has few workable treatments [
88,
89]. TCN2 may be a predictive biomarker in primary large B-cell lymphoma of immune-privileged sites, making it a good early target to prevent metastasis [
90].
A subset of potential agonistic targets could benefit patients due to their significant downregulation in both lupus and lymphoma B-cells. However, very little is known about many of these significant IIT genes (located in Quadrant III) in the context of lupus and lymphoma B-cells. OTUD1 has been shown to repress type-I interferon-mediated disease, which includes lupus, in vivo [
91] and to have loss-of-function in SLE [
92], suggesting the promise of treatments that increase the activity of OTUD1 in lupus B-cells. Transmembrane Anterior Posterior Transformation 1 (TAPT1, C-A- quadrant) and Proline Rich Nuclear Receptor Coactivator 1 (PNRC1, C-A- quadrant) are novel results which, as underlying mechanistic genes shared by lupus and lymphoma, could be leveraged as possible therapeutic targets.
Perhaps more interesting is the potential application of contrasting immune imbalance mechanisms. It is well-known that tumors can be treated by inducing cancer-specific autoimmunity [
93,
94], suggesting that, to some degree, cancers and autoimmune diseases can be regulated by contrasting mechanisms. This contrast is seen by the apparent opposite utilization of the two complementary aspects of the immune system: immune activation (autoimmunity) and immune tolerance (cancer) [
13,
14]. Appropriate immune homeostasis, which is characterized by proper immune activation and subsequent resolution, is not present in either autoimmune disease (failure to tolerate self and resolve inflammation) or cancer (failure to activate sufficiently to detect and clear non-self tumor) [
15,
16]. Some of the most apparent examples of the “opposite mechanisms” at play in cancer and autoimmune disease involve immunotherapies (steroids, T-cell therapies, checkpoint inhibitors, other monoclonal antibody therapies, oncolytic viruses, etc.) [
77,
95]. The growing utilization of immunotherapies in cancer has revealed a fascinating conundrum: the rare occurrence of immune-related adverse events, specifically, loss of self-tolerance leading to autoimmunity [
77,
95]. The occurrence of cancer immunotherapy-related adverse immune events, such as cytokine storms (and other immune-related inflammatory phenomena) or induction of tissue-specific or systemic autoimmunity, points to a new and largely untouched area of research: the relationship between cancer and autoimmunity and what factors determine whether adverse immune events develop in a patient treated with immunotherapy [
96]. Conversely, some patients do not respond strongly enough to cancer immunotherapies to clear or inhibit tumor growth [
97]. In autoimmunity, anti-inflammatory immunotherapies can potently improve patient quality of life, but increase the risk of infection. The immunotherapeutic silencing of immune activation can also contribute to the development of cancer in autoimmune patients who are treated with anti-inflammatory drugs for long periods [
7]. Though successful immunotherapies and drugs affecting immune function are seminal achievements of substantial research efforts in the fields of cancer and autoimmunity, the failure of some patients to return to healthy immune homeostasis demonstrates that there is still room for improvement.
A potential alternative to the current immunotherapeutic treatments is harnessing the power of contrasting immune imbalance mechanisms, such as genes described in this study, to pull patients toward a healthy center. A gene upregulated in lupus B-cells may provide clues about how to modulate the immune microenvironment of B-cell lymphomas; likewise, an immuno-modulatory gene upregulated by B-cell lymphoma could possess the power to dampen the rampant inflammation of lupus. Many experimental methods for in vivo activation or inhibition of gene expression are currently under study and consideration, including gene delivery by engineered viruses [
98] and CRISPR methods (e.g., CRISPRa, CRISPRi, etc.) in pathogenic cells to increase or decrease gene expression [
99]. A current limitation in the field of in vivo gene editing is the difficulty of targeting the right cells without off-target effects. However, we anticipate that with further development, such approaches could enable in vivo gene therapy to treat immune-related diseases including cancer and autoimmunity using immune-imbalanced targets. Examples of significantly immune-imbalanced genes already in use as drug targets for both lupus and lymphoma are HRH1 (Quad I), JAK1 (Quad III), and PTGS1 (Quad IV).
Several immune imbalance genes in this analysis are good candidates for therapeutic agonists. Decreased expression of TPM2 (C+A- quadrant) in childhood-onset SLE contributes to musculoskeletal system symptoms [
100], suggesting potential benefits of an agonist. TSC22 Domain Family Member 3 (TSC22D3, C-A+ quadrant) is a proapoptotic gene in some cell types and is positively regulated by Estrogen-Related Receptor-Beta (ESRRB) [
75]. Previous studies have shown that ESRRB repression in acute lymphoblastic leukemias, which are closely related to B-cell lymphomas, contributes to treatment resistance. This suggests that activating TSC22D3 would improve the success of malignant B-cell treatment [
75]. KLF6 (C-A+ quadrant) plays a key part in recruiting macrophages to the site of inflammation and in the subsequent success of the immune response, which is lacking in cancers, including B-cell lymphomas [
101]. These promising targets for gene-upregulation treatment present new possibilities for patients.
Additionally, many of the top immune-imbalanced genes could be potential targets for antagonists. Clinical analyses showed that tumors expressing increased levels of NFE2-Like bZIP Transcription Factor 3 (NFE2L3, C+A- quadrant) indicated a worse prognosis [
102], making it an ideal candidate for future knockdown studies. It is thought that upregulation of TSC22 Domain Family Member 3 (TSC22D3, C-A+ quadrant) and glucocorticoid-induced leucine zipper (GILZ; encoded in the same locus) work with interferon-1 to induce glucocorticoid resistance in SLE patients, dampening their treatment response [
103]. Protein Phosphatase 1 Regulatory Subunit 15A (PPP1R15A, C-A+ quadrant) is also upregulated by SLE and assists with the production of pro-inflammatory cytokines such as interferons, which contribute particularly to the pathogenesis of interferon-high cases of lupus [
104]. Treatment-induced antagonism or knockdown of the gene products in these cases could also improve patient outcomes. Parathymosin (PTMS, C+A- quadrant), Plexin A1 (PLXNA1, C+A- quadrant), and ADP Ribosylation Factor Like GTPase 4A (ARL4A, C-A+ quadrant) are novel immune imbalance results with no known mechanisms and with promising potential to change pathogenic properties of lupus and lymphoma B-cells, which justifies future experimentation.
While our IIT results can be useful, it is important to explicitly recognize at least some of the inherent limitations of this study. First and foremost, we recognize that not all types of B-cell lymphoma were studied since our sample pool was limited to publicly available datasets on the Gene Expression Omnibus (GEO). Because diffuse large B-cell lymphoma is one of the most common and most aggressive types of B-cell Non-Hodgkin’s lymphoma, the majority of lymphoma samples included were from this subtype. As such, we expect that these results will be especially applicable to diffuse large B-cell lymphoma and of more limited relevance to the other lymphoma subtypes, which contributed fewer samples.
Another potential limitation arises from including multiple sequencing projects in the same differential gene expression analysis. As no single study contained the representative samples we needed, we combined multiple studies and indirectly accounted for batch effects within the algorithm itself. Gene-specific batch effects, which would boost or lower read counts for a specific gene, are particularly important. We adjusted for these using two strategies: 1) generating a null distribution and calculating a rigorous p-value from the 18,486,468 data points included in the null distribution and 2) analyzing the frequency of each gene appearing in the significant portion of the null distribution and removing those that appeared the most. We generated the null distribution by making 1,000 permutations of the initial dataset, randomly shuffling sample phenotypes before the analysis, then deriving the final IIT scores for all genes across all permutations and including these with the original gene results. This null distribution can drown out noise by providing ~18.5 million additional data points, making a p-value cutoff much more rigorous. Results with higher statistical noise and lower IIT scores (and presumably less biological relevance) become insignificant when using the null distribution p-value as a cutoff. Additionally, we applied a frequency gate to control for genes with high statistical noise that could be attributed to batch effects. The frequency gate removed genes that appeared at high frequency within the null distribution’s significant results. The rigor of this null-distribution and frequency-gate approach increases our confidence in the significant IIT results.
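The label-shuffling step and the size of the resulting null distribution can be illustrated with a short Python sketch. The function names and the fixed random seed are assumptions for illustration. The figure of 18,468 genes per analysis is inferred from the reported totals (18,468,000 permuted scores over 1,000 permutations) rather than stated directly in the text.

```python
import random


def permute_labels(labels, n_permutations, seed=0):
    """Yield shuffled copies of the sample phenotype labels.

    Group sizes are preserved; only the assignment of labels to samples changes.
    """
    rng = random.Random(seed)  # assumption: a fixed seed, for reproducibility
    for _ in range(n_permutations):
        shuffled = list(labels)
        rng.shuffle(shuffled)
        yield shuffled


def null_distribution_size(n_genes, n_permutations, include_original=True):
    """Number of IIT scores in the null distribution."""
    return n_genes * n_permutations + (n_genes if include_original else 0)


# Toy example: five samples, three permutations.
toy_labels = ["lupus", "lupus", "lymphoma", "healthy", "healthy"]
toy_permutations = list(permute_labels(toy_labels, 3))

# Inferred gene count reproduces both totals reported in the text.
permuted_only = null_distribution_size(18468, 1000, include_original=False)
with_original = null_distribution_size(18468, 1000)
```

Here `permuted_only` evaluates to 18,468,000 and `with_original` to 18,486,468, matching the two totals given for the permuted scores alone and for the null distribution including the original gene results.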
Our novel algorithm, Immune Imbalance Transcriptomics (IIT), identifies the shared and unique transcriptional mechanisms between cancer and autoimmunity in the same tissue (or cell) type. The biological use case for this algorithm, comparing lupus and lymphoma B-cells, is an important step forward and suggests that an algorithmic approach may yield additional mechanistic insights and continue to advance the state of diagnostics and treatment for cancer and autoimmune patients alike. The IIT approach allows for a more robust consideration of the effects of immune-related gene products when identifying therapeutic targets that can help to return the lupus and lymphoma phenotypes to a more immune-balanced, non-inflamed homeostasis. We anticipate that the identified mechanisms and targets will aid future efforts to develop improved therapeutics for both lupus and lymphoma. As one of the first reproducible methods for comparing cancer and autoimmune transcriptional effects, our IIT algorithm potentially provides a novel approach to better understanding the mechanisms of lupus and lymphoma and how to target these mechanisms without upsetting the immune balance. We expect that this approach can improve the quality of life for patients by providing effective treatment options with fewer and less severe side effects.
Author Contributions
Conceptualization, Naomi Rapier-Sharman and Brett E. Pickett; Data curation, Naomi Rapier-Sharman, Sehi Kim, Maddie Mudrow and Michael T. Told; Formal analysis, Naomi Rapier-Sharman, Sehi Kim, Maddie Mudrow and Michael T. Told; Investigation, Naomi Rapier-Sharman, Sehi Kim, Maddie Mudrow and Michael T. Told; Methodology, Naomi Rapier-Sharman, Lane Fischer, Liesl Fawson, Stephen R. Piccolo and Brett E. Pickett; Project administration, Brett E. Pickett; Resources, Kim L. O’Neill; Software, Naomi Rapier-Sharman, Stephen R. Piccolo and Brett E. Pickett; Supervision, Brett E. Pickett; Validation, Naomi Rapier-Sharman, Sehi Kim, Maddie Mudrow, Stephen R. Piccolo and Brett E. Pickett; Visualization, Naomi Rapier-Sharman; Writing – original draft, Naomi Rapier-Sharman, Sehi Kim, Maddie Mudrow, Joseph Parry and Brett E. Pickett; Writing – review & editing, Naomi Rapier-Sharman, Sehi Kim, Maddie Mudrow, Joseph Parry, Brian D. Poole, Kim L. O’Neill, Stephen R. Piccolo and Brett E. Pickett. All authors have read and agreed to the published version of the manuscript.