3.1. Correlation of GPX2 Levels in Tumors and Cell Lines
To ascribe importance to antioxidant enzymes that reduce ROOH, it is necessary to demonstrate that an enzyme is present in a significant quantity (
Figure 1) [
1,
2]. The initial characterization of
GPX2 to be highly expressed in the gastrointestinal (GI) tract and undetectable in many other tissues by Northern blotting is still valid (
Figure 2A) [
1,
2,
19]. Based on results compiled by the TIMER2.0 database, in general agreement with TCGA (ref. [
1]),
GPX2 is highly expressed at the tissue level in the mid-lower GI (corresponding cancer, TCGA abbr.; colon/rectum, COAD/READ; 457 COAD samples vs. 41 normal and 166 READ samples vs. 10 normal in
Figure 2A; 5/2024), bladder (BLCA; 408 vs. 19 normal), esophagus (ESCA; 189 vs. 11 normal), head and neck tissues (HNSC; 520 vs. 44 normal), liver (LIHC; 371 vs. 50 normal), stomach (STAD; 415 vs. 35 normal), in cancerous pancreas (PAAD; 178 vs. 4 normal; median for normal is several logs higher than other database values; ref. [
1]) and lung squamous cell carcinoma (LUSC; 501 vs. 51 normal), although at a lower level in lung adenocarcinoma (LUAD; 515 vs. 59 normal) (
Figure 2A) [
1]. High expression in normal tissues is often confined to a few cell types. The small intestine is a prime example of the limited zone of expression within a high expressing tissue, apparently just the easily recognizable Paneth cells (
Figure 3; THPA). This creates one of the problems in analysing
GPX2 using tissue level metrics from databases as will be elaborated and is not unique to the GI-tract (
Figure 2 and
Figure 3) [
1].
Cell line
GPX2 gene expression levels were compiled on the basis of tissue and cancer of origin (DepMap; 5/2024). As shown in
Figure 2B, cell lines derived from COAD/READ as well as from ESCA/STAD have a high median level of
GPX2 gene expression, while many cell lines from HNSC and PAAD also have somewhat elevated
GPX2 expression, based on median levels. There is some correlation between the respective median tumor levels and median cell line levels (
Figure 2E). The low
GPX2 levels in cell lines derived from the high expressing cancer sources, COAD/READ and ESCA/STAD, are somewhat unique among the antioxidant enzyme genes,
GPX1 and
PRDX1-6. A few cell lines have lower than expected
GPX1 and
PRDX levels, but nothing like
GPX2 (
GPX2, 59/128 low expressing lines (≤64 TPM; ref. [
1]);
GPX1, 13/128;
PRDX1, 0/128;
PRDX2, 6/128;
PRDX3, 3/128;
PRDX4, 6/128;
PRDX5, 0/128;
PRDX6, 2/128;
Figure 4) [
1]. The low expression of
GPX2 in cell lines could be anticipated for BLCA, HNSC, LIHC, and PAAD from the variation observed in tumors but is not explained by variation among COAD/READ and ESCA/STAD samples at the tissue level (
Figure 2A).
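The tally of low expressing lines given above (for example, GPX2 at 59/128 with a cutoff of ≤64 TPM) amounts to a threshold count per gene. A minimal Python sketch of that count follows. The cell line names and TPM values are hypothetical placeholders, not DepMap data, and the function name is an assumption introduced only for illustration.

```python
# Illustrative sketch only: toy TPM values, NOT taken from DepMap.
LOW_TPM_CUTOFF = 64  # "low expressing" threshold used in the text (<=64 TPM)


def count_low_expressers(tpm_by_line, cutoff=LOW_TPM_CUTOFF):
    """Return (n_low, n_total) for one gene across a set of cell lines."""
    values = list(tpm_by_line.values())
    n_low = sum(1 for v in values if v <= cutoff)
    return n_low, len(values)


# Hypothetical cell lines and values, chosen only to exercise the function.
toy_expression = {
    "GPX2": {"LINE_A": 1010.0, "LINE_B": 4.4, "LINE_C": 588.0, "LINE_D": 0.9},
    "GPX1": {"LINE_A": 310.0, "LINE_B": 95.0, "LINE_C": 48.0, "LINE_D": 220.0},
}

summary = {gene: count_low_expressers(lines) for gene, lines in toy_expression.items()}
for gene, (n_low, n_total) in summary.items():
    print(f"{gene}: {n_low}/{n_total} low expressing lines (<= {LOW_TPM_CUTOFF} TPM)")
```

Applied to a real export, the same function would be run once per antioxidant gene over the GI tract-derived lines, giving counts in the same n/128 form used in the text.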
For a variety of reasons, the relative TPM levels among tissues and cell lines presented represent only a rough guide to thinking about where GPX2 might have a significant role (
Figure 2 and
Figure 4). DepMap TPM results have some predictive value for GPX activity in cell lines (
Figure 4 E; selenium-supplemented culture media for cell line activity) [
1]. GPX activity at and above 190 mU/mg is in the range of normal tissue and cancer sample levels [
1]. A word of caution is warranted with regard to the
GPX TPM data of DepMap and THPA and for work in almost all current papers [
1]. The two database projects and almost all current published work did not include supplementation of the culture media with selenium (10% FBS/FCS; fetal bovine/calf serum; some rare exceptions in DepMap in combination with low serum for culture) [
20]. As a rule, this is suboptimal for GPX1 and GPX2 protein and activity levels in cell lines [
1,
2,
21]. Different batches of serum can have widely different selenium levels, and this will have an impact on the protein and activity levels of both GPX1 and GPX2 and possibly on the mRNA levels of
GPX1 [
21,
22,
23].
GPX1 mRNA levels may be underrepresented in the databases and in studies [
1]. Selenium supplementation could conceivably shift up to 8 of the 13 low expressing
GPX1 level GI tract-derived cell lines into the high expression range (2-fold increase; see HepG2 in ref. [
22]). In studies, the protein and GPX activities will often be up to one-half to one-fourth the optimal levels and can be even less, the lower activity a feature of cell lines expressing only
GPX1 [
22,
23]. This could impact the reproducibility of findings and downplay the role of GPX1 in favor of GPX2 [
1].
3.3. Remaining Uncertainties in GPX2 Expression in Normal Colon/Rectum
The reasons for lack of a clear answer to the choice of cell lines include no direct link between the cell types that naturally express
GPX2 at high levels and its expression in tumors. Despite decades of characterization, the exact cell expression profile in the colon is unclear [
29]. With studies using single-cell profiling of normal colon and CRC available, it should be a simple matter to establish the identity of the normal high
GPX2 expressing cells and possibly infer
GPX2 expression levels for sources of CRC cells, this based on data from polyps representing an early point in the malignancy continuum [
15,
32]. Variable co-expression profiles between
GPX2 and
NOX1 might aid in this process (
Figure 2 and
Figure 5).
The possible candidates for high
GPX2 expression are Paneth cells (
Figure 3; mice develop tumors in the small intestine) or the colon equivalent, deep secretory cells (so far, identified in mice and not documented for
GPX2 expression) [
33]. Refined localization of GPX2 expressing cells in human colon by IHC shows that they represent only a few cells at the base of each gland, paralleling small intestine (
Figure 3) [
29]. A single-cell analysis did confirm high
GPX2 expression in Paneth cells, while being somewhat unclear about other cell types, particularly in colon [
34]. Paneth cells are not known to give rise to tumors, although they exhibit plasticity during loss of LGR5+ crypt cells that allows them to gain stem cell properties and repopulate crypts [
35,
36,
37,
38]. Whether they retained
GPX2 expression in this process was not determined. It is this type of plasticity exhibited across many cell types in the colon during carcinogenesis that lends itself to uncertainty over whether
GPX2 is being up regulated and to what extent [
36]. Up regulation would be the default when any cell types outside of the lower crypt/glands are the source of the tumors (
Figure 3) [
39]. There is some evidence that up regulation of
GPX2 mRNA levels on the order of 4-5-fold in Stem-like cells occurs during the progress from early lesions to CRC [
32]. Similarly, looking at whatever cells are expressing GPX2 protein at high levels in normal colon suggests possible elevation up to 5-fold in protein levels in CRC cells (statistically significant) [
29]. However, that determination did not establish any link between the normal cells and the cell types that gave rise to the tumors. If the candidate cells are not the naturally high
GPX2 expressing cells, the magnitude of elevation could be tens-hundred-fold, providing a strong rationale for GPX2 impacting tumorigenesis by modulation of ROOH levels. There is ample evidence that an alteration of
GPX2 levels from essentially no expression to levels characteristic of most CRC samples (~27% of total antioxidant enzyme TPM; GPXs, PRDXs and catalase) would impact tumor pathways via known redox sensitive proteins, such as PTEN (
Figure 1A and
Figure 4D) [
1,
40,
41].
3.4. The Problems with Databases Combined with the Issues of Normal Cell Expression
Note that the up regulation of
GPX2 expression described above is not based on data from sites like TCGA/THPA, TIMER2.0 and GEPIA2. GEPIA2 data present real problems with exceptionally low
GPX2 expression found in many normal samples (
Figure 6). This extends to several other genes whose expression is confined to the mucosa as opposed to those expressed in the mucosa and muscularis, like villin vs. β-actin (
Figure 6).
TCGA/THPA, TIMER2.0 and GEPIA2 tumor databases are consistent in showing CRC tumors as having high
GPX2 expression, with only a few exceptions (28/597 CRC samples with TPM <256, THPA; TIMER2.0 has ~18/623;
Figure 2A; GEPIA2 ~12/397;
Figure 6). The issue with normal samples in GEPIA2 might stem from the divergent protocols for processing between the 2 sampling sites mentioned in GTEx (
https://gtexportal.org/home/), the source of much of the data. For transverse colon, full-thickness samples were analyzed, while for sigmoid colon, only the muscularis was sampled.
GPX2 is not expressed outside of the epithelium of the mucosa, except for scattered cancer-associated fibroblasts (CAF) (
Figure 3) [
29,
30,
31,
It is not clear that this is a complete explanation for the discrepancy between the GEPIA2 data sets and TCGA/THPA and TIMER2.0. The rectum set does not mention similarly divergent processing protocols and the sample numbers for the alternatively processed colon samples do not add up to the total number of samples listed. TCGA/THPA and TIMER2.0 show a much smaller range of increase in
GPX2 expression between normal and CRC, on the order of 2-20-fold (see matched sets, ref. 43) with a few matched samples showing a decrease in CRC or no alteration [
43].
Part of the issue in comparing normal to CRC using TPM/FPKM or standard IHC presentation (combined intensity and proportion scores), beyond lack of clear identification of the source cells, is the limited numbers of normal high expressing cells and the evident increase in the proportion of expressing tumor cells in CRC samples (tumor purity) (
Figure 3 and 7) [
29,
30,
44].
The upper range of the increase in tissue expression of 20-fold could be accounted for by the 4-5-fold increase in cell level expression and a 4-5-fold increase in the relative volume of the expressing cells in tumor samples, although many CRC samples seem to exceed this increase in proportion based on IHC at THPA (
Figure 7). None of this might be relevant if the source tumor cells are not represented by the naturally high
GPX2 expressing normal cells. The question of source cells is complicated by alterations in the GPX2 protein and mRNA expression pattern observed in mouse colon after dextran sodium sulfate-induced injury. Here, the zone of high expression was expanded to the luminal region showing that many cell types in the colon can produce high GPX2 levels under stress [
45]. There may be differences between mice and humans in the range of GPX2 protein expression in ileum and colon. IHC of mice seems to show high levels of protein at the crypt/gland base and moderate levels of protein detected up to the villus tip/lumen, while one IHC study of human samples seems to show expression only in the gland base and in a second the image is in gray-tone and it is unclear by inspection whether any protein was detected outside of the narrow range of high levels at the gland base [
29,
30,
46]. The authors comment that GPX2 was detected in the ileum crypts and colon gland base and don’t mention detection outside of those zones [
30]. In a third case, GPX2 was detected at the lumen with decreased levels from the colon gland base, while confined to Paneth cells in the ileum [
31]. Enlargement of the image of human small intestine GPX2 IHC in
Figure 3 hints at very low expression in patches in the upper crypt and up to the villus tip, however, the call by THPA is that cell types other than the Paneth cells are negative for expression.
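The estimate earlier in this section, that a tissue-level increase of up to 20-fold could arise from a 4-5-fold increase in per-cell expression combined with a 4-5-fold increase in the proportion of expressing cells, can be checked with simple arithmetic. The sketch below assumes that bulk TPM scales as the product of per-cell expression and the fraction of expressing cells. That multiplicative model is an assumption made for illustration and the function name is hypothetical.

```python
def tissue_fold_change(per_cell_fold, proportion_fold):
    """Bulk (tissue-level) fold change under a simple multiplicative model.

    Assumption: bulk signal = per-cell expression x fraction of expressing cells,
    so the two fold changes multiply.
    """
    return per_cell_fold * proportion_fold


lower_bound = tissue_fold_change(4, 4)  # both factors at the low end of 4-5-fold
upper_bound = tissue_fold_change(5, 5)  # both factors at the high end of 4-5-fold

print(f"Combined tissue-level increase: {lower_bound}- to {upper_bound}-fold")
```

Under this model the two 4-5-fold factors bracket the 20-fold upper range seen in the matched sets (16- to 25-fold), so tumors exceeding that proportion increase by IHC would be expected to exceed it at the tissue level as well.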
3.5. Efforts to Assist Investigators in Selection of Cell Lines
The suitable selection of cell lines for CRC studies is made possible by analysis of CRC for driver and passenger mutations (classical analysis) and later efforts to classify tumors by small- or large-scale analysis of mRNA expression profiles and IHC, often combined with mutation profiles and epigenetic modification profiles; CIMP, CpG island methylator phenotype; MSI, microsatellite instability (CMS, CCS, CRIS, iCMS and others) [
7,
8,
9,
10,
11,
12,
13,
14,
15,
47,
48,
49]. In some cases, cell lines were similarly classified in attempts to match them with CRC types (
Figure 8) [
6,
7,
8,
9,
10,
11,
12,
13,
14,
15].
The key point is that CRC tumors can be divided into up to six classes and, in addition to differences in gene expression profiles, each category has a tendency for certain driver mutations, levels of copy number variation, CIMP profiles, MSS/MSI status and, important for this discussion, differences in relapse free survival (RFS) and metastasis. The classification systems of the various groups, while showing some discrepancies, are similar enough that they were consolidated by Guinney et al into a consensus classification system (CMS; Fig 8H; involving groups in panels B-G) [
49]. The availability of high throughput transcriptomics drove much of the classification, being relatable to cellular phenotypes and the clinical behavior of tumors. The consensus classification uses a neutral terminology (CMS1-4); however, this is sometimes phrased as enterocyte- (Ent-), goblet- (Gob-, CMS3), transit amplifying- (TA-, CMS2), inflammatory- (Inf-, CMS1) and stem-like (Stem-, CMS4), suggesting gene expression level affinities to normal cell types of the colon (
Figure 8A and F) [
6,
7]. CMS1 tumors tend to be hypermutated and are largely MSI and CIMP+, with
BRAF mutations, and low copy number variation. CMS2 has high copy number variation, prevalent
APC and
TP53 mutation and is MSS. CMS3 has a high prevalence of
KRAS and
APC mutations with some
TP53 mutation, moderate levels of CIMP+ and is MSS. CMS4 has high copy number variation, is MSS, and has moderate levels of mutation in
KRAS,
APC and
TP53, and is largely CIMP-. Guinney et al also provide an extensive analysis of differential properties for WNT/MYC target expression (CMS2), metabolic processes (CMS3, sugars, fatty acids), immune processes (CMS1), and EMT/TGF-β activation (CMS4) among others. Consistent among the individual classification schemes and preserved in CMS is the finding of poor relapse free survival in CMS4 and, in Guinney et al, overall survival as well. CMS4 is also noted for metastasis, associated with high EMT pathway expression. CMS1 is associated with poor survival after relapse. These last factors will be important in the discussion of cell line choices and it is worth double checking potential cell line choices for consistency with the MSS/MSI, CIMP+/-, copy number variation and mutation status properties of the related tumor class. All of this was intended to produce better prognosis criteria, precision therapies, and inform the process of cell line selection for pre-clinical testing of such therapies, with Medico et al concentrating on the choice of cell lines (Fig 8A) [
6]. An anomaly in the CRC nomenclature system of Medico et al (inherited from Sadanandam et al) is that in applying a system based on markers for normal cell types to cancer, some classical stem cell and WNT markers are associated with the TA-like class (
ASCL2,
RNF43,
ZNRF3,
AXIN2) (
Figure 8A) [
50]. Sadanandam et al stated that the main characteristics of the Stem-like CRC class were high expression of myoepithelial and mesenchymal genes and lack of differentiation. While they also say this CRC class has high WNT marker expression, this was not found to be a consistent property of the cell lines that were sorted into this category based on the stripped down iCMS marker set and was not supported by the findings of the CMS analysis for tumors of the CMS4 set (
Figure 9) [
15,
49]. Lack of differentiation with high WNT marker status was oddly found in cell lines scattered across the Ent-, Gob-, Inf-, and TA-like classes.
One notable outcome of the collective efforts was the relatively consistent classification of some cell lines into epithelial to mesenchymal transition (EMT)/TGFB pathway activation/ undifferentiated/low WNT expression class (
Figure 8). A characteristic of this set of cell lines in most classification schemes is low
GPX2 and
NOX1 expression (median levels- 4.35 TPM and 0.1 TPM vs. 588 TPM and 9.4 TPM in other classes, respectively). There is some inconsistency among the classification schemes for placement of cell lines, and low expressing lines are sometimes split among 2 or 3 classes within schemes (
Figure 8A-I). For CMS, only 34 cell lines have been independently classified with 23 found in DepMap (Fig 8H). The results largely recapitulate Medico et al, so the results of the Medico cell line classifications will be used for further discussion. Since the main goal was to observe how high and low
GPX2 expressing cell lines sorted based on the various CRC typing methods, extensive statistical analyses were not performed. For the Medico et al analysis, Ent-, Gob- and TA-like
GPX2 expression levels are not different and Inf- and Stem-like are not different. Inf and Stem are significantly different from Ent-, Gob- and TA-like (
Figure 8A) (p≤0.021; Mann-Whitney; log2 transformed data directly from DepMap; PRISM 9.3). For CMS sets, CMS1 is not different from any other set; this classification corresponds to the Inf-like cell lines of Medico et al. [
13,
49]. CMS2 (largely TA-like) and CMS3 (mix of Gob-, TA-, and Inf-like) are not different, while both are different from CMS4, corresponding to a mix of the Stem-like and Inf-like sets of the Medico et al. classification (
Figure 8H) (p≤0.0056). The mucinous adenocarcinoma marker analysis showed the least goblet set to be different from the core goblet and hybrid sets, which were not different (
Figure 8I) (p<0.0002). In the case of cell line origin, primary tumor or metastases, and MSI, no significant differences were found (
Figure 8J).
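The class comparisons above were made with Mann-Whitney tests on log2 transformed DepMap values in PRISM. As a rough illustration of what that test computes, the sketch below implements the U statistic and an exact two-sided permutation p-value in plain Python. The TPM values are hypothetical placeholders, the log2(TPM + 1) pseudocount is an assumption, and the exact enumeration is only practical for small groups. It is not a reproduction of the PRISM analysis.

```python
from itertools import combinations
from math import log2


def u_statistic(x, y):
    """Mann-Whitney U for x: number of (xi, yj) pairs with xi > yj; ties count 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


def exact_two_sided_p(x, y):
    """Exact two-sided p-value by enumerating every relabelling of the pooled data."""
    pooled = list(x) + list(y)
    n_x = len(x)
    center = n_x * len(y) / 2  # expected U under the null hypothesis
    observed = abs(u_statistic(x, y) - center)
    hits = total = 0
    for idx in combinations(range(len(pooled)), n_x):
        chosen = set(idx)
        group_x = [pooled[i] for i in chosen]
        group_y = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if abs(u_statistic(group_x, group_y) - center) >= observed:
            hits += 1
    return hits / total


# Hypothetical GPX2 TPM values for two classes of cell lines (not DepMap data).
high_tpm = [588.0, 918.0, 290.0, 1711.0]
low_tpm = [4.35, 1.0, 5.0, 0.5]

# Assumed transform: log2(TPM + 1).
high_log = [log2(v + 1) for v in high_tpm]
low_log = [log2(v + 1) for v in low_tpm]

u = u_statistic(high_log, low_log)
p = exact_two_sided_p(high_log, low_log)
print(f"U = {u}, exact two-sided p = {p:.4f}")
```

Because the test is rank-based, the log2 transformation does not change U or the p-value; it matters only for plotting and for reporting medians on the same scale as DepMap.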
Possible evidence of deep secretory-like derived cell lines (colon Paneth cell equivalent; possible candidates for high
GPX2 expression) in the DepMap set can be found using the differential markers indicated in Sasaki et al [
33]. It is not clear whether such cells would have any tumorigenic capacity or whether any of the original differential markers of the postulated cell type would remain detectable when or if a transition like that of Paneth cells acquiring stem cell properties occurred. Presumably having acquired high WNT marker status while retaining some expression of differentiation markers,
CD24,
REG4,
MMP7,
DLL4,
EGF,
SPDEF, and
ATOH1, they could be represented by 20 of 40 cell lines of the Medico et al classification scheme that overlap with DepMap, excluding stem-like and the NF categories, with
GPX2 levels >190 TPM and 1 stem-like cell line at 1 TPM (results not shown). This is pure speculation, as mice were studied for evidence of deep secretory cells, and they are not yet documented in human samples.
3.6. The Issue of Low GPX2 Expressing CRC-Derived Cell Lines and Circumvention
Where do low expressing cell lines come from? In cancers of the GI tract
GPX2 expression is confined to uniform high levels, more so for COAD and READ (
Figure 1A). In other cancers the range of expression is enormous, possibly explaining the establishment of some cell lines with low levels. Some low expressing cell lines from CRC have mutant APC and others with wild-type APC have high expression of at least one WNT ligand (DepMap) (
Figure 4). Thus, the WNT dependence of
GPX2 expression, demonstrable in some CRC-derived cell lines and mice, does not seem to be the reason for low expression [
51]. Analysis for
BRAF,
KRAS,
PIK3CA, and
TP53 mutation did not reveal any association with
GPX2 expression levels in CRC-derived cell lines (DepMap; results not shown).
SMAD4 mutations are more frequent among the high
GPX2 expressing lines, but also found in intermediate and low expressing lines (11 of 14 cell lines with mutations,
GPX2 TPM> 64; 22% of high expressing lines, 11% of low expressing lines; p=0.36).
GPX2 expression levels are linked to NRF2 activation [
2]. There aren’t potent activating agents in standard culture media and this could be one factor in low
GPX2 expression. TP63 is one of the major drivers of
GPX2 expression in other tissues and cancers and there is sometimes a demonstrable correlation between
TP63 and
GPX2 levels among cell lines derived from these other cancers (
Figure 10) [
52].
TP63 expression is notably absent in all but a few CRC-derived cell lines being replaced by WNT (
Figure 10) [
51].
Because the classification of cell lines was done after the fact, there will be discrepancies with the properties of the related tumor types. One of the more glaring issues is the excess of CIMP+ cell lines in the stem-like (CMS4) class. There is some trend for CIMP+ cell lines of the Inf- and Stem-like categories to have low
GPX2 expression (<6 TPM; 3 of 4 Inf-, and 4 of 5 Stem-like; note, CIMP+ is not a general property of the corresponding CMS4 class for Stem-like tumors and is a property of the corresponding CMS1 class for Inf-like tumors), while CIMP+ lines in the Ent-, Gob- and TA-like sets retain higher expression (≥60 TPM;
Figure 10E). This might indicate suppression of
GPX2 gene expression by this repressive mark [
53,
54]. Two CIMP-, Stem-like lines also have low
GPX2 expression, so it is not clear what the relations are; CIMP+ in the Inf- and Stem-like categories specifically leading to low
GPX2 expression or more likely the condition of being Inf- or Stem-like leading to low
GPX2 expression as the primary effect while the CIMP+ condition is an incidental feature over represented in CRC cell lines relative to tumors [
13,
53,
54]. One criticism of cell lines is that there is a tendency for excess CpG methylation (a little less extreme in CRC-derived lines, 5-fold), although methylation sites found in CRC tend to be preserved. This was attributed to myriad effects of cell line derivation and cell culture conditions, on which the loss of variability of CRC-derived lines compared to CRC was also blamed [
53,
54,
55].
Based on the consistent EMT expression profile, it might be conjectured that the low
GPX2 expressing cell lines derive largely from metastases. This is not the case (
Figure 8J). In fact, one study attempting to resolve the roles of GPX2 in tumorigenesis and metastasis showed that while HT29 cells with
GPX2 knocked down showed poor growth as xenografts in mice, they could metastasize, although demonstrating a lower potential (
GPX2 levels in HT29 at 1010 TPM -averaging DepMap and THPA values; about 20-fold knockdown;
NOX1 16 TPM) [
56]. However, follow up analysis showed that the metastatic tumors were a subset that either didn’t fully respond to the knockdown or re-established higher
GPX2 expression in the process (IHC). Since HT29 cells were classified as CCS3 (
Figure 8D, EMT set; poor relapse free survival (RFS)), the authors went on to examine RFS in CCS3 cancers stratified by low and high
GPX2 expression, with high expression being unfavorable; questionable significance (TCGA/THPA does not confirm any prognostic relationship between
GPX2 and overall survival (OS), lumping all classes together;
Figure 11).
They noted that the median level of
GPX2 expression in CCS3 CRC subsets was ~512 TPM (correcting for median tumor purity in CRC-820 TPM) much higher than the 5 TPM characteristic of most CCS3-type cell lines (CCS1 vs CCS2 and CCS3, p≤0.0006;
Figure 8D) and that the variation in expression levels was only 4-5-fold among the CCS3 tumors [
5]. The relevance of a 20-fold knockdown is questionable in the context of probing RFS. It might be reasonable as a therapeutic goal given the general negative impact on tumor growth and metastasis. HT29 is generally not classed into the EMT/TGFB pathway activation/ undifferentiated/low WNT expression set among classification schemes (
Figure 8). The cell line is somewhat differentiated (some
MUC2,
TFF3 expression and high
LYZ;
ATOH1,
HOXB2,
KLF5,
TFF3; iCMS), has somewhat low TGFB pathway marker expression (
CCN1,
ITGAV,
TGFB) and low EMT marker levels (
FN1,
CDH2,
VIM) while sharing low WNT marker expression (
ASCL2,
CDX2,
LGR5,
MYC,
TCF7,
TEAD; iCMS) characteristic of the EMT sets (
Figure 9) [
57,
58,
59]. HT29 can undergo differentiation in culture with the substitution of galactose for glucose in the media, with some cells exhibiting goblet cell-like morphology and mucin production [
60]. Slightly lower GPX activity in HT29 was associated with differentiation by 5-Fluorouracil [
61]. This work predates wide-spread recognition of GPX isoenzymes so GPX1 and GPX2 were not separately examined. As a side note, galactose increases mitochondrial H2O2 release, hinting at a role for ROS in the differentiation process along with the reported lowering of GPX activity [
62].
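The figures quoted above for CCS3 tumors and HT29 (a median of ~512 TPM rising to ~820 TPM after correction for tumor purity, and a 20-fold knockdown from 1010 TPM) can be related with a few lines of arithmetic. The sketch below assumes a simple linear correction in which bulk TPM is divided by the tumor cell fraction. The source does not state how the correction was made, so both the model and the function name are assumptions for illustration.

```python
def purity_corrected_tpm(bulk_tpm, tumor_purity):
    """Scale bulk TPM to the tumor cell fraction (assumed linear correction)."""
    if not 0 < tumor_purity <= 1:
        raise ValueError("tumor_purity must be in the interval (0, 1]")
    return bulk_tpm / tumor_purity


# Median purity implied by the quoted values, under the linear assumption.
implied_purity = 512 / 820
print(f"Implied median tumor purity: {implied_purity:.2f}")

# Residual GPX2 level in HT29 after the reported ~20-fold knockdown.
ht29_tpm = 1010
knockdown_fold = 20
residual_tpm = ht29_tpm / knockdown_fold
print(f"HT29 after knockdown: {residual_tpm} TPM")

# Compare the residual level with the two reference points from the text.
typical_ccs3_line_tpm = 5
corrected_tumor_median = purity_corrected_tpm(512, implied_purity)
print(f"Residual vs CCS3-type cell lines: {residual_tpm / typical_ccs3_line_tpm:.1f}-fold higher")
print(f"Residual vs corrected tumor median: {corrected_tumor_median / residual_tpm:.1f}-fold lower")
```

On these numbers the knocked-down HT29 cells sit roughly 10-fold above the 5 TPM typical of CCS3-type cell lines yet well below the purity-corrected tumor median, a gap much wider than the 4-5-fold variation reported among CCS3 tumors.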
Nine cell lines with higher
GPX2 levels are classified by 2 or more of the schemes referenced in
Figure 8 into the EMT/undifferentiated categories; CACO2 (3 schemes), CL40 (2 schemes), LOVO (3 schemes), LS180 (2 schemes), SKCO1 (2 schemes), SNU1040 (3 schemes), SW620 (5 schemes), SNU503 (5 schemes), and SNU1033 (3 schemes) with
GPX2 expression levels of 118, 918, 290, 296, 1711, 1020, 648, 2040, and 318 TPM, respectively (average, DepMap and THPA) (
Table 1) [
7,
8,
9,
10,
11,
12].
There are discrepancies with and between Medico et al/Sadanandam et al; LS180 and SKCO1 are classified as Gob- by both, while SNU1040 is split Ent-/Gob- and SW620 is split TA-/Stem-like. This appraisal might be further limited by the suggestion that MSS tumors of the CMS4 class are most likely to metastasize, leaving CACO2, CL40, SKCO1 (metastasis), SNU503 (metastasis), SNU1033 and SW620 (metastasis) as representative cell lines with potentially high metastatic potential [
15]. However, LOVO (MSI; metastasis) and HCT116 (MSI; low
GPX2 expression; identified in the EMT/undifferentiated classes in all schemes) were found to have metastatic potential in a mouse xenograft model rivaling that of SW620, a cell line derived from a metastasis with metastatic potential demonstrable in 2 mouse model studies [
63,
64]. There may be a few seemingly EMT/TGFB pathway activation/undifferentiated/low WNT expression category cell lines with
GPX2 expression levels approaching tumor levels, which seems desirable in testing the role of GPX2 in this class of tumors based on the commonly used suppression of
GPX2 expression in cell lines as a mode of study; notably, CACO2, SNU503, SNU1033, and SW620 for Stem-like cell lines. CL40 and LOVO might be as appropriate as high expressing lines for the Inf-like class (Medico et al/Sadanandam et al.) [
6,
7]. Within these two classifications, paired high and low expressing lines may represent cases where the assumption that a major difference is in
GPX2 levels is somewhat justified and the use of forced over-expression and silencing might yield relevant findings, with the caveats that very low expression in cell lines is not yet supported by CRC expression levels and differences in expression levels of
CST6,
EGFR,
MYC,
TP53,
VIM and other putative redox sensitive proteins may be present.
3.7. How Different are CRC-Derived Cell Lines Sorted into Classes by Medico et al for Expression Levels of Genes Encoding Proteins with Reactive Sulfhydryl Groups?
While the CRC-derived cell line sorting process is largely based on differential gene expression profiles, just how different are the cell lines for mRNA levels of proteins with known or suspected reactive sulfhydryl groups that impact function when oxidized? While the list of potentially affected proteins is large and growing, a few key proteins were selected for evaluation in DepMap [
2,
65]. Uniformity and variability in expression levels were found across the groups of cell lines in the Medico et al classification scheme (DepMap) [
66,
67,
68]. 14-3-3-gamma (
YWHAG), c-ABL (
ABL1),
ASK1,
AKT1,
AKT2,
HIF1A,
JNK2,
JUN2,
ERK2,
KEAP1, MAPKs,
NFKB, Nucleoredoxin (
NXN), p38 (
MAPK14), protein kinase A (
PRKACA),
PTEN,
PTPN1B,
SHP1,
SHP2,
SRC, and
STAT3 have uniform expression levels across classes. Cystatin B (
CST6),
EGFR,
MYC,
TP53, and vimentin (
VIM) show skewing for certain classifications;
EGFR levels tend to be slightly higher (2-4-fold) in low
GPX2 expressing lines, enriched in Stem- and Inf-like cell lines, some with high
GPX2 levels, although a few very low expressing lines have nearly zero levels; 4-fold lower expression of
TP53 is associated with the high
GPX2 expressing TA-like cell lines compared to the other classes; high levels of
CST6, 16-fold or more, are generally associated with the low
GPX2 expressing Stem- and Inf-like classes;
VIM is slightly skewed in favor of low
GPX2 expressing cell lines (2-4-fold), with 5 Stem-like low expressing lines and SW620, a high expressing Stem-like line (most schemes; TA-like in Medico et al;
Figure 8 and
Table 1), showing roughly 8-32-fold higher expression than the bulk.
MYC shows slight skewing (2-fold) in favor of low
GPX2 expressing Stem- and Inf-like cell lines. It could be argued that for most redox sensitive proteins upstream of major pathways there would be a uniform impact of ROOH modulation across CRC classifications, with a few exceptions, and that in many cases mRNA levels do not reflect functionality of the proteins. However, this exercise can only be efficiently managed using the available mRNA expression data.
CST6,
EGFR and
VIM might be exceptions, where mRNA levels reflect function. The other point is that the downstream protein profiles may be variable among the CRC-derived cell lines based on the classification method; Medico et al used 783 differentially expressed genes in their profiling [
6]. That is,
GPX2 expression level differences generally occur in a context of broad gene level expression differences and not in isolation. Forced overexpression and silencing of
GPX2 in one or a few cell lines may not reflect this situation, biasing results in favor of the impact of the consensus redox sensitive proteins and downstream pathways, presuming these remain largely unaffected by the manipulations. However, it was shown above that there may be a few suitable pairings of high and low
GPX2 expressing lines for the Stem- and Inf-like classes. While the argument has been made that GPX1, PRDXs and CAT impact should also be accounted for, a broader systems approach may be required to understand how
GPX2 expression alterations (and the other antioxidants) fit into the scheme of CRC and what drives the large-scale gene expression changes that include
GPX2.
3.8. Intratumor Variation, Plasticity of Cell Components, Stromal Cell Interactions and GPX2
There is a general trend for high
GPX2 expression in normal colorectal samples at the tissue level (
Figure 2;
?). This appears to be achieved with a limited zone of high expression at the base of the glands [
29,
30,
31]. There is disagreement on whether a low level of GPX2 expression extends to the top of the gland. While one study mentioned above showed individual CRC tumor cells to have elevated GPX2 protein levels relative to expressing normal cells in a somewhat small set, it is not clear that this is universal in CRC samples [
29]. THPA GPX2 IHC sets show a few tumors where GPX2 intensity is quite variable in the epithelium, with portions of tumors showing almost no staining adjacent to strongly stained sections; other lesser variation is also observed (
Figure 7A and B). The unstained or lightly stained portions are likely the normal epithelial cell component of tumors (not considered as a source of CRC-derived cell lines) that occurs at an average of 24% of epithelial cells isolated from tumors in the iCMS analysis, presumably having low levels of GPX2 relative to the tumor cells [
15]. The sample shown in
Figure 7B might represent a tumor with a normal component above this average, while panels C and D seem to be below the average. The reverse staining pattern is unlikely given the low numbers of normal high
GPX2 expressing cells, unless alterations in the tumor stromal cellular environment promote aberrant growth of normal cells along the lines of Paneth cells in tissue injury (postulated deep secretory cells in colon) [
30,
32,
33,
34,
35]. Whether any of this variation relates to the tumor cell component has not been systematically studied, so it is unknown how prevalent the pattern is and whether it relates to CRC subtypes [
15]. If this involves the tumor component, one conjecture is that the CMS4/CCS3 class may be over-represented, which would explain the propensity for low
GPX2 expressing cell lines from this class (
Figure 8). This can be readily tested, and if the variation involves the tumor cell component this could justify the use of low expressing cell lines in studies.
GPX2 expression in primary tumors may be dependent on factors secreted by CAFs and other stromal cells and portions of the tumors may lack sufficient contact with these cells to support
GPX2 expression. Growth in primary sites seems to be somewhat dependent on stromal support [
69,
70,
71]. Tumor cells seem to be plastic, altering their properties, which might include cellular factors affecting
GPX2 expression, separate from stromal factors [
38]. Stromal dependency, while possible, is countered by the more general tendency of cell lines to retain
GPX2 expression in standard culture conditions and the idea that metastases can be established in the absence of stromal support (DepMap) [
38]. It is estimated that only 10-15% of CRCs can supply cells that will establish cultures [
72]. A general requirement for stromal support in primary tumors and re-establishment of this after metastases have seeded may explain part of this failure. In standard isolation of CRC-derived and other cell lines, fibroblasts are the main contaminant, and it takes several passages to eliminate them, or they may be employed as feeders until overgrowth is evident as a sign of independence of the CRC epithelial cells [
73]. It is possible that a transition from stromal dependency to independent growth occurs by adaptation or selection as the fibroblasts are eliminated [
73]. There are no broadly consistent patterns of gene expression among cell lines as evidenced by the stratification into as many as 6 subtypes based on differences in expression, driver mutations and epigenetic modulation patterns (
Figure 8). It is possible to find many different combinations of driver and passenger mutations in the list of 77 CRC cell lines in DepMap [
47]. Thus, the basis for culturing success is elusive and clearly has nothing to do with
GPX2 expression either directly or in the context of cell properties favoring
GPX2 expression.
A side project on
GPX2 expression could include adding CRC stromal factors back to cultures or, better, using stromal cells as feeders for the low expressing lines and looking for induction of
GPX2 levels to the range of 100-2000 TPM [
73]. There is evidence that high
TGFB expression in tumor cells, somewhat greater in low
GPX2 expressing cell lines (DepMap), exerts a paracrine effect on the surrounding stromal cells, which in turn enhances metastasis of fibrotic-type CRC tumors [
74]. This would have to be linked to a broader examination of gene expression alterations to avoid overinterpretation of any impact of GPX2 on subsequent cell line properties. Another point raised by the substitution of galactose for glucose in the culture of HT29 and the resulting change in properties is the presence of additional carbon sources for intestinal cells, namely short-chain fatty acids and glutamate [
75,
76]. Addition of short-chain fatty acids to culture media seems to evoke changes in cell line properties with a generally negative impact on growth [
77]. Retinoic acid was found to increase
GPX2 expression in cell lines, in particular MCF7, a line that barely expresses
GPX2 under basal culture conditions [
2].
As a general statement, based on
Figure 2A and the IHC in
Figure 7 (several other similar examples can be found at THPA), there is an expectation that CRC tumors will have high
GPX2 expression both at the tissue level and throughout the epithelium. A study attempting to understand changes in epithelial properties as CRC tumors develop from polyps found that Stem-like cells showed a consistent increase in
GPX2 levels by 3-5-fold as the malignancy progressed, relative to normal Stem-like cells, with elevation evident in the polyps [
32]. This change in expression levels is in line with the 3-5-fold increase in GPX2 protein levels found by Brzozowa-Zasada et al. in the comparison of high expressing normal cells and tumors [
29].
ASCL2 and
OLFM4, stem cell markers, show increases in levels in the same cell population while
HNF4A has a bi-phasic pattern, decreasing from normal to the polyp phase then increasing dramatically in the CRC stage.
HNF4A shows a slightly stronger correlation (R-squared, 0.48) with
GPX2 levels in cell lines than
ASCL2 (R-squared, 0.31) and
OLFM4 (R-squared, 0.18), but for all three the residual deviation from the regression line is large for a substantial fraction of lines.
GPX2 has been occasionally proposed as a stem cell marker, so that correlations to other such markers in CRC samples (
AXIN2,
RNF43,
ZNRF3, and
SOX9; no correlation with cell line
GPX2, although all are up regulated in CRC) have been made before and associations with WNT proposed and established [
32,
50,
51]. For
HNF4A, the pattern is explained as yielding to WNT dependency early on (it is an inhibitor of WNT) and fueling CRC later [
32,
78]. The number of independent samples in this study was very small, only three CRC samples, and could not cover the spectrum of CRC classes. However, there is no indication that
GPX2 levels should be low in tumor samples.
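The R-squared values quoted above summarize a regression across cell lines, and the remark about residuals can be made concrete with a small calculation. The following is a minimal sketch, assuming an ordinary least-squares fit on log2-transformed TPM values; the transformation, the pseudocount, the function names and the six made-up cell lines are illustrative assumptions and are not taken from DepMap or from the analysis described here.

```python
import math

def log_tpm(values, pseudocount=1.0):
    """Return log2(TPM + pseudocount); the pseudocount is an assumed choice to handle zeros."""
    return [math.log2(v + pseudocount) for v in values]

def linear_fit(x, y):
    """Ordinary least-squares fit y = a + b*x; returns (a, b, r_squared, residuals)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    ss_res = sum(r ** 2 for r in residuals)
    r_squared = 1.0 - ss_res / syy
    return a, b, r_squared, residuals

# Made-up TPM values for six hypothetical cell lines (NOT DepMap data).
regulator_tpm = [2.0, 15.0, 40.0, 90.0, 150.0, 300.0]
gpx2_tpm = [5.0, 400.0, 20.0, 900.0, 60.0, 1500.0]

x = log_tpm(regulator_tpm)
y = log_tpm(gpx2_tpm)
a, b, r2, res = linear_fit(x, y)

print(f"slope = {b:.2f}, R-squared = {r2:.2f}")
for name, r in zip("ABCDEF", res):
    # A residual of +/-1 on the log2 scale means the fitted line misses by 2-fold.
    print(f"line {name}: residual = {r:+.2f} log2 units ({2 ** abs(r):.1f}-fold)")
```

Listing the per-line residuals next to R-squared shows how a moderate R-squared can coexist with individual lines that sit many-fold away from the fitted trend.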
Pathways that drive high
GPX2 expression, indicated by results from tumors, could be replicated in high expressing cell lines with the proviso that there may be some dependence on stromal factors in tumor expression that would not be carried over to cell culture. There are many pathways that impact
GPX2 levels, so finding a really strong correlation with any single gene or pathway is unlikely [
2]. Esworthy, Doroshow and Chu presented an overview of genes and pathways that influence
GPX2 levels; the count was at least 14 at that time [
2]. NRF2, WNT, TP63, and RARE have been mentioned above. Others, including CD13,
ETS2,
FOXO1,
FOXM1,
OXR1, PI3K/
AKT and
STAT3 do not show a strong correlation with
GPX2 expression in cell lines and
NKX3-1 is not expressed in colon.
FOXM1 is up regulated in CRC and
FOXO1,
EST1 and particularly, CD13 (
ANPEP), are down regulated (TIMER2.0). HIPPO/YAP is unique, reported to strongly suppress
GPX2 expression. Like
HNF4A, which is in a cooperative feed-back loop of expression with
HNF1A and therefore involved with the NOTCH pathway, a few master regulators of CRC pathways,
MYB,
REG4, and HIPPO/YAP show a reasonably strong correlation with
GPX2 expression; for markers of YAP activation and related genes, the correlation is negative [
78,
79].
MYB, which is expressed in other tissues, shows marginal up regulation in CRC with a fairly strong positive correlation with
GPX2 in cell lines (R-squared, 0.498) [
80].
REG4 is expressed in normal GI-tract tissues and is not uniformly up regulated in CRC [
33]. The correlation with
GPX2 in cell lines is weaker (R-squared, 0.22).
YAP1 is widely expressed at uniform levels and is not up regulated in CRC. Expression levels of
YAP1 and
TAZ (co-activator) are also uniform across cell line classes. Markers of YAP/TAZ pathway activation show various degrees of negative correlation with
GPX2 expression in cell lines;
CCN1 (CYR61) has an R-squared of 0.64 (DepMap).
CCN1 is also a marker of the TGFβ pathway, providing two possible reasons for the strong correlation [
17]. YAP activation might occur in cell culture, as it senses mechanical cues related to cell shape and cell spreading [
13,
81]. Thus, low
GPX2 expression in the Stem-like category of cell lines could reflect an impact of conventional cell culture. YAP activation was tied to down regulation of
GPX2 in LUSC through
TP63, which is not expressed in colon or CRC-derived lines, requiring another mechanism of action (
Figure 10) [
82,
83].
3.9. NOX1 in CRC and Links to GPX2
Expression of
NOX1 among CRC-derived cell lines shows a similar pattern of variable expression with a greater proportion of lines exhibiting nearly zero expression relative to
GPX2 (
Figure 5). Very high
NOX1 levels are a unique property of normal colon/rectum and CRC, with a narrow range of variability in expression levels in normal tissue and a very broad range of expression in CRC, so that low expression in cell lines is not discrepant, as it is with
GPX2 (
Figure 2 and
Figure 5).
NOXO1, but not
NOXA1, shares unique high expression in colon/rectum (TIMER2.0). The only linkage in co-expression might be that high
NOX1 levels are associated with high
GPX2 levels as observed in cell lines (DepMap) (
Figure 5).
NOX1 mRNA and NOX1 protein expression occur at the crypt/gland base, like GPX2 [
26,
27,
28]. Whether co-expression occurs in the same normal cells is not clear. The absence of NOX1 (
NOX1-KO) in mouse colon has some impact on the distribution of differentiated cells (goblet cells) and proliferating cells, favoring differentiation [
84]. This suggests that
NOX1 might be expressed in the TA compartment, which is the cell line assignment of Medico et al. [
6]. In the absence of GPX1 and GPX2, NOX1 activity produces pathology in the form of excess crypt/gland apoptosis, which may be dependent on the presence and composition of microbiota, providing one rationale for very high
GPX2 in colon/rectum [
24,
25,
85,
86]. The pattern in CRC-derived cell lines shows that co-expression of
GPX2 and
NOX1 is likely in cancer as a property of several, but not all, high WNT expressing and differentiated cell lines (
Figure 5 and
Figure 11). Unlike
GPX2,
NOX1 expression levels are considered prognostic by TCGA/THPA for OS, as of 6/2024 (
Figure 11).
NOX1 stratification yields a significant effect on OS, with high expression being favorable. The cut-off for the prognosis call is ~262 TPM (420 TPM, correcting for median purity values), which is more than twice the level of even the highest expressing cell line (
Figure 11). There is the question of whether NOX1 would be active in any cell lines. One paper tried to address this by showing that a handful of CRC-derived cell lines at the top of the expression range might be capable of significant spontaneous and PMA-induced production of superoxide from NOX1 [
87]. The knockdown of antioxidant enzyme expression in CRC cell lines might reveal activity as increased levels of apoptosis that should relate to
NOX1 levels, if true.
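The relation between the ~262 TPM cut-off and the purity-corrected 420 TPM figure quoted above can be illustrated with simple arithmetic. The sketch below assumes that the correction is a plain division of the bulk tissue value by the tumor purity fraction, i.e., that essentially all of the NOX1 signal comes from the tumor epithelial cells; the correction method is not spelled out here, so both the function and the back-calculated purity are assumptions for illustration only.

```python
def purity_corrected_tpm(bulk_tpm, purity):
    """Scale a bulk-tissue TPM to a tumor-cell estimate.

    Assumes (hypothetically) that only tumor cells express the gene,
    so the tumor-cell level is bulk_tpm / purity.
    """
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must be a fraction in (0, 1]")
    return bulk_tpm / purity

bulk_cutoff = 262.0       # prognosis cut-off quoted in the text (TPM)
corrected_cutoff = 420.0  # purity-corrected value quoted in the text (TPM)

# Median purity implied by the two quoted numbers under the division assumption.
implied_purity = bulk_cutoff / corrected_cutoff
print(f"implied median purity ~ {implied_purity:.2f}")  # ~0.62

# Round trip: applying the implied purity recovers the corrected cut-off.
print(round(purity_corrected_tpm(bulk_cutoff, implied_purity)))
```

Under this assumption the two quoted values correspond to a median purity of roughly 0.62; a different correction model would give a different implied purity.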
3.10. Low GPX2 Expression in Cell Lines, Redux
Returning to the issue of how to select cell lines for pre-clinical studies of the involvement of GPX2 in CRC, a major question is whether to use low expressing lines for studies. The observation of low GPX2 protein levels in portions of tumor samples may be insignificant as a consideration in CRC studies, representing normal cells, and low expressing cell lines may be of little interest and utility in CRC studies despite their otherwise interesting properties. Plasticity is noted to occur in cancer stem cells. Single-cell analysis found a low incidence of multiple subtype signatures in individual tumors, probably based on one sample per tumor [
15]. Another study showed that multiple sampling within metastatic tumors revealed unrecognized heterogeneity [
88]. Plasticity in CRC tumor stem cells, the finding of a few tumors with multiple signatures and the likelihood of regional variation within tumors suggest that low and high expressing cells could coexist in tumors, with selection for alternate states a possibility [
38].
GPX2 expression might not be an essential feature of tumor progress, only a legacy of expression potential. This might thwart efforts to use modulation of
GPX2 levels to achieve therapeutic ends. There is possibly only one example of such behavior in established cell lines and there is a trend for consistent increases in
GPX2 expression along what is termed the malignancy continuum suggesting a singular upward arc during tumorigenesis for Stem- and TA-like cell types [
32]. The cell lines SW480 and SW620 are isogenic, the former isolated from the primary tumor and the latter from a lymph node metastasis [
89]. Among the many differences, the
GPX2 level in SW480 is ~5.7 TPM and in SW620 is ~648 TPM (averaging results from DepMap and THPA); SW620 has much higher expression of
LGR5 and
ASCL2, and slightly greater signs of differentiation, while they share very low
NOX1 expression (
Figure 5 and
Figure 9). In two studies of metastatic potential in mice, SW620 demonstrated a much greater ability than SW480 [
63,
64]. Comparing
GPX2 levels (DepMap) to the reported metastatic ability among 6 cell lines used in one of the two studies did not show any trend, suggesting other properties of those lines and possibly SW480 and SW620 were responsible [
64]. The paired lines, GP2D and GP5D, retain similar, somewhat low
GPX2 levels, 87 TPM (83 TPM averaging DepMap and THPA) and 59 TPM, similar proliferation and differentiation marker status and low
NOX1 levels (
Figures 5, 9B and 9D). Both were derived from the same primary tumor and exhibit some different properties in culture, including responses to EGF ligands and spontaneous vs. induced EMT [
90]. Supplementation of EGF ligands to culture media could be one more item to explore for impact on
GPX2 expression in addition to short-chain fatty acids and retinoic acid. Other pairings exist or may exist; however, one member of the pair does not appear in the DepMap or THPA databases or the suggested pairing is based on later supposition and not the original description (DLD1, HCT15; primary; similar low
GPX2 and
NOX1 expression; multiple sources support pairing) (
Figure 5) [
6]. CRC organoids might provide a means to explore the plasticity issue [
56].
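The contrast between the two paired-line examples discussed above is easier to see as fold differences. The following is a minimal sketch using only the approximate TPM values quoted in this section (SW480 ~5.7, SW620 ~648; GP2D 83, taking the DepMap/THPA average, and GP5D 59); the helper name and the choice to report log2 ratios are assumptions made for illustration.

```python
import math

def fold_change(low_tpm, high_tpm):
    """Return (ratio, log2 ratio) of two TPM values; both must be positive."""
    if low_tpm <= 0 or high_tpm <= 0:
        raise ValueError("TPM values must be positive for a ratio")
    ratio = high_tpm / low_tpm
    return ratio, math.log2(ratio)

# Approximate GPX2 TPM values as quoted in the text.
pairs = {
    "SW480 vs SW620": (5.7, 648.0),
    "GP5D vs GP2D": (59.0, 83.0),
}

for label, (low, high) in pairs.items():
    ratio, log2_ratio = fold_change(low, high)
    print(f"{label}: {ratio:.1f}-fold (log2 = {log2_ratio:.2f})")
```

On these figures the SW480/SW620 pair differs by more than 100-fold, whereas the GP2D/GP5D pair differs by less than 2-fold, which is the distinction drawn in the text between a pair with divergent and a pair with retained GPX2 expression.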