3.1. Design of a t-SNE Based Protocol for Multicolor Flow Cytometry Analysis
For the t-SNE analysis, a common data set comprising the FCS files of all patients was created, so that the t-SNE plots are comparable between patients. The t-SNE analysis was based on the expression of CD34, CD38, CD45RA, CD123 and PD-L1. In general, the CD34 antigen permits the identification of hematopoietic stem and progenitor cells, while CD38 is considered a marker associated with differentiation [28,29]. The combination of these two antigens with CD45RA and CD123 permits the characterization and quantification of the HSC/HPC within a BM sample [30,31].
After the t-SNE run on the combined data set (Figure 2A), the contributions of the CR and AD patients were visualized separately to show how each group contributes to the combined t-SNE picture (Figure 2B,C).
By t-SNE, the cells are arranged in five islands (I-V) of different sizes. It becomes immediately apparent that the CR (Figure 2B) and the AD (Figure 2C) samples contribute almost complementarily to the combined representation (Figure 2A). While in the CR patients most of the cells are in the eastern part of the main island (I) and in three of the four separate islands (II, III, V), the cells from the AD patients accumulate more in the west of the main island as well as in island IV. However, the populations are not mutually exclusive, as all cell types are present in both groups, albeit in some regions with strikingly different prevalence. This phenomenon is most likely not related to “contaminating” leukemic cells within the CR samples, as the CR patients are MRD negative. In the t-SNE representation, the cells are distributed according to marker-specific gradients, as shown in the bottom row of Figure 2D-H for the combined data set, where the black horizontal bars in the color scale column define the corresponding intensity intervals. The overlay of the different markers is shown separately for AD and CR in Supplement Figure S5, and an example for two patients from each of the groups is shown in Figure S6.
As far as CD34 is concerned, the corresponding intensities in (D) comprise only positive values since, by definition, only cells above the threshold of expression were included. Still, the islands in the t-SNE plot show quite varying CD34 expression levels. With respect to CD38, the cells are arranged from northwest to southeast of the main island with increasing expression level, while the expression is particularly low in island IV. A pronounced CD38 gradient is visible in islands III and V, indicating a sub-ensemble of cells undergoing some kind of development. The CD123 expression, on the other hand, increases from east to west across island I, is almost zero in island II, and shows gradients within islands III and V. The CD45RA expression increases strongly from north to south. Finally, the PD-L1 expression, which is not considered in the assignment of the cells according to Table 3, is non-monotonically distributed across island I and takes characteristically low values in islands II and III.
We can therefore conclude that the gradients in the intensities of CD38, CD45RA and CD123 cause the main substructure in island I, while the expression levels of CD34 and PD-L1 refine this landscape.
3.2. Exemplifying Discussion of t-SNE Gates
For a more detailed study of the five islands, we defined 27 gates in the t-SNE plot of the CR samples, each with a characteristic set of expression levels for the markers used (Figure 3A). This gate pattern was then transferred without modification to the AD data as described in Section 2.5 (Figure 3B). The percentage distribution of the cells in the 27 gates for the three data sets (all patients, only CR, only AD) is shown in Supplement Table S1.
This visualization reveals eight gates, namely gates 1, 6, 7, 10-13 and 15, in which the cells of patients with AD dominate. The remaining gates contain more cells from the CR patients, except for gate 14, where the ratio is very close to 1.
The box plots of the 27 gates (Supplement Figure S3) were used to assign the cells within each gate according to the classification scheme detailed in Table 3 and shown in Figure 3F. As can be seen from Figure 3, the CR subsets are composed of 0.7% HSC/MPP, 9.4% CMP, 1.2% CLP, 11.4% MEP and 44.6% GMP. A proportion of 32.7% of the cells could not be allocated according to the classification scheme. In contrast, the samples of the patients with AD comprised 14.3% HSC/MPP, 4.7% CMP, 4.1% CLP, 13.7% MEP and 47.0% GMP, with 16.2% of the cells remaining unclassified. In comparison with the CR samples, the AD samples thus show a markedly greater proportion of HSC/MPP as well as CLP cells, while the percentages of CMP cells and of cells that cannot be allocated are smaller. The fractions of MEP cells are approximately equal for both groups.
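The comparison of the two compositions can be made explicit with a short calculation. The sketch below uses only the percentages quoted in the text; the dictionary and function names are illustrative.

```python
# Subset percentages as quoted in the text (CR = complete remission, AD = active disease).
CR = {"HSC/MPP": 0.7, "CMP": 9.4, "CLP": 1.2, "MEP": 11.4, "GMP": 44.6, "unclassified": 32.7}
AD = {"HSC/MPP": 14.3, "CMP": 4.7, "CLP": 4.1, "MEP": 13.7, "GMP": 47.0, "unclassified": 16.2}


def differences(ad, cr):
    """Difference AD minus CR in percentage points for each subset."""
    return {subset: round(ad[subset] - cr[subset], 1) for subset in cr}


for subset, delta in differences(AD, CR).items():
    print(f"{subset:13s} {delta:+.1f}")
```

Both compositions add up to 100%, and the largest shifts are the gain in HSC/MPP and the loss in unclassified cells in the AD samples.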
Beyond this canonical classification, the t-SNE representation provides a rich substructure within the regions of particular cellular subtypes, reflecting subtle differences between the various populations. A complete delineation of all 27 gates would certainly be beyond the scope of our presentation. We therefore selected gates representing five characteristic cellular subsets, namely gates 1, 3, 6, 12 and 26, to illustrate the possibilities, but also the potential shortcomings, associated with a t-SNE representation. The corresponding box plots are shown in Figure 4.
In general, a clear distinction of one gate from the others originates from a particularly low expression of one antigen within this gate.
We begin with gate 1, a well-separated island containing the great majority of HSC/MPP cells, as defined by the lack, or extremely low expression, of CD38. The expression levels of CD45RA and CD123 show a broad distribution spanning almost the full intensity range. Since AD cells contribute 88% to this population, this gate is predominantly leukemia-related and is compatible with the signature of leukemic stem cells. To relate these findings to the results of Kersten et al. [32], we looked at the expression level of CD45RA and CD123 on the CD34+ cells within this gate and found a greater expression of these antigens on the leukemic cells compared to those from the control samples. The aforementioned investigators examined the potency of CD45RA to specifically discriminate LSC and normal HSC for a better LSC quantification and found that, in comparison to other markers such as CLEC12A, CD33 and CD123, CD45RA was the most reliable antigen. From a clinical point of view, it was interesting to note that CD45RA+ LSC tended to be associated with a more favorable cytogenetic/molecular marker constellation. However, it is important to recognize that the expression of CD45RA in AML is not as straightforward as in the T cell subsets of the immune system, and the functional implications can be quite diverse [33]. With regard to CD123, the study by Testa et al., based on the screening of CD123 expression in various hematopoietic malignancies, shows that this antigen is frequently expressed at high levels not only in AMLs but also in B-ALLs [34]. In an earlier report, they had explored a large set of AML patients and reported that 45% of these patients overexpress CD123 [35]. Similarly, Al-Mawali et al. [36] found that, overall, this antigen was expressed in 37 (97%) out of 38 AML cases analyzed. The median expression of CD123 was 90% (range 21%–99%). Interestingly, co-expression of CD123 on CD34+/CD38− leukemic stem cells was likewise found in 37 (97%) of the 38 AML patients at the time of diagnosis, with the proportion of such cells ranging broadly from 0.0262% to 39.7% (median 0.8164, mean 4.45). These results are in line with our findings regarding the expression pattern of the CD34+/CD38− cells in our gate 1.
Gate 3, on the other hand, has been selected as an example of a cell cluster mainly encompassing CD34+ normal progenitor cells of the GMP subtype, as the great majority of cells show a strong CD38 expression in the presence of CD45RA and CD123. In contrast to this normal signature, the few CD34+ cells from patients with AD that fall into this gate lack or only faintly express CD38, while the intensity for CD45RA and CD123 tends to be stronger in comparison to their normal counterparts.
Gate 6 resides at the edge of the main island, with 93% of its cells originating from AD samples. Since the expression levels of all antigens are above the threshold of detection, the cells are formally classified as GMP. Still, a specific property of gate 6 in comparison to other gates containing GMP-like cells is that the PD-L1 expression level is high, well above the levels in all other gates, and the levels of CD45RA and CD38 are also above the average observed for GMP cells. Furthermore, it is remarkable that these cells have a relatively low CD34 antigen expression and that all antigens display a sharp intensity distribution with low standard deviations. This suggests that there is no ongoing evolution among the cells in this gate. The CD34+ cells of this cluster were to some extent CD38+, indicating a kind of “late” HSC on its way towards an abnormal stage of differentiation. As far as PD-L1 is concerned, our t-SNE based data confirm the results obtained previously in a study focusing on the immunophenotype of T cells in patients with MDS and AML [37]. The mechanisms underlying T cell evasion to immune checkpoint inhibitors in acute myeloid leukemia have recently been elucidated by Gurska et al. [38].
We now take a closer look at the cells in gate 12, where two thirds of the cells originate from AD samples. This gate represents a kind of borderline cell pool regarding the AD samples. In general, the expression level of CD45RA is very low, and the CD34 level is extraordinarily high, with a relatively broad distribution of CD38 expression. While the cells in this gate from the CR patients are unequivocally classified as CMP, this is not possible for the AD samples, as they rather appear to be a mixture of CMP with HSC/MPP. This gate is therefore distinct from most other gates due to its internal shift of the t-SNE intensity between AD and CR samples. Accordingly, the AD cells with a lack or very low expression of CD38 reside more at the left side of this gate, whereas the cells of the CR samples preferentially group around its center. This indicates that the cells undergo an evolution from HSC/MPP when the disease is active towards CMP during remission. The cluster contained within gate 12 is thus a good example of the discriminative strength of t-SNE.
Gate 26 is dominated by CR cells, which make up 68% of its population. It is a rather enigmatic cluster, as this subpopulation of CD34+ cells could not be allocated unequivocally according to the classification scheme described in Figure 3F. Their characterization certainly requires an extended labelling for lymphoid progenitor cells, including antigen markers like CD10, CD7 or CD19. With regard to the leukemic cells contained within this cluster, aberrant marker constellations not related to the canonical scheme are also conceivable. Therefore, starting from our proof-of-principle marker panel, modifications including new monoclonal antibodies are necessary, taking into account the steadily evolving knowledge and discovery of leukemia-related antigens and their co-expression patterns. Within this process, our efforts should be geared towards linking the phenotypical characterization to the molecular signature of the leukemic cells in the sense of a phenotype-genotype linkage. In the context of an antigen-targeted therapy, this could be helpful in defining the most relevant subset, i.e., the leukemic stem cells, within the bulk mass of leukemic cells.
We proceed by drawing some general conclusions from these characteristic examples.
First of all, carefully selected additional markers can discriminate the cells to a deeper level. In that respect, we found a strong correlation between the expression level of the PD-L1 antigen and the percentage of leukemic cells in a particular gate. Considering that PD-L1 is an immunoprotective antigen, one may speculate that by increasing the PD-L1 expression during the evolution from healthy towards malignant, the cells protect themselves with respect to the immune system.
On the other hand, disregarding a relevant marker can leave the cell population within some of the gates unspecified, as has been seen from the example of gate 26. Moreover, since the markers used tend to show a continuous expression on this cell ensemble, only a few distinct islands became apparent in the t-SNE plot. This means that with manual density-based gating, the areas sometimes lack a distinct border, which is reflected by the variability of the box plots for the respective gates. Using more markers that ideally exclude each other should yield a better separation within the t-SNE plot [39], which may improve subsequent gating or even enable the use of more automated density-based gating, such as DBSCAN [40] or HDBSCAN [41].
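To illustrate what automated density-based gating of the t-SNE plane could look like, the following is a minimal sketch of the DBSCAN idea in plain Python. It is a simplified, quadratic-time illustration, not a library implementation and not part of the analysis reported here; the function name `dbscan`, the parameters `eps` and `min_pts`, and the toy coordinates are assumptions.

```python
from math import dist


def dbscan(points, eps, min_pts):
    """Label 2D points by density: -1 marks noise, 0, 1, ... mark clusters.

    A point is a core point if at least min_pts points (itself included)
    lie within distance eps. Clusters grow from core points.
    """
    n = len(points)
    labels = [None] * n

    def neighbors(i):
        return [j for j in range(n) if dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise, may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins the cluster, is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reach = neighbors(j)
            if len(reach) >= min_pts:
                queue.extend(reach)  # j is a core point, so the cluster keeps growing
    return labels


# Two compact toy "islands" and one isolated cell (made-up coordinates).
toy = [(0, 0), (0, 1), (1, 0), (1, 1),
       (10, 10), (10, 11), (11, 10), (11, 11),
       (50, 50)]
print(dbscan(toy, eps=1.5, min_pts=3))
```

On real embeddings with many thousands of cells, the neighbor search would need a spatial index, and the choice of `eps` and `min_pts` strongly affects where gate borders fall when marker expression is continuous.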
Furthermore, in our control samples from patients in CR, the composition of the cell ensemble was similar to our previous findings in normal donors, showing a predominance of the GMP followed by the CMP, HSC and the MEP [42]. Subtle differences may be explained by the fact that in our study, BM samples of patients in CR served as normal controls, as BM from normal volunteers was not available. More specifically, normal hematopoietic cells that express high levels of CD34 and lack CD38 are considered stem cells, whereas those that express low levels of CD34 and high levels of CD38 represent more differentiated progenitor cells [43]. The lack of CD38 on leukemic blast cells is also characteristic of the leukemic stem cell [44].
3.3. Quantification of the t-SNE Representation
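We proceed by asking to what extent a t-SNE based assessment can be quantified. Building on quantitative evaluations proposed previously [45,46], we suggest an analysis in terms of the Pearson correlation coefficient r(A, B), a well-established measure for the similarity of two pictures labelled A and B. The two pictures are composed of N pixels each, with pixel densities $A_j$ and $B_j$ ($j = 1, \dots, N$), respectively. The Pearson coefficient is defined as
$$r(A,B) = \frac{\mathrm{cov}(A,B)}{\sigma_A \, \sigma_B},$$
with the covariance of the two pictures given by
$$\mathrm{cov}(A,B) = \frac{1}{N} \sum_{j=1}^{N} \left(A_j - \bar{A}\right)\left(B_j - \bar{B}\right),$$
and the standard deviation of the pixel densities of picture X (X = A, B) given by
$$\sigma_X = \sqrt{\frac{1}{N} \sum_{j=1}^{N} \left(X_j - \bar{X}\right)^2},$$
where $\bar{X} = \frac{1}{N}\sum_{j=1}^{N} X_j$ denotes the mean pixel density of picture X.
For r(A, B) = 1, the two pictures are identical, and they are maximally different, i.e., their sum picture has a density of zero at all sites, for r(A, B) = -1.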
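The computation of the Pearson coefficient between two density pictures can be sketched as follows. This is a minimal illustration in plain Python, assuming that each picture is obtained by binning the two-dimensional t-SNE coordinates on a common square grid; the function names, the grid resolution and the 1/N normalization (which cancels in r) are assumptions.

```python
from math import sqrt


def density_picture(points, bounds, bins):
    """Bin 2D coordinates into a bins x bins grid, returned as a flat list of counts.

    bounds = (xmin, xmax, ymin, ymax) must be identical for all pictures
    that are to be compared, so that pixel j refers to the same site.
    """
    xmin, xmax, ymin, ymax = bounds
    grid = [0] * (bins * bins)
    for x, y in points:
        ix = min(max(int((x - xmin) / (xmax - xmin) * bins), 0), bins - 1)
        iy = min(max(int((y - ymin) / (ymax - ymin) * bins), 0), bins - 1)
        grid[iy * bins + ix] += 1
    return grid


def pearson(a, b):
    """Pearson correlation coefficient of two pictures with equal pixel count."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n
    sd_a = sqrt(sum((x - mean_a) ** 2 for x in a) / n)
    sd_b = sqrt(sum((y - mean_b) ** 2 for y in b) / n)
    return cov / (sd_a * sd_b)  # undefined for a picture with constant density


# Toy example: two small point clouds on the unit square (made-up coordinates).
pic_a = density_picture([(0.1, 0.1), (0.2, 0.1), (0.9, 0.9)], (0, 1, 0, 1), bins=2)
pic_b = density_picture([(0.1, 0.2), (0.8, 0.9), (0.9, 0.8)], (0, 1, 0, 1), bins=2)
print(pearson(pic_a, pic_b))
```

The value of r depends on the grid resolution: very fine grids produce sparse pictures with low correlations, while very coarse grids blur the island structure.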
The comparison of the two representations of the combined data sets (Figure 2B,C) gives rΣCR,ΣAD = 0.46. Here, ΣAD and ΣCR denote the sum pictures of all AD and all CR samples, respectively.
Based on this value, we evaluate a classification protocol in which the t-SNE representation of a new sample is generated by first merging it with a reference plot composed of sufficiently many samples, which is split up again into the two reference pictures ΣAD and ΣCR plus the contribution from the new sample labelled as N.
When we refer to the sample N including its classification, we label it by NCR or NAD, respectively. In the next step, the Pearson coefficients of N with the two reference pictures rΣAD,N and rΣCR,N are computed.
We have implemented this protocol with the present data set as follows. From our data set, we removed each sample in turn and considered the remaining combined pictures as reference pictures. We then treated the individual sample N as unknown and computed rΣAD,N as well as rΣCR,N. This implementation is represented schematically in Figure 5.
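The leave-one-out scheme can be sketched as follows. This is a minimal illustration, assuming that all per-sample density pictures stem from one common t-SNE embedding and share the same pixel grid, and that a sample is assigned to the group whose reference picture gives the larger r; the function names and the toy pictures are hypothetical.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient of two flat pixel-density lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def sum_picture(pictures):
    """Pixel-wise sum of several pictures defined on the same grid."""
    return [sum(values) for values in zip(*pictures)]


def leave_one_out(samples):
    """samples: {sample_id: (label, picture)} with label "AD" or "CR".

    Returns {sample_id: (r_AD, r_CR, predicted_label)}, where the reference
    pictures are built from all samples except the one under test.
    Each group needs at least two samples.
    """
    results = {}
    for sid, (_, picture) in samples.items():
        reference = {"AD": [], "CR": []}
        for other, (label, pic) in samples.items():
            if other != sid:
                reference[label].append(pic)
        r_ad = pearson(sum_picture(reference["AD"]), picture)
        r_cr = pearson(sum_picture(reference["CR"]), picture)
        results[sid] = (r_ad, r_cr, "AD" if r_ad > r_cr else "CR")
    return results


# Toy pictures with four pixels each (made-up densities).
toy = {
    "a1": ("AD", [5, 1, 0, 0]),
    "a2": ("AD", [4, 2, 0, 0]),
    "c1": ("CR", [0, 0, 3, 5]),
    "c2": ("CR", [0, 1, 4, 4]),
}
for sid, (r_ad, r_cr, predicted) in leave_one_out(toy).items():
    print(sid, round(r_ad, 2), round(r_cr, 2), predicted)
```

The sketch does not rerun t-SNE for each held-out sample; it only removes the sample's contribution from the reference sum pictures.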
The results are the values without parentheses listed in Table 4. For all but one of the 12 control samples, we measure rΣCR,NCR > rΣAD,NCR, with differences up to 0.47 for sample 20. Therefore, only in the case of N = 18 would the sample have been classified as AD, in contrast to its actual classification. We will elucidate the reasons for the wrong classification below.
Regarding the identification of an AD sample, the situation is less clear. While samples 1, 12, 13, 14, and 16 show rΣAD,NAD > rΣCR,NAD and are thus classified correctly, we observe that rΣAD,NAD is just slightly smaller than rΣCR,NAD in samples 10 and 17, but find dramatic deviations from the classification for samples 4 and 11, with r value differences of 0.68 and 0.42, respectively.
In order to investigate how stably the classification works with respect to multiple t-SNE runs, two further runs were performed, and the classification was carried out as described previously. The AD samples were assigned to the same group in all runs as described above. In two out of three runs, all CR samples except N = 18 were identified as CR samples and in the third run, all were classified as CR.
The failure to allocate samples 4 and 11 calls for closer consideration. Although the two patients have different subtypes of AML, the leukemic cells of both show a monoblastic differentiation reflecting a more “mature” subtype not necessarily reflected by a particular CD34/CD38 subset. Since the cells of these misassigned patients represent a more mature type and the patients had only a molecular relapse, it is very likely that they could not be adequately assigned because the leukemic cells were not contained within the CD34+ cell population. For the detection of that kind of subtype, additional markers such as CD33 and CD14 would still be necessary. Instead, the antigen markers used are expected to show expression levels quite similar to those of the CR samples. Since it is of great interest to study how such a misallocation influences the t-SNE representation and the corresponding Pearson coefficients, we removed patients 4 and 11 from the ensemble and repeated the quantitative analysis. The modified t-SNE plot in comparison to the plots of these two patients, shown in Figure 6, illustrates the dissimilarity of the density distributions. The obtained r values are given in Table 4 in parentheses. First, we notice a striking decrease of rΣAD,ΣCR by 0.22. Apparently, these two samples have been responsible for a significant similarity between the two t-SNE representations, again indicating that samples 4 and 11 generate a pattern that resembles CR samples more than AD cases. Second, for all control samples, the values of rΣAD,NCR improve, some of them dramatically; e.g., for patient 9, r drops from 0.24 to -0.05. Third, however, we also observe some effect on the rΣAD,NAD values, which change by no more than 0.19: they increase only for patients 1 and 13 but decrease for the remaining cases. This impressively shows how the lack of a relevant marker for clear characterization can lead to false similarities and thus impede the classification. It is therefore conceivable that with each additional diagnostically relevant marker the characterization becomes better, the t-SNE image becomes more differentiated and thus the classification becomes more reliable.
As an evaluation of this proposed identification protocol, we note that the values of rΣCR,NCR are large, a fact which quantifies the high similarity of the cell population in the CR stage. They are furthermore significantly larger than rΣAD,NCR, and we can thus conclude that the state of CR is safely identified and clearly distinguished from the AD state. Furthermore, it can be characterized by a single number with the t-SNE based protocol, namely by rΣCR,N. The identification of an AD case, however, has remained ambiguous. All values for rΣAD,NAD and rΣCR,NAD are close to zero, with correspondingly small differences which in some cases would even indicate a remission. This situation reflects, in our opinion, the heterogeneity of the considered AD cases. Since these samples generate widely varying t-SNE patterns, they have relatively low r values, and if such a reference pattern is compared with a new sample, a t-SNE based identification is ambiguous if not impossible. On the other hand, we have seen from the example of patients 4 and 11 how the t-SNE based identification can be improved by taking blasts of a more “mature” type into account. We therefore expect that sufficiently differentiated t-SNE reference maps for AD subtypes will also allow the unique identification of an AD as well as its predominant blast population. To be on the safe side, we estimate that t-SNE subtype reference maps should be constructed from at least ten samples.