3.1. Energy-Resolved Mass Spectrometry and the Survival Yield Technique
ER MS experiments were performed by measuring the MS/MS spectra of the cesium cationised peptides at different excitation voltages ranging from 1.7 V to 2.9 V.
Figure 2 shows the MS/MS spectra of the cyclic and linear peptides at 2.5 V. At this excitation voltage the linear peptide is almost completely fragmented, whereas the cyclic peptide is only slightly fragmented and its precursor ion peak (at 2141.3 m/z) remains the most intense. The major fragment of the linear peptide corresponds to the loss of N₂ (at 2112.8 m/z). This fragment is also observed for the cyclic peptide. As there are no specific fragments for the linear peptide, it is not possible to detect the presence of the linear peptide by visual inspection of the MS/MS spectra of the cyclic peptide samples. However, the difference in excitation energy required to fragment the two peptides can be used to detect the presence of linear peptide. For this purpose, energy-resolved mass spectrometry and, more specifically, the survival yield (SY) technique was used to detect and quantify the relative amount of linear peptide.
The Survival Yield (SY) was calculated at each excitation voltage as the ratio of the precursor ion peak intensity to the Total Ion Current (TIC) [11, 12, 13, 28, 29, 30, 31]:
$$\mathrm{SY} = \frac{I_{\mathrm{precursor}}}{I_{\mathrm{precursor}} + \sum I_{\mathrm{fragment}}} \qquad (1)$$
where $I_{\mathrm{precursor}}$ is the intensity of the precursor ion peak and $I_{\mathrm{fragment}}$ is the intensity of each fragment ion peak obtained from the MS/MS experiment. SY curves were then obtained by plotting SY against the excitation voltage (see Figure 3).
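As an illustration, the SY calculation described above can be sketched in Python. This is a minimal sketch and not the authors' processing code: the function name `survival_yield` and the peak intensities are hypothetical, and the TIC is assumed to be the sum of the precursor and fragment peak intensities.

```python
def survival_yield(precursor_intensity, fragment_intensities):
    """Return SY = I_precursor / (I_precursor + sum of fragment intensities)."""
    total = precursor_intensity + sum(fragment_intensities)
    if total <= 0:
        raise ValueError("total ion intensity must be positive")
    return precursor_intensity / total


# Hypothetical peak intensities (arbitrary units), keyed by excitation voltage:
# voltage -> (precursor intensity, list of fragment intensities)
spectra = {
    1.7: (980.0, [20.0]),
    2.2: (500.0, [400.0, 100.0]),
    2.9: (30.0, [800.0, 170.0]),
}

# One SY value per excitation voltage gives the points of an SY curve.
sy_curve = {
    voltage: survival_yield(precursor, fragments)
    for voltage, (precursor, fragments) in sorted(spectra.items())
}
```

Plotting the values of `sy_curve` against their voltages would give one SY curve of the kind shown in Figure 3.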
Figure 3 shows the SY curves of the pure cyclic and linear cesium cationised peptides, which are sigmoidal. The SY curve of the cyclic peptide is shifted significantly towards higher excitation voltages compared to that of the linear peptide. This is consistent with the MS/MS spectra in Figure 2. The SY curves of the mixtures of cyclic and linear peptides lie between the SY curves of the pure peptides. The position of the SY curves is related to the relative amount of linear and cyclic peptide [13, 27]. The higher the relative amount of cyclic peptide, the closer the SY curve of the mixture is to the SY curve of the cyclic peptide.
The difference between the SY of the mixtures and the SY of the cyclic peptide was plotted against the molar ratio of linear peptide. This plot was made at an excitation voltage of 2.2 V (orange vertical line in Figure 3) because this voltage gave the best results for detecting low amounts of linear peptide.
Figure 4 shows that two different linear relationships are observed: a linear model for molar ratios of linear peptide between 0 and 0.3, and another linear model with a higher slope for molar ratios from 0.3 to 1. The large difference in sensitivity between the two models is attributed to matrix effects caused by ion suppression, which are commonly observed in electrospray sources. The data indicate that the ionisation of the linear peptide is suppressed by the presence of the cyclic peptide. Therefore, for lower proportions of linear peptide, the slope of the calibration model is much lower. Conversely, for linear peptide ratios exceeding 0.3, ion suppression is less pronounced because of the reduced relative abundance of cyclic peptide. This is evidenced by the slope of the calibration curve, which is approximately 2.5 times higher than for lower ratios; in other words, sensitivity is diminished in the low range of linear peptide contamination.
Univariate and multivariate calibration models were calculated for the lower interval of linear peptide molar ratios (i.e., from 0 to 0.3) because the aim of the study is to detect the presence of small traces of linear precursor in samples of cyclic peptide. For this range, we measured the SY of each mixture three times. For comparative purposes, the linear peptide molar ratios of the mixtures were the same as the ones measured by IR microscopy: 0, 0.05, 0.1, 0.15, 0.20 and 0.30.
Figure 5a shows the univariate calibration model at an excitation voltage of 2.2 V (orange vertical line in Figure 3). This regression model was obtained by calculating, at 2.2 V, the difference between the SY of the cyclic peptide and the SY of each calibration standard, and by plotting this difference against the linear peptide molar ratio of the calibration standards.
Figure 5b shows the results obtained by multivariate calibration to quantify the molar ratio of linear peptide by applying the Classical Least Squares (CLS) algorithm. In this case, instead of using the SY at only one excitation voltage, the SY curve as a whole was used. The SY curve of a calibration standard can indeed be described as a linear combination of the SY curves of cesium cationised linear and cyclic peptides obtained from pure samples.[13, 27] A linear combination coefficient ($a$) can then be obtained, which minimizes the sum of squares of residuals ($\mathbf{e}$) of the following expression:
$$\mathbf{SY}_{\mathrm{Mixture}} = a\,\mathbf{SY}_{\mathrm{Linear}} + (1 - a)\,\mathbf{SY}_{\mathrm{Cyclic}} + \mathbf{e} \qquad (2)$$
where $\mathbf{SY}_{\mathrm{Mixture}}$ is the SY curve of the calibration standard, and $\mathbf{SY}_{\mathrm{Linear}}$ and $\mathbf{SY}_{\mathrm{Cyclic}}$ correspond, respectively, to the SY curves of the cesium-cationised linear and cyclic peptides obtained from pure samples (bold symbols denote vectors). The linear coefficients ($a$) obtained for each calibration standard were then plotted against the molar ratio of linear peptide to calculate a univariate regression curve. In this way, the linear coefficient obtained by CLS can be related to the molar ratio of linear peptide of the calibration samples. In the absence of matrix effects due to ionisation efficiency, the slope of this curve should be close to 1. In this case, the slope is only 0.483, much lower than 1. This is because the ionisation of the linear peptide is suppressed in the presence of the cyclic peptide (as already observed in Figure 4 for the 0-0.3 interval of linear peptide molar ratio).
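The CLS step can be illustrated with a short Python sketch. It assumes that the mixture curve is modelled as a·SY_Linear + (1 − a)·SY_Cyclic plus residuals, in which case the least-squares value of a has a closed form. The function name `cls_coefficient` and all SY values are hypothetical and do not come from the measurements.

```python
def cls_coefficient(sy_mix, sy_linear, sy_cyclic):
    """Least-squares coefficient a for sy_mix ~ a*sy_linear + (1 - a)*sy_cyclic.

    Rearranging gives (sy_mix - sy_cyclic) ~ a * (sy_linear - sy_cyclic),
    a regression through the origin with a single unknown.
    """
    if not (len(sy_mix) == len(sy_linear) == len(sy_cyclic)):
        raise ValueError("all SY curves must share the same voltage grid")
    d = [lin - cyc for lin, cyc in zip(sy_linear, sy_cyclic)]
    y = [mix - cyc for mix, cyc in zip(sy_mix, sy_cyclic)]
    denom = sum(di * di for di in d)
    if denom == 0:
        raise ValueError("pure-component SY curves are identical")
    return sum(di * yi for di, yi in zip(d, y)) / denom


# Hypothetical SY curves sampled at the same four excitation voltages.
sy_linear = [0.95, 0.70, 0.30, 0.05]
sy_cyclic = [0.99, 0.95, 0.75, 0.30]

# Synthetic mixture built with a known coefficient, to check the fit.
a_true = 0.25
sy_mix = [a_true * l + (1 - a_true) * c for l, c in zip(sy_linear, sy_cyclic)]

a_fit = cls_coefficient(sy_mix, sy_linear, sy_cyclic)
```

On noise-free synthetic data the fitted coefficient reproduces `a_true`. With real spectra, a would then be regressed against the known molar ratios, as described in the text.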
Both calibration models showed good coefficients of determination, R², with the multivariate model slightly better. Three of the calibration standards initially prepared to be measured by IR microscopy (at molar ratios of linear peptide of 0.1, 0.2 and 0.3) were also measured by ER MS (red circles in Figure 5). The IR standards were diluted 100 times in water/methanol (1:1) and CsCl was added at 180 μM so that they matched the conditions of the calibration standards used for the ER MS calibration models (black squares in Figure 5). The SY values and the CLS coefficients ($a$) obtained for the IR samples are very similar to the ones obtained for the MS samples. These results confirm that the ER MS measurements are reproducible and that the IR calibration standards were correctly prepared.
The limit of detection, LD, was then calculated from the calibration model using the formula:
$$\mathrm{LD} = \frac{3.3\, s_e}{b_1} \qquad (3)$$
where $s_e$ corresponds to the standard deviation of the residuals of the calibration line (also called the standard error) and $b_1$ to its slope. This approach is recommended in LC-MS methods as it gives conservative estimates when the LD is calculated from calibration graphs.[41, 42] The factor of 3.3 is associated with a probability of a false positive decision (type I or α-error) of 0.05 and a probability of a false negative decision (type II or β-error) of 0.05.[43, 44] The LD calculated for the univariate model was 0.053 molar ratio of linear peptide. The LD calculated for the multivariate model was 0.045, slightly better than the univariate model.
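The LD calculation from a calibration line can be sketched as follows. This is a minimal sketch under stated assumptions: the line is fitted by ordinary least squares, the residual standard deviation uses n − 2 degrees of freedom (the text does not specify this), and the function names and calibration data are hypothetical.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit y = b0 + b1*x; returns (b0, b1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


def limit_of_detection(x, y):
    """LD = 3.3 * s_e / b1, with s_e the residual standard deviation."""
    b0, b1 = fit_line(x, y)
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    # Assumption: n - 2 degrees of freedom (two fitted parameters).
    s_e = math.sqrt(sum(r * r for r in residuals) / (len(x) - 2))
    return 3.3 * s_e / abs(b1)


# Hypothetical calibration data: molar ratio of linear peptide vs. response.
molar_ratio = [0.0, 0.05, 0.10, 0.15, 0.20, 0.30]
response = [0.002, 0.021, 0.047, 0.068, 0.088, 0.135]

ld = limit_of_detection(molar_ratio, response)
```

Because the response is regressed on the molar ratio, `ld` is expressed in the same units as the molar ratio.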
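The performance of both calibration models was evaluated by calculating the fit error of each calibration standard as the difference between the predicted value of the linear peptide molar ratio, $(n_{\mathrm{Lin}}/n_{\mathrm{Total}})_{\mathrm{predicted}}$, and the reference value of the linear peptide molar ratio, $(n_{\mathrm{Lin}}/n_{\mathrm{Total}})_{\mathrm{reference}}$. The proportional fit error was calculated as:
$$\%\,\text{fit error} = 100 \times \frac{(n_{\mathrm{Lin}}/n_{\mathrm{Total}})_{\mathrm{predicted}} - (n_{\mathrm{Lin}}/n_{\mathrm{Total}})_{\mathrm{reference}}}{(n_{\mathrm{Lin}}/n_{\mathrm{Total}})_{\mathrm{reference}}} \qquad (4)$$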
Figure 6a shows the fit error of each calibration standard calculated for the univariate and multivariate calibration models. All fit errors were randomly distributed around 0 and smaller than 0.02 in absolute value (except for the standard at 0.3). Figure 6b shows the percentage of fit error for each calibration standard. Both models show similar values of fit error. The percentage of fit error was less than 20% except for the standard at 0.05 molar ratio of linear peptide. The Root-Mean-Square Error of Calibration (RMSEC) was calculated to obtain an average fit error for each calibration model:
$$\mathrm{RMSEC} = \sqrt{\frac{\sum_{i=1}^{n} \left[ (n_{\mathrm{Lin}}/n_{\mathrm{Total}})_{\mathrm{predicted},i} - (n_{\mathrm{Lin}}/n_{\mathrm{Total}})_{\mathrm{reference},i} \right]^2}{n}} \qquad (5)$$
where $n$ corresponds to the number of calibration standards, i.e., 6 × 3 = 18 (six molar ratios measured in triplicate). The RMSEC values calculated for both models were quite similar and slightly better for the multivariate model: 0.015 for the univariate model and 0.013 for the multivariate model.
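The fit-error metrics can be sketched in a few lines of Python. This sketch assumes the percentage fit error is the fit error divided by the reference value (undefined for a reference of 0, returned here as `None`) and that RMSEC divides by the number of calibration measurements n. The function names and the example values are hypothetical.

```python
import math


def fit_errors(predicted, reference):
    """Fit error of each standard: predicted minus reference molar ratio."""
    return [p - r for p, r in zip(predicted, reference)]


def percent_fit_errors(predicted, reference):
    """Fit error relative to the reference value, in percent.

    Returns None where the reference is 0, since the ratio is undefined.
    """
    return [
        None if r == 0 else 100.0 * (p - r) / r
        for p, r in zip(predicted, reference)
    ]


def rmsec(predicted, reference):
    """Root-mean-square error of calibration over n measurements."""
    errors = fit_errors(predicted, reference)
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Hypothetical predicted and reference molar ratios of linear peptide.
reference = [0.0, 0.05, 0.10]
predicted = [0.0, 0.06, 0.09]

errors = fit_errors(predicted, reference)
percent = percent_fit_errors(predicted, reference)
average_error = rmsec(predicted, reference)
```

The example also shows why relative errors grow quickly at low molar ratios: the same absolute error of 0.01 is 20% of 0.05 but only 10% of 0.10.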
The intermediate precision was calculated as the pooled variance of the three replicates measured for each calibration standard.[44] The intermediate precision (expressed as standard deviation) was 1.71·10⁻² for the univariate model and 1.58·10⁻² for the CLS model, showing that precision is slightly improved when the entire SY curve is used to calculate the molar ratio of linear peptide. Although the performance of the univariate and multivariate calibration models is quite similar, the multivariate model performs slightly better in terms of limit of detection, RMSEC and intermediate precision. However, the multivariate model requires the measurement of the entire SY curve and is therefore more time consuming than the univariate model, which only requires a measurement at one excitation voltage.
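A pooled standard deviation over replicate groups can be computed as sketched below. The sketch assumes the usual pooling formula, in which each group's sample variance is weighted by its degrees of freedom. The function name `pooled_std` and the replicate values are hypothetical.

```python
import math
import statistics


def pooled_std(groups):
    """Pooled standard deviation of several replicate groups.

    Each group's sample variance is weighted by its degrees of freedom
    (group size minus one) before taking the square root.
    """
    weighted = sum((len(g) - 1) * statistics.variance(g) for g in groups)
    dof = sum(len(g) - 1 for g in groups)
    return math.sqrt(weighted / dof)


# Hypothetical predicted molar ratios: three replicates per standard.
replicates = [
    [0.04, 0.05, 0.06],
    [0.10, 0.12, 0.14],
]

precision = pooled_std(replicates)
```

With equal group sizes, as in a triplicate design, this reduces to the square root of the mean of the group variances.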
3.2. Mid-Infrared Microscopy
The pure linear and cyclic peptides were measured by IR microscopy.
Figure 7 shows the superposition of the two IR spectra. Each spectrum was normalised to the mean absorbance and baseline corrected. Both spectra show the same absorption bands except for a band in the wavenumber region from 2040 to 2170 cm⁻¹, which is observed only for the linear peptide. This band is characteristic of the linear peptide, since it corresponds to the alkyne (C≡C) and/or azide (N₃) functions, which usually absorb at 2140-2100 cm⁻¹ and 2160-2120 cm⁻¹, respectively.
Several mixtures of the cyclic and linear peptides were measured by IR microscopy. Each mixture was measured in triplicate. For each replicate, 100 spectra were measured and averaged (see Experimental section). Figure 8 shows the average spectrum for each replicate. A total of 33 spectra are shown (i.e., 3 replicates × 11 concentration levels). Figure 8a shows that the spectra are baseline shifted. This is common in IR data because of background variations, mainly caused by scattering. Figure 8b shows the IR spectra corrected by Asymmetric Least Squares.[45, 46] Asymmetric Least Squares (AsLS) is a spectral baseline correction method that separates the baseline from the signal. AsLS is widely used in spectroscopy and chromatography because it distinguishes sharp peaks from smooth baseline components. The advantage of this method is that no prior information about peak shapes or baselines is required. It minimizes an objective function combining weighted residuals and a smoothness penalty. Weights are iteratively adjusted: higher for points below the fitted baseline (assumed to be baseline or noise) and lower for points above it (assumed to be peaks). The parameters are lambda (λ) for smoothness and $p$ for asymmetry. Both parameters have to be tuned to the data at hand and are usually chosen by visual inspection. In AsLS, $p$ usually varies between 0.001 and 0.1, and λ between 10² and 10⁹.[45] In our case, the best baseline correction of the IR spectra was obtained with $p$ = 0.001 and λ = 1·10⁴. These values were used to correct the baseline of all the IR spectra. Figure 8b shows that the preprocessed spectra were successfully baseline corrected.
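The AsLS procedure described above can be sketched with NumPy. This is a minimal dense-matrix sketch of the commonly published iteration (second-difference smoothness penalty, asymmetric reweighting) and not the software used in this work. The function name `asls_baseline`, the fixed number of iterations and the synthetic spectrum are assumptions. For long spectra, a sparse solver would be preferable to the dense solve used here.

```python
import numpy as np


def asls_baseline(y, lam=1e4, p=0.001, n_iter=10):
    """Estimate a smooth baseline by asymmetric least squares.

    lam controls smoothness, p the asymmetry of the weights.
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    # Second-order difference operator, shape (n - 2, n).
    d2 = np.diff(np.eye(n), n=2, axis=0)
    penalty = lam * d2.T @ d2
    w = np.ones(n)
    for _ in range(n_iter):
        # Weighted, penalised least-squares fit of the baseline z.
        z = np.linalg.solve(np.diag(w) + penalty, w * y)
        # Points above the baseline (peaks) get weight p,
        # points below it get weight 1 - p.
        w = np.where(y > z, p, 1.0 - p)
    return z


# Synthetic spectrum: sloping baseline plus one narrow Gaussian band.
x = np.linspace(0.0, 1.0, 200)
true_baseline = 0.5 + 0.3 * x
spectrum = true_baseline + np.exp(-((x - 0.5) / 0.02) ** 2)

baseline = asls_baseline(spectrum, lam=1e4, p=0.001)
corrected = spectrum - baseline
```

In this synthetic case the estimated baseline follows the sloping background while the narrow band is left almost untouched in `corrected`.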
The molar ratio of linear peptide was quantified through univariate and multivariate calibration, as illustrated in Figure 9. The univariate calibration model (Figure 9a) relates the area of the specific peak to the molar ratio of linear peptide. A high coefficient of determination (R² = 0.961) was observed for the univariate model, indicating a strong correlation between the two variables.
For the multivariate model, Partial Least Squares (PLS) regression was applied between the matrix $\mathbf{X}$ (33 × 1725) containing the IR spectra and the vector $\mathbf{y}$ (33 × 1) containing the molar ratio of linear peptide. The model was validated using contiguous-block cross-validation, with 11 data splits (3 samples per split). The optimal PLS model had 5 latent variables and good figures of merit (Figure 9b), with RMSEC = 0.036 and a Root-Mean-Square Error of Cross-Validation, RMSECV = 0.050. The model had practically no bias, and the coefficients of determination, R², for calibration and validation were 0.987 and 0.974, respectively.
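The contiguous-block splitting scheme can be illustrated with a small Python sketch. It assumes that the 33 spectra are ordered so that each block of 3 consecutive samples is left out together. The function names are hypothetical, and `mean_predictor` is only a placeholder standing in for the PLS model.

```python
import math


def contiguous_blocks(n_samples, n_splits):
    """Yield (train_indices, test_indices) for contiguous-block CV."""
    if n_samples % n_splits != 0:
        raise ValueError("n_samples must be a multiple of n_splits")
    size = n_samples // n_splits
    for k in range(n_splits):
        start, stop = k * size, (k + 1) * size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test


def rmsecv(X, y, fit_predict, n_splits):
    """Pool squared prediction errors of all left-out blocks."""
    squared = 0.0
    for train, test in contiguous_blocks(len(y), n_splits):
        preds = fit_predict(
            [X[i] for i in train], [y[i] for i in train], [X[i] for i in test]
        )
        squared += sum((p - y[i]) ** 2 for p, i in zip(preds, test))
    return math.sqrt(squared / len(y))


def mean_predictor(train_X, train_y, test_X):
    """Placeholder model: predicts the mean of the training responses."""
    mean = sum(train_y) / len(train_y)
    return [mean] * len(test_X)


# 33 samples in 11 blocks of 3, as in the validation scheme of the text.
splits = list(contiguous_blocks(33, 11))
```

If the three replicates of a concentration level are stored consecutively, each split leaves out one whole level, so the model is never validated on a replicate of a sample it was trained on.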
Figure 10 shows the fit errors calculated for the calibration standards with the univariate and PLS models. The PLS model displays superior predictive capabilities in comparison to the univariate model. This is clearly observed in Figure 10a, which shows that the fit errors of the univariate model are higher than those of the PLS model. The fit errors of the PLS model were less than 0.07 (with the exception of one sample at $n_{\mathrm{Lin}}/n_{\mathrm{Tot}} = 1$), whereas the univariate model exhibited calibration standards with fit errors reaching up to 0.15. Figure 10b also demonstrates that the PLS model is more predictive in terms of percentage of fit error. For the PLS model, the percentage of fit error of the calibration samples is less than 20%, with the exception of the samples with molar ratios of linear peptide less than 0.15. This is not the case for the univariate model, for which almost all the calibration samples are predicted with higher percentages of fit error.
The RMSEC of the univariate model was 0.062, while the RMSEC of the PLS model was 0.036. This confirms that the PLS model is more predictive than the univariate model. As for ER MS, the intermediate precision was calculated as the pooled variance of the three replicates measured for each calibration standard.[44] The intermediate precision (expressed as standard deviation) was 5.06·10⁻² for the univariate model and 2.63·10⁻² for the PLS model. This shows that precision is clearly improved when the entire IR spectrum is used to calculate the molar ratio of linear peptide. The limit of detection of the univariate model (calculated with Eq. 3) was 0.21. The limit of detection of the multivariate model was calculated as 3.3·RMSEC, which is the multivariate-calibration equivalent of Eq. 3. The resulting limit of detection was 0.12, which is considerably lower than that of the univariate calibration. The PLS model demonstrates superior performance in terms of limit of detection, RMSEC and intermediate precision. It is therefore preferable to consider the full IR spectrum rather than focusing on the specific peak of the linear peptide.