Introduction
Overview of MRI and NMR
Magnetic Resonance Imaging (MRI) is widely used in clinics for identifying diseases and disorders in the human body, offering high accuracy and safety compared with X-ray and other ionizing radiation techniques. MRI-related research has been ongoing since the 1970s 1–4. Additionally, various magnetic imaging techniques have emerged, finding applications in multiple research fields and clinical settings. It is important to note that MRI, along with its newer variants, fundamentally relies on Nuclear Magnetic Resonance (NMR) principles 2.
NMR Data Acquisition
In NMR, 1D 1H NMR is the most commonly used method for data acquisition. This technique generates one-dimensional NMR data representing proton signals in the time domain. When a sample is exposed to radiofrequency (RF) radiation from an NMR spectrometer, the protons in the sample (which behave like tiny magnets) absorb energy from the RF radiation. This energy manifests as vibrations, causing each proton to 'vibrate' or 'spin' at a particular speed, referred to as its 'frequency.'
When the RF field is turned off, the protons return to their original state, cease vibrating, and release the energy they absorbed. This emitted energy is what we detect and record as a signal. Each signal has its own unique frequency, allowing us to distinguish it from other signals.
The need for pre-processing in NMR data
Since time domain NMR data combine all of a sample's signals into a single decaying waveform, several pre-processing steps are necessary to identify the signals present before conducting data analysis. These pre-processing steps are performed in both the time domain and the transformed domain, using mathematical transformations such as the Fourier transform (FT). The FT separates the data by frequency by integrating over time, generating NMR data in the frequency domain. These data can then be displayed as a spectrum, with each signal represented as a peak or a multiplet (a group of overlapping peaks).
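As a minimal illustration of this transformation (the spectral width, signal frequencies, and decay constants below are arbitrary values chosen for demonstration, not taken from any real experiment), a synthetic time domain signal can be moved into the frequency domain with numpy's FFT:

```python
import numpy as np

# Simulate a simple FID: two decaying complex sinusoids plus a little noise
sw = 5000.0                      # spectral width in Hz (assumed)
n = 4096                         # number of time-domain points
t = np.arange(n) / sw            # acquisition time axis in seconds

fid = (1.0 * np.exp(2j * np.pi * 800.0 * t) * np.exp(-t / 0.3) +
       0.5 * np.exp(2j * np.pi * -1200.0 * t) * np.exp(-t / 0.2))
fid += 0.01 * (np.random.randn(n) + 1j * np.random.randn(n))

# Fourier transform: decompose the time-domain signal into its frequencies
spectrum = np.fft.fftshift(np.fft.fft(fid))
freq_hz = np.fft.fftshift(np.fft.fftfreq(n, d=1.0 / sw))

absorption = spectrum.real       # real part (absorption) of an ideally phased signal
dispersion = spectrum.imag       # imaginary part (dispersion)
```

Each signal now appears as a peak at its own frequency, as described above.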
Overview of the review
While our previous review focused on time domain pre-processing steps (https://www.preprints.org/manuscript/202310.2032/v1), this review delves into frequency domain pre-processing, shedding light on the remaining aspects of the NMR pre-processing pipeline. The following sections will be covered:
- (1) Identification of influential frequency domain pre-processing steps
- (2) Software products for frequency domain pre-processing
- (3) Rationales, algorithms, general comments, and suggestions for each pre-processing step
- (4) Summary and conclusions
By examining each of these sections, we aim to provide a comprehensive understanding of pre-processing for NMR frequency domain data.
Brief overview of pre-processing steps in the NMR frequency domain
In the following sub-sections, we provide a list of the most influential frequency domain pre-processing steps, although their order may vary in different software.
Phase error correction
Phase in NMR signals tells us about the position or timing of the signal. A phase error occurs when the measured phase does not match the true phase. Correcting phase errors is crucial because they can seriously distort the data, particularly in the frequency domain.
Baseline correction
Uncorrected or poorly corrected direct current (DC) offsets in the time domain can bias intensities away from a flat baseline in the frequency domain. These baseline biases, which can also result from phase errors, need to be corrected.
Solvent filtering
Eddy currents are circular electric currents within materials that can distort NMR signals by affecting magnetic fields. A prominent, intense solvent peak often captures most of this distortion, and it is typically removed during data analysis.
Calibration and alignment
To make NMR spectra comparable across different spectrometers, frequencies are expressed in parts per million (ppm) using the ratio of a signal's frequency to the spectrometer's frequency. Calibration sets the internal reference signal's ppm to zero by shifting the entire spectrum, while alignment adjusts peak chemical shifts to ensure the same signal aligns at the same ppm in different spectra.
Reference deconvolution
In reference deconvolution, the internal reference signal is reshaped to resemble a Lorentzian line, a specific mathematical function that describes the shape of an ideal peak. This transformation is also applied to all other signals to standardize their appearance as Lorentzian lines.
Binning/bucketing and peak picking
Binning or bucketing involves dividing a spectrum into fixed-width ranges. In contrast, peak picking, also known as intelligent binning, first identifies peaks and then groups them into ranges.
Peak fitting/deconvolution and compound identification
In this step, our goal is to identify molecules from the signals or "peaks" in the data. To achieve this, we employ two processes. Peak fitting involves precisely defining each peak's characteristics, such as shape, location, and intensity. Deconvolution, on the other hand, untangles overlapping peaks, allowing the separation of different molecules' contributions.
Integration and quantification
We use a process called integration to sum the 'intensities' within a specific range of each peak. This aids in quantification, where we determine concentration based on the peak's area under a curve, allowing us to assess the concentration of each molecule in the sample.
Normalization and transformation
Normalization is a technique used to standardize data for comparability, while transformation aligns data with specific necessary statistical assumptions.
The above nine frequency domain pre-processing steps are the most influential in current NMR practice. In the next section, we will present commonly used software products that provide these pre-processing steps.
Understanding NMR pre-processing steps in the frequency domain
In this section, we will delve deeper into the pre-processing steps in the frequency domain. Our aim is to demystify each NMR frequency pre-processing step by providing a detailed explanation and suggestions.
Understanding and correcting phase errors
Raw NMR signals in the time domain are complex numbers representing nuclei's energy changes along two orthogonal directions. After transformation into the frequency domain, they remain complex numbers, with the real part called absorption and the imaginary part called dispersion. Figure 1A shows simulated absorption values with three sharp, concentrated peaks in an absorption spectrum, while Figure 1B displays simulated dispersion values with a different pattern. Phase, calculated as $\phi = \arctan(\text{dispersion}/\text{absorption})$, indicates the relationship between absorption and dispersion, and its corresponding plot is shown in Figure 1C. Figures 1A-C represent ideal signals with no phase errors; thus, Figure 1D shows a phase error plot with all values at 0. However, NMR data always contain phase errors, which can cause significant changes in the absorption, dispersion, phase, and phase error plots, as shown in Figures 1E-H. Data analysis based on peak locations and areas under curves in the absorption plot (Figure 1E) is therefore unreliable, since it differs so much from the true peaks without phase error (Figure 1A). Consequently, phase error correction has to be done before any data analysis.
Unfortunately, phase error correction is one of the most challenging pre-processing steps in the frequency domain. Current NMR phase error correction approaches mainly rely on a simple linear model applied to the entire spectrum 13. This model searches for the intercept (zero-order parameter) and slope (first-order parameter) through an optimization process, and different algorithms employ various optimization functions 14–28. This simple linear model approach cannot deal with the non-linear phase errors shown in Figure 1H. Although a more complex model with higher-order terms might yield slightly better results 22,23, both models struggle to correct all phase errors. This is why manual phase error correction is still employed in recent research 29–31. However, manual phase error correction relies heavily on individual experience, leading to inconsistencies and a lack of inter-user reliability.
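For reference, the conventional single-model approach can be sketched as follows. The zero-order and first-order parameters are searched by minimizing some objective function; the criterion used here (penalizing negative absorption values) is only one of the many optimization functions cited above, and the normalized index scale is an assumption for illustration:

```python
import numpy as np
from scipy.optimize import minimize

def apply_phase(spectrum, phi0, phi1):
    """Apply zero-order (phi0) and first-order (phi1) phase correction, in radians."""
    x = np.arange(spectrum.size) / spectrum.size      # normalized index, 0..1
    return spectrum * np.exp(-1j * (phi0 + phi1 * x))

def phase_objective(params, spectrum):
    """Penalize negative absorption values (one of many possible criteria)."""
    real = apply_phase(spectrum, *params).real
    neg = real[real < 0.0]
    return np.sum(neg ** 2)

def auto_phase(spectrum):
    """Search a single intercept/slope pair for the whole spectrum."""
    result = minimize(phase_objective, x0=[0.0, 0.0], args=(spectrum,),
                      method="Nelder-Mead")
    phi0, phi1 = result.x
    return apply_phase(spectrum, phi0, phi1), phi0, phi1

# Usage (assuming `spectrum` is a complex frequency-domain array):
# phased, phi0, phi1 = auto_phase(spectrum)
```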
The key issue with current phase error correction approaches is the application of a single model to the entire spectrum. This strategy assumes that phase errors are correlated only with frequency or related scales such as the data-point index. However, such a simple model cannot deal with non-linear phase errors, which are related not only to frequency but also to signal peak area. Since each signal peak has its own frequency and peak area, it is reasonable to assume that phase errors are more highly correlated within a signal than among signals. Therefore, we propose the use of a mixed model instead of a fixed linear model, while still utilizing all optimization functions:

$$\phi_{ij} = \beta_0 + b_{0j} + \beta_{1j} f_j(i) + \varepsilon_{ij}$$

- $\phi_{ij}$: the phase value to be added to make the phase correction at index i that belongs to a given signal j
- $\beta_0$: a global fixed-effect intercept
- $b_{0j}$: a random intercept for the jth signal
- $\beta_{1j}$: a fixed slope of an index function for the jth signal
- $f_j(i)$: a pre-defined function that depends on index i and belongs to the jth signal, which could be linear or nonlinear
- $\varepsilon_{ij}$: a random error at the ith index that belongs to the jth signal
Alternatively, if achieving convergence with a mixed model proves challenging, we can consider using one linear phase error correction model for each signal separately.
A more efficient way to deal with all types of phase errors, whether constant, linear, or non-linear, is to start with phase error-free data such as magnitude and power spectra, which in theory should not contain any phase errors, and then derive the phase error-free absorption spectrum shown in Figure 1A.
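Magnitude and power spectra are obtained directly from the complex frequency-domain data and are invariant to phase. A minimal sketch (the random input stands in for real complex spectrum data; deriving an absorption spectrum from these quantities is the separate step proposed above and is not shown here):

```python
import numpy as np

# `spectrum` stands in for complex frequency-domain NMR data (hypothetical example)
spectrum = np.fft.fft(np.random.randn(1024) + 1j * np.random.randn(1024))

# Multiplying the data by exp(-i*phi) leaves |S| and |S|**2 unchanged,
# so magnitude and power spectra contain no phase errors by construction.
magnitude = np.abs(spectrum)   # sqrt(absorption**2 + dispersion**2)
power = magnitude ** 2         # power spectrum
```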
Comment: Phase error correction is crucial in NMR data pre-processing. The current single fixed model often fails to correct all peaks. We propose a linear mixed model for better correction, considering spatial correlations within each signal. Alternatively, you can use a linear model for each signal. The most efficient approach, however, is deriving phase error-free absorption spectra from magnitude and power spectra.
Baseline correction techniques
Baseline distortion refers to a non-flat and non-zero baseline, primarily caused by uncorrected DC (direct current) offsets and phase errors. Baseline correction involves estimating the baseline bias and subtracting it from the spectrum data. Various algorithms exist for estimating baseline bias, and here are some examples:
1). Iterative polynomial fitting 13,30,32,33
2). Robust estimation procedure 32
3). Locally weighted scatter plot smoothing 32
4). Asymmetric least squares regression with penalized least square approach 32
5). B-spline fixed or mixed model with or without penalization 32
6). Continuous wavelet transform 34
Baseline bias estimation in all these algorithms is based on regions without signals 13. Most baseline correction methods are automated, although manual baseline correction methods also exist 35.
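A minimal sketch of the shared idea behind these algorithms is to fit a smooth curve only to points judged to be signal-free and subtract it. The polynomial order and the quantile used to pick "signal-free" points below are arbitrary assumptions for illustration, not a recommended setting:

```python
import numpy as np

def polynomial_baseline(intensities, order=3, quantile=0.5):
    """Estimate a baseline by fitting a polynomial to presumed signal-free points."""
    x = np.arange(intensities.size)
    # Crude signal-free mask: keep only points below an intensity quantile
    mask = intensities < np.quantile(intensities, quantile)
    coeffs = np.polyfit(x[mask], intensities[mask], deg=order)
    return np.polyval(coeffs, x)

# Usage (assuming `absorption` is a phased real spectrum):
# baseline = polynomial_baseline(absorption)
# corrected = absorption - baseline
```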
Comment: Baseline bias primarily results from DC offsets and phase errors, which should be addressed in earlier processing steps. If these issues remain uncorrected, baseline correction may become necessary. However, it's essential to note that baseline correction itself can introduce distortion and bias to the data as it is intertwined with noise modelling. Additionally, baseline correction can modify the intensity variance.
Let's denote s as the intensity variable of a spectrum and b as the baseline variable. After baseline correction, the intensity and its variance become $s - b$ and $\mathrm{var}(s - b)$, respectively. Furthermore, we have the relationship:

$$\mathrm{var}(s - b) = \mathrm{var}(s) + \mathrm{var}(b) - 2\,\mathrm{cov}(s, b)$$

In the simplest scenario where $\mathrm{cov}(s, b) = 0$, we find:

$$\mathrm{var}(s - b) = \mathrm{var}(s) + \mathrm{var}(b) \geq \mathrm{var}(s)$$

This indicates that the baseline correction process actually increases the variance of the intensity when the baseline is not constant and not strongly correlated with the intensity.
Even when baseline correction is performed with a constant value c, if you require the ratio of a signal's area (x - c) to an internal reference's area (y - c) for quantification, it's crucial to be aware that the ratio (x - c)/(y - c) might be significantly altered and may not correlate with the original ratio (x/y) at all (see Figure 2A: after baseline correction; Figure 2B: before baseline correction, with a shift in the ratio). Therefore, caution is warranted when utilizing baseline-corrected data for quantification.
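A small numerical illustration of this distortion, using made-up areas rather than values from Figure 2:

```latex
% Hypothetical areas: signal x = 12, reference y = 4, constant correction c = 2
\frac{x}{y} = \frac{12}{4} = 3
\qquad\text{but}\qquad
\frac{x - c}{y - c} = \frac{12 - 2}{4 - 2} = 5
```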
Solvent filtering methods
In cases where eddy current effects remain uncorrected in the time domain, and phase error correction does not rectify the resulting distortion, solvent peaks can become severely distorted. To address this issue, several commonly employed methods for solvent filtering are as follows:
1). Subtracting a solvent-only time domain data, known as the Free Induction Decay (FID), from the experimental FID of the biological or chemical sample. Subsequently, the solvent-filtered FID is transformed into the frequency domain, serving as one strategy for eddy current correction in the time domain.
2). Creating a pseudo solvent-only Free Induction Decay (FID) involves isolating the data within the solvent peak range from the frequency spectrum, setting all other data points to zero, and then transforming it into the time domain. This pseudo FID can then be processed using the method described in 1) 36.
3). Employing specialized filters that target the solvent's frequency range to eliminate the solvent signal 36.
4). Integrating solvent peak removal with baseline correction in the frequency domain 36
5). Zeroing out data points within the solvent peak data range or setting them to baseline values to effectively remove the solvent peaks.
6). Utilizing wavelet transformation as a means to remove the solvent signal 37.
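As an illustration of approach 5) above, and of the first move in approach 2), the points inside an assumed solvent window can simply be zeroed out. The window boundaries below (around the residual water signal near 4.7 ppm) are hypothetical and would depend on the solvent and experiment:

```python
import numpy as np

def zero_solvent_region(spectrum, ppm, ppm_low=4.6, ppm_high=4.9):
    """Zero out points inside a solvent window (e.g., residual water near 4.7 ppm)."""
    filtered = spectrum.copy()
    solvent_mask = (ppm >= ppm_low) & (ppm <= ppm_high)
    filtered[solvent_mask] = 0.0
    return filtered, solvent_mask

# For approach 2), the complementary operation builds a pseudo solvent-only spectrum:
# solvent_only = np.where(solvent_mask, spectrum, 0.0)
# pseudo_fid = np.fft.ifft(solvent_only)   # transform back to the time domain
```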
Comment: If solvent peak distortion is not addressed during eddy current correction in the time domain and phase error correction in the frequency domain, it becomes necessary to filter out solvent peaks for further analysis. However, it is important to note that filtering solvent peaks may also inadvertently remove some true signals from their neighbouring components.
Methods and considerations for calibration and alignment
Calibration shifts a whole spectrum so that an internal reference peak sits at 0 ppm, which is also called global alignment 13. In contrast to calibration, (local) alignment aligns each peak across a group of spectra to the same ppm position 13. Calibration can be applied without alignment; however, alignment should not be applied before calibration.
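A minimal calibration sketch, assuming the internal reference (e.g., TSP or DSS) gives the tallest peak inside a small search window around 0 ppm; the window width is an assumption for illustration:

```python
import numpy as np

def calibrate(ppm, spectrum, search_window=(-0.3, 0.3)):
    """Shift the ppm axis so the internal reference peak sits exactly at 0 ppm."""
    lo, hi = search_window
    window = (ppm >= lo) & (ppm <= hi)
    ref_index = np.argmax(np.where(window, spectrum, -np.inf))
    return ppm - ppm[ref_index]    # shifted ppm axis; intensities are unchanged

# Usage:
# ppm_calibrated = calibrate(ppm, absorption)
```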
The following are example methods for alignment:
1). Internal correlated shifting with icoshift 38
2). Correlation optimized warping 39
3). Peak alignment by beam search 40
4). Fuzzy warping (Wu et al.) 41
5). Hierarchical cluster based peak alignment 40
More alignment methods can be found in the alignment review article 40.
During the alignment process, the distance between two neighbouring peaks might be increased or decreased; the latter can affect peak areas and quantification 32. Therefore, it has been suggested that quantification be performed on unaligned spectra to avoid this possible problem 32.
Comment: We recommend calibration, provided the internal reference is phased well, but we do not recommend (local) alignment.
Improving spectral lineshape: reference deconvolution techniques
Reference deconvolution involves using a reference signal to remove lineshape distortion from a whole spectrum. One commonly used approach for reference deconvolution is called FIDDLE: Free Induction Decay Deconvolution for Lineshape Enhancement 42–44. The process consists of the following sub-steps:
- (1) Select the reference signal data points from a frequency spectrum and set the rest of the data points to 0, creating a reference-only spectrum (Aref).
- (2) Transform Aref into the time domain to obtain the reference-only FID (FIDref).
- (3) Use simulation to deconvolve FIDref and obtain an ideal reference FID with a Lorentzian lineshape (FIDideal_ref).
- (4) Calculate a ratio variable for each time point in the time domain: ratio(t) = FIDideal_ref(t) / FIDref(t).
- (5) Multiply each time point of the original FID by the corresponding ratio variable to obtain the corrected whole FID: FIDcorrected(t) = FID(t) × ratio(t).
- (6) Transform FIDcorrected back into the frequency domain to obtain a reference-deconvoluted whole spectrum with an ideal lineshape.
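A hedged Python sketch of these sub-steps, assuming `spectrum` is a complex frequency-domain array, `ref_mask` is a boolean mask covering the reference signal, and the ideal Lorentzian is generated on-resonance with a user-supplied linewidth; small denominators are handled naively and a real implementation would be more careful:

```python
import numpy as np

def fiddle(spectrum, ref_mask, ideal_width_hz, sw_hz, eps=1e-8):
    """Sketch of FIDDLE-style reference deconvolution."""
    n = spectrum.size
    t = np.arange(n) / sw_hz

    a_ref = np.where(ref_mask, spectrum, 0.0)          # (1) reference-only spectrum
    fid_ref = np.fft.ifft(a_ref)                        # (2) reference-only FID
    # (3) ideal reference FID: a Lorentzian line of width `ideal_width_hz`
    #     corresponds to an exponential decay exp(-pi * width * t) in the time domain
    fid_ideal_ref = np.exp(-np.pi * ideal_width_hz * t)
    ratio = fid_ideal_ref / (fid_ref + eps)              # (4) ratio per time point
    fid_corrected = np.fft.ifft(spectrum) * ratio        # (5) correct the whole FID
    return np.fft.fft(fid_corrected)                     # (6) back to frequency domain
```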
The FIDDLE approach assumes that all signals are distorted in the same proportions as the internal reference signal and all signals have perfect Lorentzian lineshapes. However, these assumptions may not hold true in practice. Additionally, the FIDDLE approach does not work well for multiplets and 2D NMR data.
For 2D NMR data, a commonly used method that deals with a group of spectra is employed. This method not only unifies peaks with the same lineshape but also aligns peaks across spectra 45. This procedure differs from traditional reference deconvolution as it does not rely on an internal reference. Instead, it utilizes a "reference spectrum," often referred to as the "average spectrum," which is estimated using Principal Component Analysis (PCA) from multiple spectra. For each peak, the first principal component (PC1) is calculated and used to represent the peak in the "average spectrum." Ultimately, peaks are aligned across spectra, and their phase values match those in the PC1 "average spectrum."
Although this PCA-based method does not require a Lorentzian lineshape, the alignment process can lead to discontinuous baselines and distort overlapping peaks 46. Additionally, a strong assumption is necessary to set the same phases for peaks across spectra.
Some researchers have combined the FIDDLE and PCA-based methods 46, replacing the Lorentzian lineshape in FIDDLE with the average reference peak lineshape derived from PCA. This combined approach works on groups of spectra in 2D NMR data simultaneously. However, it still assumes that a group of aligned peaks across spectra in the same experiment share the same peak shape and location 46. While this assumption may hold well for Diffusion Ordered NMR Spectroscopy (DOSY) data, it is not suitable for general NMR data.
Interestingly, based on the aforementioned algorithms, we can observe that reference deconvolution primarily addresses lineshape distortion caused by phase errors, which are not corrected in the phase error correction step.
Comment: While this step forces all signals' lineshapes in a spectrum to match that of the reference signal and/or enforces lineshape comparability for a given signal across a group of spectra, it modifies the data under strong assumptions and can unduly influence all downstream data analysis. Therefore, we recommend against applying reference deconvolution.
Binning, peak picking, and intelligent binning
This section focuses on the process of dividing spectra through binning or bucketing techniques and peak-picking methods.
While fixed width binning is commonly used, it has several potential problems that need to be addressed:
- (1) Signals can be split into multiple bins or combined into a single bin, resulting in non-meaningful bin summary data 32.
- (2) Fixed binning is not effective for handling overlapping peaks.
- (3) Bins are not comparable across spectra if alignment issues exist before binning 47.
Intelligent binning, also known as artificial intelligence (AI) binning, offers an automatic peak picking approach that generates more meaningful divided ranges compared to fixed width bins 48. AI binning utilizes techniques such as wavelet transformations, dynamic algorithms, and Gaussian or exponential functions to detect peak edges 32,34,49. This process can be applied to a group of spectra, allowing for small ppm adjustments across spectra 50. In cases where complex computations hinder the application of AI binning to the entire spectrum, it can be applied to each bin after fixed binning, using segments that cover the widest peaks 34.
One of the challenges in AI binning is peak screening or filtering, which involves differentiating between true signal peaks and noise. This step requires the definition of thresholds, taking into account factors such as signal-to-noise ratio and variance. Prior knowledge and manual intervention may also be necessary for effective peak screening 48.
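A minimal peak-picking sketch using scipy; the signal-to-noise threshold and the choice of a signal-free region for estimating noise are placeholders that would need tuning with prior knowledge, as discussed above:

```python
import numpy as np
from scipy.signal import find_peaks

def pick_peaks(absorption, noise_region, snr_threshold=5.0):
    """Detect peaks whose height and prominence exceed a signal-to-noise threshold."""
    noise_sd = np.std(absorption[noise_region])     # noise level from a signal-free slice
    peaks, props = find_peaks(absorption,
                              height=snr_threshold * noise_sd,
                              prominence=snr_threshold * noise_sd)
    return peaks, props

# Usage (assuming the first 200 points contain no signals):
# peak_indices, peak_props = pick_peaks(absorption, slice(0, 200))
```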
Comment: Choosing peak picking or intelligent binning over fixed width binning is highly recommended, provided that careful consideration is given to selecting appropriate thresholds for peak filtering.
Peak fitting/deconvolution and compound identification
The goal of this step is to deconvolute overlapping peaks when needed and to identify compounds, based either on libraries that contain standard spectra with known signals or on prediction of new compounds. Examples of such libraries are the Human Metabolome Database (www.hmdb.ca) and the Biological Magnetic Resonance Data Bank (www.bmrb.wisc.edu).
An example of a simple overlapping peak occurs when two molecules share the same peak. Assuming a Lorentzian function for each molecule's contribution in the absence of overlaps, deconvolution can be accomplished through an optimization process involving a specific loss function, such as the sum of the squares of the differences between the observed and estimated peaks 49. Deconvolution represents one of the most challenging pre-processing steps in the frequency domain, and it involves the application of various statistical models, including linear models and Bayesian hierarchical models 33,51.
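A sketch of this kind of deconvolution for two overlapping peaks, fitting a sum of two Lorentzian lines by least squares; the ppm region, initial guesses, and parameter names are hypothetical:

```python
import numpy as np
from scipy.optimize import curve_fit

def lorentzian(x, amplitude, center, width):
    """Single Lorentzian line with half-width-at-half-maximum `width`."""
    return amplitude * width**2 / ((x - center)**2 + width**2)

def two_lorentzians(x, a1, c1, w1, a2, c2, w2):
    """Sum of two Lorentzian contributions for an overlapping-peak region."""
    return lorentzian(x, a1, c1, w1) + lorentzian(x, a2, c2, w2)

# Usage (assuming `ppm_region` and `intensity_region` cover the overlapping peaks):
# p0 = [1.0, 3.20, 0.005, 0.5, 3.22, 0.005]   # initial amp, center, width for each peak
# params, _ = curve_fit(two_lorentzians, ppm_region, intensity_region, p0=p0)
# contribution_1 = lorentzian(ppm_region, *params[0:3])
# contribution_2 = lorentzian(ppm_region, *params[3:6])
```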
Comment: It's important to acknowledge the complexity of deconvolution and its potential challenges. The subsequent compound identification phase may prove even more demanding. This is because experimental NMR data might undergo processing at different times, be handled by different individuals, or even originate from distinct laboratories compared to the existing NMR library data. As a result, direct comparability may be compromised, and we should remain vigilant about potential issues stemming from this NMR data-library data incompatibility during this step.
Integration and quantification
This step, relying on raw spectra or signals/peaks data, yields a matched compound list with concentrations in a spectrum 30. Integrating signals is conceptually straightforward, but the real challenge lies in intelligently defining signal edges, a task initiated in the peak picking and peak fitting/deconvolution steps. An arbitrary suggestion has been made to use a range of 24 times the signal width for integration 52, which may include unintended signals.
Quantification relies on factors such as a signal's area, the number of nuclei it represents (e.g., protons), a reference signal's area, the reference's number of nuclei, and, notably, the reference's concentration in the specimen. In the absence of an internal reference concentration, alternative methods include external references or electronic references 52. An internal reference generally offers more accurate concentration estimations, unless it interacts with other signals, leading to inaccuracies 48. When access to area is challenging, as in 13C NMR spectra, height may be used instead.
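Putting these factors together, the commonly used relationship for quantification against an internal reference of known concentration can be written as follows, where A denotes integrated area, N the number of contributing nuclei, and C concentration:

```latex
C_{\text{analyte}} \;=\; C_{\text{ref}} \times
  \frac{A_{\text{analyte}} / N_{\text{analyte}}}{A_{\text{ref}} / N_{\text{ref}}}
```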
When multiple signals contribute to the concentration estimation of a compound, the choice can be to select the most stable and isolated peak or to calculate the mean value from multiple signals. In cases of multiple technical replicates, concentration estimation should be based on the mean value across these spectra 30.
At times, rather than using raw spectrum peaks (Figure 3A), Lorentzian lineshape fits (Figure 3B) are employed for quantification 45,46. While fit lines are symmetric and free from random errors, they deviate from the observed spectrum data (Figure 3C), resulting in inaccurate peak areas and compound concentrations. Additionally, if the research goal is to identify significantly different peaks between two groups of spectra, using "error-free" numbers can reduce or underestimate the variance between groups and increase the false positive rate.
Comment: Integration is a straightforward summation process, provided that peak edges have been intelligently defined in previous steps. We advise against using arbitrary multipliers of signal width for integration. The estimation of a molecule's concentration should ideally rely on an internal reference with a known concentration. Using fitted lines for quantification is discouraged to avoid bias and the underestimation of variation. While fitting is valuable for deconvolution and identification, once we have the proportions and edges of overlapping signals, it's best to derive integrals from the raw spectrum itself.
Normalization and transformation
This step aims to make data comparable or suitable for the assumptions needed in subsequent statistical analysis.
Normalization
Normalization makes data comparable; it can be classified into spectrum-wise and location-wise normalization and can involve various approaches.
Spectrum-wise normalization
To make a group of spectra comparable, spectrum-wise normalization methods include dividing a variable, such as peak area, across various locations by the spectrum's total area, scaled by a predefined constant 48. However, this method hinges on the assumption that all spectra possess the same total signal quantity, an assumption that might not hold true in practice. An alternative spectrum-wise normalization method is to employ the area of an internal reference as a normalizing factor 32. In the case of NMR data being binned, the bin area can be utilized as a substitute for the peak area. Among these techniques, reference-based normalization, which relies on the presence of a spike-in internal reference with a known concentration, emerges as the most robust choice.
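Minimal sketches of total-area and reference-based spectrum-wise normalization; the scaling constant and the reference window near 0 ppm are assumptions for illustration:

```python
import numpy as np

def total_area_normalize(spectrum, constant=100.0):
    """Divide every intensity by the spectrum's total area, then rescale by a constant."""
    return spectrum / np.sum(spectrum) * constant

def reference_normalize(spectrum, ppm, ref_window=(-0.05, 0.05)):
    """Divide by the area of the internal reference peak (assumed to sit near 0 ppm)."""
    ref_mask = (ppm >= ref_window[0]) & (ppm <= ref_window[1])
    return spectrum / np.sum(spectrum[ref_mask])

# Each spectrum in a group would be normalized independently before comparison.
```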
In addition to these methods, other less commonly used spectrum-wise normalization techniques also exist. Distribution-based normalization strategies like quantile normalization 54–56, histogram (matching) normalization 32, and spline normalization align data distributions. Quantile normalization orders and transforms values across spectra to achieve uniform distributions. Histogram normalization scales data based on minimum and maximum values from a reference spectrum 57. If you do not have a reference spectrum, the average or median spectrum across a group of spectra should work as well for histogram normalization. Spline normalization fits quantiles from experimental and reference spectra to a smooth cubic spline, which is then used to generate normalized data for the experimental spectrum 55,58. Similarly, the cubic spline can be replaced by Lowess to make Lowess normalization.
Location-wise normalization
Location-wise normalization makes a variable at different locations comparable. Although all methods in this sub-section can also be applied to spectrum-wise normalization, the spectrum-wise methods described above are generally not applicable here.
The simplest location-wise normalization is variable centering with variable mean or median 32. This is to subtract a variable’s mean or median across spectra for the same location and then add a pre-defined constant, which results in the same mean or median among different locations.
Dividing a variable by its mean at the same location across spectra gives level scaling 13,32. Dividing a variable by its standard deviation at the same location across spectra gives unit variance scaling (auto-scaling), which makes values at different locations share the same unit standard deviation. Vast scaling adds one more step to unit variance scaling: multiplying the unit variance scaled data by the variable's mean over its standard deviation (the inverse of the coefficient of variation, CV) 32,59. If the square root of the standard deviation is used instead of the standard deviation in unit variance scaling, the result is Pareto scaling. When the range (maximum - minimum values) is used instead of the standard deviation, the normalization method is called range scaling.
The most traditional normalization is standardization, which subtracts the mean and then divides by the standard deviation. However, because all data need to remain positive, direct standardization is not applicable to NMR data, although a variation might work, for example a procedure that subtracts the mean, adds a pre-defined constant, and then divides by the standard deviation.
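Sketches of the location-wise methods above, applied to a matrix `X` with one row per spectrum and one column per location (bin or peak); these follow the conventional definitions, which include mean-centering before scaling:

```python
import numpy as np

def center(X, constant=0.0):
    """Mean-center each location (column) across spectra, then add a constant."""
    return X - X.mean(axis=0) + constant

def level_scale(X):
    """Divide each centered column by its mean."""
    return (X - X.mean(axis=0)) / X.mean(axis=0)

def auto_scale(X):
    """Unit variance scaling: divide each centered column by its standard deviation."""
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

def pareto_scale(X):
    """Like auto-scaling, but divide by the square root of the standard deviation."""
    return (X - X.mean(axis=0)) / np.sqrt(X.std(axis=0, ddof=1))

def vast_scale(X):
    """Auto-scaling followed by multiplication with mean over standard deviation."""
    mean, sd = X.mean(axis=0), X.std(axis=0, ddof=1)
    return auto_scale(X) * (mean / sd)

def range_scale(X):
    """Divide each centered column by its range (max - min)."""
    return (X - X.mean(axis=0)) / (X.max(axis=0) - X.min(axis=0))
```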
Transformation
Transformation makes data more suitable for the assumptions required by a statistical method. It is applied element-wise to each value of an NMR data variable.
The most commonly used transformation is the log transformation, which increases normality and removes heteroscedasticity 32. Note that the log transformation cannot be applied to non-positive numbers, and it is a nonlinear transformation that might cause noise amplification. The G-log (generalized log) transformation uses the formula $g(x) = \log\!\left(x + \sqrt{x^2 + \lambda}\right)$ (one common form), or its variants 60–62. High values are still logarithm transformed without much change from the regular log transformation, while low values or noise are specially transformed to avoid noise amplification problems. This requires prior knowledge of high and low value thresholds to set the parameter λ in the G-log transformation formula 13,48.
Box-Cox transformation is to search for the best power transformation to get the optimal normalizing transformation 63. Its aim is also to decrease the effect of non-normality and remove heteroscedasticity 32.
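Hedged sketches of both transformations: the glog form below is one common variant, and λ would normally be estimated from low-intensity noise rather than fixed; scipy's boxcox searches for the optimal power automatically but requires strictly positive input:

```python
import numpy as np
from scipy.stats import boxcox

def glog(x, lam=1e-6):
    """Generalized log: behaves like log(x) for large x, stays gentle near zero."""
    return np.log(x + np.sqrt(x**2 + lam))

# Box-Cox finds the power transform that best normalizes the data.
# Usage (assuming `positive_intensities` contains strictly positive values):
# transformed, best_lambda = boxcox(positive_intensities)
```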
Comment: Except for internal reference-based area spectrum-wise normalization, all other methods serve for statistical analysis, not molecule quantification. Different normalization methods can significantly impact results 59. Furthermore, location-wise normalization can inadvertently amplify noise or pseudo peaks, and it's essential to note that once data transformation occurs, reverting to the original scale for subsequent analyses is not straightforward.
Discussions and conclusion
In previous sections, we've explored the pre-processing steps listed in Table 1, and Table 2 summarizes our key points. Now, let's delve further into these steps.
Phase error correction: Phase errors are common in NMR data and can have a substantial impact on results. Correcting these errors is undeniably necessary. However, conventional approaches often employ a single model for the entire spectrum, which may leave certain regions uncorrected. To enhance accuracy, we recommend implementing a linear mixed model for phase error correction, exploring multiple models for different signals, or deriving desired spectra from phase error-free magnitude and power spectra.
Baseline correction: Correcting baseline bias can be challenging, especially when significant phase errors remain. Additionally, baseline correction can complicate quantification. Our suggestion is to eliminate direct current offsets in the time domain, ensure thorough phase error correction, and, if possible, skip baseline correction.
Solvent filtering: Solvent filtering deals with distorted solvent peaks caused by the eddy current effect. If solvent peaks can be phased effectively, this step may be skipped. However, if phasing is problematic, removing solvent peaks is preferable, despite the need for vigilance regarding neighbouring peaks.
Calibration vs. alignment: We recommend calibration over alignment since alignment may affect peak areas and subsequent quantification.
Reference deconvolution: Reference deconvolution aims to remove lineshape distortion under the assumption that all peak distortions follow the same pattern as reference distortion. We find this assumption too stringent and advise against reference deconvolution.
Binning/bucketing: Binning is suitable for multivariate data analysis like pattern discovery. However, it's not ideal for molecule identification and quantification. In such cases, peak picking or intelligent binning is a better choice.
Peak deconvolution: Peak deconvolution aids in separating overlapping peaks for compound identification. Peak fitting, using theoretical lineshapes such as Lorentzian, is part of this process. While some researchers use peak fitting lines for quantification, we recommend using Lorentzian-fitted lines to obtain relative ratios for compounds from overlapping peaks and then estimate areas under curves from non-fit spectra for these compounds.
Normalization and transformation: These processes make data comparable and suitable for the required statistical assumptions, particularly beneficial for multivariate data analysis like pattern recognition. However, only internal reference area-based spectrum-wise normalized data is suitable for molecule quantification.
In summary, phase error correction and peak deconvolution are among the most challenging pre-processing steps in the frequency domain. Our overarching philosophy in pre-processing NMR data is to respect the data as much as possible. While each of these steps has its place in NMR practice, not all are suitable for every situation. It's essential to strike a balance between noise reduction and the preservation of valuable biological information. Our approach acknowledges the presence of some noise in processed data, which can be addressed during subsequent statistical analysis, ensuring that meaningful signals are retained. This balanced approach is preferable to overly processed data that risks losing vital information.