Low-Cost Raman Spectroscopy Setup Combined with a Machine Learning Model for Point-of-Care Applications

Catarina Domingos; Alessandro Fantoni; Miguel Fernandes; Jorge Fidalgo; Sofia Azeredo Pereira

doi:10.20944/preprints202412.0609.v1

Submitted: 06 December 2024

Posted: 06 December 2024


Abstract

The diagnosis of kidney diseases presents significant challenges, including the reliance on variable and unstable biomarkers and the necessity for laboratory tests, which are often expensive and complex. Raman spectroscopy emerges as a promising technique for detecting biomarkers of kidney disease; however, its complexity, high cost and limited accessibility outside clinical contexts complicate its application. Moreover, analyzing Raman spectra, especially from biological fluids like urine, is a challenging and intensive task. In response to these challenges, the development of a portable, simplified and low-cost Raman system offers a practical solution for the analysis of complex biological fluids. The methodology adopted for the system’s development was based on the ‘Starter Edition’ from the OpenRAMAN website. The study of urine fluorescence was an essential step in determining the appropriate laser wavelength for the acquisition of urine spectra, so as to minimize fluorescence interference. The system’s optimization involved two stages: adjusting the laser’s operating temperature, by evaluating its emission spectrum at different temperatures with a spectrometer; and optimizing the acquisition parameters of the software used, by acquiring ethanol spectra to identify the settings that improve spectral quality. The system was validated by acquiring Raman spectra from five different urine samples, demonstrating its consistency and its sensitivity to composition variations in urine samples. Finally, a neural network was designed and trained using methanol and ethanol solutions. The model’s hyperparameters were optimized to maximize its precision and accuracy. This approach explored the model’s potential for classifying Raman spectra.

Keywords: Sensor; point of care; Raman spectroscopy; instrumentation; diagnosis; kidney disease

Subject: Engineering - Electrical and Electronic Engineering

1. Introduction

Raman spectroscopy has become an increasingly valuable and comprehensive tool. The technique stands out for three reasons: it requires no prior sample preparation; it is compatible with complex and aqueous samples, since the Raman signal produced by water is relatively weak and does not interfere with the signal of the other components; and it is non-destructive, so it can be used for diagnosis and for in situ measurements. Thus, Raman spectroscopy allows a complete characterization of samples, including complex biological matrices like urine, consolidating the role of this technology in clinical analysis and in the field of biomedicine [1].

Urine analysis using Raman spectroscopy offers several advantages over conventional methods for detecting kidney diseases. Serum creatinine (sCr) levels and the volume of urine production, known as diuresis, are commonly used for the diagnosis of Acute Kidney Injury (AKI); however, both have limitations. Serum creatinine is a biomarker with poor performance in detecting AKI, since creatinine levels only increase when around half of baseline renal function has been lost. At this stage, AKI has already progressed to Chronic Kidney Disease (CKD) and the diagnosis is considered late. In addition, creatinine concentration fluctuates with muscle metabolism and with variations in extracellular volume, two very unstable parameters in people with AKI. Similarly, urine production is influenced by several factors, including the volume of fluids ingested and the use of diuretic medications [2,3,4].

Therefore, it is crucial to develop new diagnostic approaches that allow the early detection of kidney diseases, enabling rapid intervention and improving the prognosis of the condition. Raman spectroscopy reduces the time required for evaluation and provides comprehensive information on the urine constituents, enabling early detection of molecular markers associated with kidney diseases, all using only a small amount of sample and reducing the need for exhaustive laboratory tests [5].

The central hypothesis we intend to evaluate is the possibility of developing a Point of Care system to perform a fast, accurate and affordable diagnosis of kidney diseases, using the unique characteristics of Raman spectroscopy. If validated, this system could make a significant contribution to improving clinical outcomes and reducing costs associated with the treatment of complications arising from these diseases. It could also make the diagnosis of these conditions more accessible and portable, suitable for use in both clinical and emergency settings, where complex laboratory analyses are unfeasible.

Due to the low intensity of the Raman light scattered by molecules, one of the main challenges of this technique is the proper detection of its signal and obtaining spectra with analyzable peaks. The Raman system must be able to eliminate the intense Rayleigh radiation while amplifying the weak Raman radiation. To achieve this, the excitation radiation needs to have high power, be as monochromatic as possible and have high coherence and stability [6]. Currently, lasers are considered the ideal radiation source for Raman spectroscopy experiments, as they emit highly collimated and directional beams, with high power and coherence, both spatially and temporally [7].

The fluorescence emitted by the sample, or by the impurities it may contain, is one of the aspects that requires particular attention in Raman spectroscopy. If the analyzed material interacts with the incident radiation and emits fluorescence, the bands associated with this type of radiation can interfere with the sample's Raman spectrum. Fluorescence is significantly more intense than Raman radiation, resulting in high-intensity bands that can contaminate the acquired spectrum and complicate the detection of Raman peaks. To minimize fluorescence interference, a preliminary study should be carried out to analyze the sample's fluorescence response to various excitation wavelengths. This approach allows the identification of a spectral region with minimal fluorescence emission, which corresponds to the optimal excitation wavelength for that sample [8].

The laser's emission range, also referred to as its spectral amplitude, plays an important role in the spectral resolution of Raman spectroscopy. When the emission range covers a wide range of wavelengths, the Raman spectral resolution is compromised, making it difficult to interpret the results and correctly assign the vibrational modes of the molecules present in the sample. Therefore, the chosen laser needs to have a wavelength that minimizes the fluorescence emitted by the sample and to be as monochromatic and stable as possible.

Raman spectrum analysis can quickly become exhausting because of the complexity of the spectra and the large amount of information they contain, with the risk of losing important data and overlooking subtle variations in the concentration of biological markers. For this reason, automating the interpretation of urine spectra appears to be a viable solution for their effective analysis and classification. A supervised learning model, based on a neural network, could recognize characteristic patterns in these spectra, aiding the diagnosis of patients with kidney disease. If the supervised model is correctly trained, with sufficient data and information, it can reduce the analysis time, ensure greater consistency in the results and even identify new correlations between urine Raman peaks and the existence of kidney disease.

2. Raman System Development

This work is part of a project whose future goal is the development of a portable optoelectronic system to determine the patient's renal status. This system will combine three approaches to urine analysis; the present article focuses exclusively on the Raman spectroscopy part [9].

The methodology adopted for the development of the Raman system was based on the modular Starter Edition, available on the OpenRAMAN website, with the necessary adaptations to meet the specific needs of this project. The Starter Edition spectrometer was chosen because it is the simplest and most economical edition, suitable to achieve the low-cost criterion of the system we intend to build [10].

2.1. Description of System Components

For simplicity’s sake, the system is divided into its main components: radiation source, optical elements and detector. In addition, we describe the data collection software used. All other important components can be found on the OpenRAMAN site.

Most of the components used were purchased from the supplier Thorlabs. However, some parts of the system needed to be modified or designed from scratch to properly fit the specifics of the system.

In the following figure we present the optical layout of the system developed.

Figure 1. Overview of the developed system, adapted from [9].

2.1.1. Radiation Source

The main factors to take into consideration in laser selection are wavelength and emission range. To identify the optimal wavelength for Raman analysis, urine’s fluorescence emission was quantified under irradiation at various wavelengths, using the FP-8300 spectrofluorometer, from JASCO Corp. One sample of urine was irradiated with three wavelengths, 405 nm, 532 nm and 635 nm. These wavelengths were chosen to evaluate the urine’s fluorescent response across a wide range.

The frequency of the incident radiation is the parameter that most significantly impacts Raman intensity. Higher radiation frequencies result in greater intensity of Raman scattered radiation from the sample. Since frequency is inversely proportional to wavelength, shorter wavelengths lead to an increase in Raman intensity [6,11]. For this reason, wavelengths longer than 635 nm were not evaluated.
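The trade-off described above can be made concrete with the standard textbook approximation that Raman scattered intensity scales with the fourth power of the excitation frequency, that is, with 1/λ⁴. This scaling is general background and is not stated or measured in this work; the Python sketch below (the function name `relative_raman_intensity` is ours) also ignores detector response and resonance effects.

```python
def relative_raman_intensity(wavelength_nm: float, reference_nm: float = 532.0) -> float:
    """Raman intensity relative to a reference excitation wavelength,
    assuming the idealized 1/lambda^4 scaling (an approximation)."""
    return (reference_nm / wavelength_nm) ** 4


# The three excitation wavelengths evaluated in the fluorescence study.
for wavelength in (405.0, 532.0, 635.0):
    ratio = relative_raman_intensity(wavelength)
    print(f"{wavelength:.0f} nm: {ratio:.2f} x the 532 nm signal")
```

Under this approximation, 635 nm excitation would yield roughly half the Raman signal of 532 nm, while 405 nm would yield about three times as much, which is why the fluorescence response has to be weighed against signal strength.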

The urine’s fluorescence response at each wavelength is presented in the following figure, with the x-axis representing the wavelength range and the y-axis the fluorescence intensity.

Figure 2. Urine fluorescence spectra at 3 excitation wavelengths [9].

As shown, urine exhibits significant fluorescence when excited at 405 nm. Consequently, a laser with this wavelength is not suitable for Raman measurements due to extensive peak overlap and interference with the Raman signals. At 532 nm, the fluorescence intensity is lower, starting at 450 nm and rapidly decreasing after 532 nm. Although this fluorescence is still considerable, Raman signals are stronger when radiation sources with shorter wavelengths are used. In contrast, excitation at 635 nm minimizes the fluorescence emitted but reduces Raman signal intensity, making the spectral analysis more complex, especially with low-power lasers.

Considering the advantages and disadvantages of each wavelength, the 532 nm wavelength offers the best balance between minimizing fluorescence interference and maximizing Raman signal intensity. Techniques such as spectral filtering and signal processing will be employed to further reduce fluorescence effects. Several commercially available 532 nm lasers have narrow emission ranges and high power, but they can be very expensive, exceeding the budget of the low-cost sensor we intend to develop.

Considering this analysis, the CPS532 compact low-power laser from Thorlabs was selected as the most cost-effective option for the Raman system. Its compact design and small size make it an ideal candidate for integration into portable systems. Operable within a temperature range of 10 °C to 40 °C, it has a typical power of 4.5 mW.

Due to heat dissipation during operation, the laser can reach high temperatures, potentially compromising its stability and spectral resolution. For this reason, a temperature control system was incorporated into the laser assembly to ensure consistent operating conditions, supporting the laser’s stability and durability and the stability of the acquired spectra.

A custom support structure to accommodate the CPS532 laser and its control system was fabricated using 3D printing/milling processes. A Peltier element in the setup ensures the heat transfer between the laser module and a dissipator.

Figure 3. Detail of the support manufactured to accommodate the laser CPS532 and the temperature control system.

2.1.2. Optical Elements

This system includes a series of filters and mirrors to achieve high-quality and high-resolution Raman spectra. All optical elements were purchased from Thorlabs.

The green radiation emitted by the laser begins its optical path by striking the directional mirror, model PF10-03-G01, which directs the beam towards the dichroic mirror. The dichroic mirror acts as a wavelength selective filter, transmitting and reflecting light according to its wavelength. For this system, the dichroic mirror chosen was DMLP550, with a cut-off wavelength of 550 nm. This mirror reflects radiation below 550 nm, ensuring that the 532 nm laser radiation reaches the sample, and transmits all radiation above 550 nm. Both mirrors are mounted on KM100 kinematic brackets, allowing precise alignment of the laser beam with the sample and the spectrometer’s entrance.

Before reaching the sample, the laser beam passes through a lens integrated into the cuvette holder. This lens focuses the beam onto a small region of the cuvette, concentrating the radiation on a specific part of the sample, and also acts as a collimator for the Raman radiation exiting the sample.

Upon interacting with the sample, both Raman and non-Raman scattered radiation are emitted in all directions. A portion of this scattered radiation is captured by the lens of the cuvette holder and collimated towards the dichroic mirror. The Raman scattered radiation, with wavelengths in the yellow-red range (above 550 nm), passes through the dichroic mirror and is directed to the edge-pass filter.

The edge-pass filter, model FELH0550, removes any residual non-Raman radiation with wavelengths below 550 nm, ensuring that only Raman radiation proceeds to the detector. By slightly tilting the filter, its cutoff wavelength is finely adjusted to extend the Raman spectrum while still blocking the excitation wavelength.
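To relate the filter edge to the Raman spectrum, wavelengths can be converted to Raman shift with the usual relation Δν̃ = 10⁷ (1/λ_exc − 1/λ_scat), with wavelengths in nm and the shift in cm^-1. The Python sketch below applies it to the nominal values quoted in the text (532 nm excitation, 550 nm edge); the function names are ours, and the real edge position depends on the filter tilt and on the actual laser wavelength.

```python
def raman_shift_cm1(excitation_nm: float, scattered_nm: float) -> float:
    """Stokes Raman shift in cm^-1 between excitation and scattered wavelengths."""
    return 1e7 * (1.0 / excitation_nm - 1.0 / scattered_nm)


def scattered_wavelength_nm(excitation_nm: float, shift_cm1: float) -> float:
    """Wavelength (nm) at which a given Raman shift appears."""
    return 1.0 / (1.0 / excitation_nm - shift_cm1 / 1e7)


# Nominal values from the text: 532 nm excitation, 550 nm filter edge.
edge_shift = raman_shift_cm1(532.0, 550.0)
print(f"Shift at the nominal 550 nm edge: {edge_shift:.0f} cm^-1")
print(f"A 3000 cm^-1 band appears near {scattered_wavelength_nm(532.0, 3000.0):.1f} nm")
```

With these nominal numbers, the untilted 550 nm edge corresponds to a shift of roughly 615 cm^-1, which illustrates why adjusting the edge by tilting the filter matters for reaching lower Raman shifts.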

As the radiation passes through the dichroic mirror and the edge-pass filter, it undergoes some horizontal displacement due to the thickness of these components. The beam path is corrected using the WG11050-A compensation window.

After the realignment, the radiation passes through a set of lenses and a slit. The first achromatic lens, AC127-019-A, collects the Raman radiation and focuses it on an S50K slit with a 50 μm aperture, mounted in a CRM1T/M rotation cage. This slit controls the amount of radiation that passes to the subsequent stages and largely defines the spectrometer resolution. A second achromatic lens, AC254-050-A, collimates the light exiting the slit and directs it onto the grating surface.

The beam then reaches the GR25-1205 diffraction grating, with a density of 1200 lines per millimeter. This grating disperses the light into various directions based on its wavelength, allowing the analysis of the radiation spectrum.

Ultimately, the dispersed radiation is focused by an objective on the detector, which records the intensity at each position, corresponding to specific wavelengths in the spectrum.

2.1.3. Detector/Spectrometer

A FLIR Blackfly GigE camera, model BFLY-PGE-31S4M-C, was used as the detector. This choice was based on its full compatibility with the selected data collection software, ensuring straightforward integration of the camera into the developed system. This PointGrey model offers a resolution of 2048 × 1536 pixels with a CMOS-type sensor. The focusing lens chosen to collect the Raman radiation was the 50 mm lens MVL50M23 from Thorlabs.

2.1.4. Data Collection Software

The software chosen for Raman spectra acquisition is Spectrum Analyzer, available on the OpenRAMAN website. This software was designed to operate with the type of system developed, minimizing risks of incompatibility and ensuring consistency in the spectra obtained. Additionally, the software is free and was previously validated for use with the selected camera model [12].

2.2. System Assembly

The assembly process is detailed on the OpenRAMAN website [10].

The baseplate was fabricated from an aluminum plate using computer numerical control (CNC) machining, according to the distances and dimensions specified on the OpenRAMAN platform. The cuvette holder used in the system was the CVH100/M model, chosen for its versatility, as it allows various analysis techniques to be applied to the same sample.

To protect the Raman system instrumentation from external interference, two custom-made covers were designed to enclose the entire optical system. These covers were manufactured using a 3D printer, ensuring effective protection against light and external impurities. Below we present an image of the top view of the developed Raman system without the covers.

Figure 4. Top view of the developed Raman system.

The alignment and calibration of the Raman system were performed following the OpenRAMAN guidelines [10].

2.3. Optimization of System Components

2.3.1. Laser Operating Temperature Optimization

The implementation of the temperature control system also aimed to optimize the emission spectrum of the CPS532 laser. Adjusting the laser's operating temperature can enhance its spectral resolution and narrow its emission range, making it more efficient for Raman analysis.

To evaluate the laser’s behavior at various temperatures, two experiments were conducted. The first focused on understanding the effect of the temperature on the full width at half maximum (FWHM) of the laser peak. The laser spectrum was acquired using the CCS200/M spectrometer and the SMA905 fiber optic cable, which captured and transmitted the laser beam to the spectrometer. The fiber was integrated into the CVH100/M cuvette holder. The spectrometer has its own software, ThorSpectra. After setting up the equipment and connecting the spectrometer to the software, the laser spectrum was acquired at operating temperatures of 20 °C, 25 °C, 30 °C, 35 °C and 40 °C. To prevent signal saturation, all spectra were acquired using an integration time of 0.1 ms.

Figure 5. CPS532 laser spectrum at 20 °C, using the ‘Peak Track’ tool from ThorSpectra.

At an operating temperature of 20 °C, the CPS532 laser spectrum exhibits two peaks: a primary peak at 532.8 nm, with an intensity of 0.90 and a FWHM of 1013.7 pm; and a secondary peak at 537.3 nm, with an intensity of 0.04. The secondary peak is attributed to the use of the SMA905 optical fiber, which, although it introduces this interference, is essential for ensuring the reproducibility and comparability of the results. Notably, the optical fiber is not used for Raman spectra acquisition, so this interference does not affect the quality or accuracy of the spectra obtained with the system.

Subsequently, the laser's operating temperature was varied to 25 °C, 30 °C, 35 °C and 40 °C, with the obtained spectra presented in Appendix A. The following figure illustrates the variation of the FWHM as a function of the laser’s operating temperature.
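The FWHM values here are those reported by ThorSpectra; the source does not describe how the software computes them. As an illustration only, the Python sketch below shows one common way to estimate the FWHM of the main peak from a sampled spectrum, by linearly interpolating the two half-maximum crossings. The function name `fwhm` and the synthetic Gaussian test peak are our assumptions.

```python
import numpy as np


def fwhm(x: np.ndarray, y: np.ndarray) -> float:
    """Full width at half maximum of the tallest peak in (x, y).

    Walks outward from the maximum and linearly interpolates the two
    half-maximum crossings. Assumes x is increasing, the baseline is near
    zero, and both crossings lie inside the sampled range.
    """
    i_max = int(np.argmax(y))
    half = y[i_max] / 2.0

    # Left crossing: last sample at or below half maximum before the peak.
    left = i_max
    while left > 0 and y[left] > half:
        left -= 1
    x_left = np.interp(half, [y[left], y[left + 1]], [x[left], x[left + 1]])

    # Right crossing: first sample at or below half maximum after the peak.
    right = i_max
    while right < len(y) - 1 and y[right] > half:
        right += 1
    x_right = np.interp(half, [y[right], y[right - 1]], [x[right], x[right - 1]])

    return float(x_right - x_left)


# Synthetic Gaussian line centred at 532.8 nm (not measured data).
sigma_nm = 0.3
wavelength = np.linspace(530.0, 536.0, 601)
intensity = np.exp(-0.5 * ((wavelength - 532.8) / sigma_nm) ** 2)
print(f"Estimated FWHM: {fwhm(wavelength, intensity) * 1000:.1f} pm")
```

For a Gaussian line the result can be checked against the analytical value 2·sqrt(2·ln 2)·σ; on real spectra, a secondary peak such as the one at 537.3 nm is ignored because only the tallest peak is tracked.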

Figure 6. Influence of laser operating temperature on the FWHM of its main peak.

Contrary to expectations, the relationship between the CPS532 laser’s operating temperature and its FWHM is non-linear. The FWHM decreases until 30 °C and increases at higher temperatures, peaking at 40 °C, the laser’s maximum operating temperature. At 40 °C, the main peak exhibits instability in shape and a significant intensity reduction, dropping to 0.51. These results indicate that high temperatures negatively affect the laser performance, making operation under such conditions undesirable. At 30 °C, the FWHM of the main peak reaches its minimum value of 649.8 pm, with an acceptable intensity of 0.82. Thus, 30 °C is the temperature that minimizes the laser’s emission range and, consequently, increases the spectral resolution of the Raman spectra.

In the second phase, the spectrum of ethanol (chemical formula: CH₃CH₂OH) was acquired using the developed Raman system, at three different laser operating temperatures: 25 °C, 30 °C and 35 °C. The main objective of this phase was to assess how the laser’s operating temperature influenced the quality and accuracy of Raman spectra.
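To see what these linewidths mean on the Raman shift axis, a narrow wavelength interval Δλ around λ can be converted to wavenumbers with Δν̃ ≈ 10⁷ Δλ / λ² (λ in nm, Δν̃ in cm^-1). The Python sketch below applies this to the two FWHM values reported above; the function name is ours, and the measured widths likely also include the instrument response of the spectrometer, so the numbers are indicative only.

```python
def linewidth_cm1(fwhm_nm: float, center_nm: float) -> float:
    """Approximate spectral width in cm^-1 for a narrow line of width fwhm_nm
    centred at center_nm, using d(nu) ~ 1e7 * d(lambda) / lambda^2."""
    return 1e7 * fwhm_nm / center_nm ** 2


# FWHM values reported in the text for the 532.8 nm main peak.
for temperature_c, fwhm_pm in ((20, 1013.7), (30, 649.8)):
    width = linewidth_cm1(fwhm_pm / 1000.0, 532.8)
    print(f"{temperature_c} degC: {fwhm_pm} pm -> about {width:.1f} cm^-1")
```

Under this approximation, moving from 20 °C to 30 °C narrows the excitation line from roughly 36 cm^-1 to roughly 23 cm^-1, which gives a sense of how much the laser linewidth can broaden each Raman band.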

The ethanol spectrum was acquired with the Spectrum Analyzer software, with the following acquisition parameters:

Table 1. Acquisition parameters used for the acquisition of ethanol spectrum.

Exposure time (s)	Gain (dB)	ROI (px)	Average number of images
31.6	11.5	128	3

The visualization of the ethanol spectra was performed using the SpectraGryph software, version 1.2.

Figure 7. Raman Spectrum of ethanol at laser operating temperatures of 25 °C, 30 °C and 35 °C.

The presence of characteristic ethanol peaks in the spectrum, although not highly intense, demonstrates that the system operates with sufficient precision to capture molecular vibration signals from the sample.

When comparing the three ethanol spectra, the spectrum obtained at 35 °C shows the lowest peak intensities, particularly in zone (4), around 2800 to 3000 cm^-1. Between the spectra obtained at 25 °C and 30 °C, the peaks at 25 °C exhibit slightly lower intensity. Considering that the first optimization phase identified 30 °C as the optimal operating temperature for enhancing laser performance, and given the quality of the ethanol spectrum acquired at this temperature, we concluded that 30 °C maximizes the overall efficiency of the developed Raman system.

2.3.2. Optimization of Spectrum Analyzer Software Acquisition Parameters

The software used allows adjustments of acquisition settings, such as exposure time and average number of images. These adjustments are critical for ensuring stable results and for obtaining high-quality spectra with a strong signal-to-noise ratio (SNR).

For this optimization, the ethanol Raman spectrum was acquired using different acquisition parameters, with the laser operating at 30 °C. Due to time constraints, only three values of each parameter were tested.

In the first stage, the exposure time, which controls the time during which the PointGrey camera "captures" light, was evaluated.
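The source does not state how SNR was quantified. One simple convention, shown in the Python sketch below, is to divide the height of a peak above the baseline by the standard deviation of a peak-free region. The function name, the window limits and the synthetic spectrum are assumptions made for illustration.

```python
import numpy as np


def estimate_snr(shift_cm1, intensity, peak_window, noise_window):
    """Peak height above the noise-region mean, divided by the noise-region
    standard deviation. Both windows are (low, high) Raman-shift bounds."""
    shift_cm1 = np.asarray(shift_cm1, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    in_peak = (shift_cm1 >= peak_window[0]) & (shift_cm1 <= peak_window[1])
    in_noise = (shift_cm1 >= noise_window[0]) & (shift_cm1 <= noise_window[1])
    baseline = intensity[in_noise].mean()
    noise = intensity[in_noise].std(ddof=1)
    return float((intensity[in_peak].max() - baseline) / noise)


# Synthetic spectrum: one Gaussian band plus white noise (not measured data).
rng = np.random.default_rng(0)
shift = np.linspace(200.0, 3200.0, 1501)
spectrum = np.exp(-0.5 * ((shift - 880.0) / 8.0) ** 2) + rng.normal(0.0, 0.01, shift.size)
snr = estimate_snr(shift, spectrum, peak_window=(850, 910), noise_window=(1800, 2600))
print(f"Estimated SNR: {snr:.0f}")
```

Applying the same windows to every acquisition would give a single number per spectrum, which makes comparisons between exposure times or averaging settings less dependent on visual inspection.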

Table 2. Acquisition parameters used to study the impact of exposure time on the quality of the ethanol Raman spectrum.

Name	Exposure time (s)	Gain (dB)	Average number of images
Ethanol 2.1 s	2.1	3.8	2
Ethanol 10 s	10	3.8	2
Ethanol 31.6 s	31.6	3.8	2

It is important to note that the maximum exposure time allowed by the software is 31.6 seconds. Additionally, spectra acquired with exposure times shorter than 2.1 seconds exhibit high noise levels, which made their analysis laborious and unreliable.

Figure 8. Impact of exposure time on the quality of the ethanol Raman spectrum.

The shortest exposure time of 2.1 s resulted in a Raman spectrum with high noise, which interfered with the ethanol peaks by artificially increasing their intensity. However, this increase was solely due to the added noise, compromising the spectrum analysis. The noise may have originated from the instrumentation, including laser or thermal fluctuations or electronic noise. Increasing the exposure time to 10 s significantly reduced the noise, as expected, since a longer acquisition time allows the camera to capture more useful signal while minimizing noise. Further increasing the acquisition time to the maximum value, 31.6 s, caused only subtle differences in noise levels, especially in the range between 1500 cm^-1 and 2800 cm^-1. Although this longer exposure did not substantially improve the ethanol spectrum, it can be beneficial for urine spectra, which tend to have a high noise level. Based on this, we conclude that an exposure time of 31.6 s is optimal for improving Raman spectra quality, especially for urine analysis.

Then, the impact of the number of images acquired and combined to generate the final spectrum was evaluated.

Table 3. Acquisition parameters used to study the impact of the average number of images on the quality of the ethanol spectrum.

Name	Exposure time (s)	Gain (dB)	Average number of images
Ethanol 2 images	10	3.8	2
Ethanol 12 images	10	3.8	12
Ethanol 24 images	10	3.8	24

Figure 9. Impact of the average number of images on the quality of the ethanol Raman spectrum.

The analysis of the obtained spectra reveals that when only 2 images are acquired the noise is significantly higher, particularly in regions without peaks, complicating the analysis of more complex spectra, such as urine spectra. When the number of images is increased to 12, the noise level in these regions decreases considerably, while the intensity of ethanol’s characteristic peaks is preserved. This provides the highest SNR, as it minimizes noise while maintaining distinct peaks. Acquiring spectra with 24 images resulted in a similar noise level in the regions without peaks. However, the intensity of the peaks was considerably reduced. Therefore, we conclude that the average number of images that improves the overall quality of the spectrum while maintaining a high SNR is 12.

The gain parameter, which represents the amplification of the electrical signal generated by the camera’s sensor after capturing the light, was also evaluated. However, the ethanol spectra did not exhibit significant variations in noise or intensity, making it impossible to draw concrete conclusions about its impact.

This study faced limitations, as it was not feasible to explore all possible parameter values. However, the optimal settings to enhance the quality of the ethanol spectrum were an exposure time of 31.6 seconds and an average of 12 images.
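The diminishing return between 12 and 24 images is what one would expect if the noise is uncorrelated between frames: averaging N frames then reduces the noise standard deviation by a factor of sqrt(N). The Python sketch below simulates this idealized case with synthetic white noise; it is not a model of the camera, and it does not account for the drop in peak intensity observed with 24 images.

```python
import numpy as np

rng = np.random.default_rng(42)
n_pixels = 2048      # number of points per simulated frame (illustrative)
noise_sigma = 1.0    # per-frame noise, arbitrary units


def residual_noise(n_frames: int) -> float:
    """Standard deviation left after averaging n_frames noise-only frames."""
    frames = rng.normal(0.0, noise_sigma, size=(n_frames, n_pixels))
    return float(frames.mean(axis=0).std(ddof=1))


for n in (2, 12, 24):
    print(f"{n:2d} frames: measured {residual_noise(n):.3f}, "
          f"expected {noise_sigma / np.sqrt(n):.3f}")
```

In this idealized picture, going from 2 to 12 images cuts the noise by a factor of about 2.4, whereas doubling again to 24 images only gains a further factor of about 1.4.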

2.4. Role of the Laser’s Power in Raman Spectrum

Keeping the overall hardware configuration unchanged, we tested our system with a new laser with higher power, exceeding 80 mW, and a narrow spectral linewidth (nominal FWHM = 0.003 nm), manufactured by Frankfurt Laser Company, model FPYL-532-80T-LN-S. To evaluate its performance compared to the previously used laser, we acquired the ethanol spectrum under similar conditions.

Figure 10. Comparison of the ethanol spectrum obtained with two different lasers.

The differences between the spectra are notable, with the higher-powered laser providing a substantial enhancement in the intensity of the Raman peaks, lower noise and shorter acquisition times. Using the CPS532 laser, the maximum intensity achieved was only 0.035. In contrast, the FPYL-532 laser allows intensities of up to 4, representing a remarkable improvement. This enhancement directly improves the quality and clarity of the obtained data, highlighting the critical role of laser power in Raman spectroscopy analyses.

However, this laser is significantly more expensive than the one used in this system, making it unsuitable for low-cost approaches.
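As a rough sanity check on these numbers, the quoted powers and peak intensities can be compared directly. The short Python calculation below uses only values stated in the text; since the second laser is specified as exceeding 80 mW and the acquisition conditions were similar rather than identical, the ratios are indicative only.

```python
# Values quoted in the text.
cps532_power_mw = 4.5    # typical power of the CPS532
fpyl_power_mw = 80.0     # lower bound: the text says "exceeding 80 mW"
cps532_peak = 0.035      # maximum ethanol peak intensity with the CPS532
fpyl_peak = 4.0          # "up to 4" with the FPYL-532-80T-LN-S

power_ratio = fpyl_power_mw / cps532_power_mw
intensity_ratio = fpyl_peak / cps532_peak

print(f"Power ratio:     at least {power_ratio:.1f}x")
print(f"Intensity ratio: about {intensity_ratio:.0f}x")
```

The peak intensity increases by about 114 times while the nominal power increases by about 18 times. One plausible contributor to the difference is the much narrower linewidth, which concentrates each Raman band into fewer detector pixels, but the source does not quantify this.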

3. Urine Spectrum Acquisition

To validate the developed system for urine spectra acquisition, its performance was evaluated in this context. The focus of this evaluation was not to identify the biomarkers present in the urine, nor to conduct their detailed analysis, but to ensure that the Raman system is able to acquire urine spectra with precision and consistency. For this purpose, five samples of urine were used. The acquisition parameters used are described in the table below.

Table 4. Acquisition parameters used for the acquisition of urine spectra.

Exposure time (s)	Gain (dB)	ROI (px)	Average number of images
31.6	1.5	128	10

The visualization of the urine spectra obtained was performed using the SpectraGryph software, version 1.2.

Figure 11. Raman spectrum of the five urine samples.

By analyzing the obtained spectra, it is evident that all five samples exhibit consistent peaks around 415 cm^-1 and 735 cm^-1. Little information on the 415 cm^-1 peak is available in the literature; however, it is likely associated with the presence of glycogen in urine, which typically exhibits a Raman peak around 480 cm^-1 [13]. The 735 cm^-1 peak is probably related to creatinine, although its usually reported values are around 640 cm^-1 and 700 cm^-1 [5,14,15,16]. Additionally, a broader and less intense peak is observed between 990 cm^-1 and 1090 cm^-1, likely associated with urea, which typically presents Raman peaks around 1000 cm^-1 and 1006 cm^-1 [5,14,15]. The intensities of these peaks varied across samples, demonstrating the system’s sensitivity to differing compound concentrations in urine and its consistency in the acquisition of the spectra. Overall, the intensity of the peaks remained relatively low, with noise present throughout the spectra. Enhancing peak intensity and reducing noise would require a laser with a narrower linewidth and greater power.

This analysis demonstrates that the Raman system developed is both consistent in its spectra acquisitions and sensitive to variations in compound concentrations within a complex matrix. These results reinforce the system's potential to detect changes in urine composition, essential for its application in future diagnostic analyses.

4. Development of the Supervised Learning Model

4.1. Data Acquisition

With the development of a learning model, we aim to evaluate its capability to differentiate and classify Raman spectra. If successful, the model could contribute to making the diagnosis of kidney diseases simpler, faster and more effective.

To train the model, Raman spectra of six methanol-ethanol solutions with different compositions were acquired, totaling 619 spectra. Between acquisitions, the acquisition parameters of the Spectrum Analyzer software were varied to guarantee dataset diversity. The data was divided into two sets: the training set (80%) and the test set (20%).

Table 5. Data used for the supervised learning model.

Solutions	Number of Spectra Obtained	Total No. of Data
100% Ethanol	106	619
90% Ethanol 10% Methanol	103
80% Ethanol 20% Methanol	103
50% Ethanol 50% Methanol	102
20% Ethanol 80% Methanol	102
100% Methanol	102
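The source does not say whether the 80/20 division was random or stratified by solution. As one reasonable option, the Python sketch below performs a stratified split so that each solution contributes about 20% of its spectra to the test set. The per-class counts are taken from the table above; the function name, the seed and the use of integer indices in place of actual spectra are assumptions.

```python
import random
from collections import Counter, defaultdict

# Per-class spectrum counts as listed in the table above.
CLASS_COUNTS = {
    "100% Ethanol": 106,
    "90% Ethanol 10% Methanol": 103,
    "80% Ethanol 20% Methanol": 103,
    "50% Ethanol 50% Methanol": 102,
    "20% Ethanol 80% Methanol": 102,
    "100% Methanol": 102,
}


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) with about test_fraction of each class held out."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


labels = [name for name, count in CLASS_COUNTS.items() for _ in range(count)]
train_idx, test_idx = stratified_split(labels)
print(Counter(labels[i] for i in test_idx))
```

Stratifying keeps the class proportions of the test set close to those of the full dataset, which matters when per-class precision is reported.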

Next, we present a graph showcasing an example spectrum for each alcohol solution used in training the supervised learning model. This visualization highlights the spectral differences among the six solutions, exemplifying the variability in the dataset used.

Figure 12. Raman spectra of alcohol solutions.

The learning model was developed using MATLAB, and the developed code is available in Appendix A.

4.2. Learning Model Optimization

Hyperparameters significantly influence the model’s performance, particularly its training speed and accuracy. Tuning them can make the model more robust and efficient.

The MaxEpoch hyperparameter defines the maximum number of training epochs. Higher values allow the model to learn more effectively but increase the risk of overfitting and extend training time. During an epoch, the model processes the entire training set, divided into mini-batches, whose size is another key hyperparameter. Mini-batch size impacts training efficiency, as larger mini-batches can result in a longer training time.
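The relationship between epochs, mini-batch size and the number of weight updates can be sketched with a small Python calculation. The training-set size of 495 (about 80% of 619 spectra) and the choice to count a final partial mini-batch are assumptions; frameworks differ on how they handle the last incomplete batch.

```python
import math


def updates_per_epoch(n_train: int, batch_size: int) -> int:
    """Weight updates in one pass over the training set, assuming a final
    partial mini-batch is still used (frameworks differ on this)."""
    return math.ceil(n_train / batch_size)


n_train = 495  # about 80% of the 619 spectra; the exact count is an assumption
for batch_size in (64, 32, 16):
    per_epoch = updates_per_epoch(n_train, batch_size)
    print(f"MiniBatchSize {batch_size:2d}: {per_epoch:2d} updates/epoch, "
          f"{per_epoch * 20} updates over 20 epochs")
```

For a fixed number of epochs, halving the mini-batch size roughly doubles the number of weight updates the model receives from the same data.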

The InitialLearnRate hyperparameter defines the rate at which the model adjusts its weights during training. This rate directly influences the speed and effectiveness of the training process. An improperly adjusted rate can prevent the model from converging and learning from the training data.
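Why an overly large learning rate prevents convergence can be illustrated with plain gradient descent on a one-dimensional quadratic. This toy Python example is not the network or the optimizer used in this work; the curvature value is arbitrary and was chosen only so that a rate of 0.1 overshoots.

```python
def gradient_descent(learning_rate: float, curvature: float = 25.0,
                     w0: float = 1.0, steps: int = 20) -> float:
    """Minimise f(w) = 0.5 * curvature * w**2 by plain gradient descent
    and return |w| after the given number of steps (0 is the optimum)."""
    w = w0
    for _ in range(steps):
        grad = curvature * w          # derivative of f at the current w
        w -= learning_rate * grad     # gradient descent update
    return abs(w)


for lr in (0.001, 0.01, 0.1):
    print(f"learning rate {lr}: |w| after 20 steps = {gradient_descent(lr):.3g}")
```

Each step multiplies w by (1 − learning_rate × curvature), so the iterate shrinks while that factor has magnitude below 1 and grows without bound once the rate is too large. The threshold depends on the problem, so the specific rates at which a real network degrades will differ from this toy case.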

The hyperparameters used for this optimization and their impact on the model performance (Accuracy, Precision and Training Time) are described in the table below.

Table 6. Analysis of how changing hyperparameters impacts model evaluation.

Hyperparameters of Training			Model Performance
MaxEpoch	InitialLearningRate	MiniBatchSize	Accuracy (%)	Precision (%)	Training time
20	0.001	64	96.75	96.79	24 min 04 s
20	0.001	32	98.37	98.71	10 min 52 s
20	0.001	16	100	100	05 min 31 s
30	0.001	16	99.19	99.21	08 min 20 s
10	0.001	16	99.19	99.21	02 min 45 s
20	0.1	16	17.07	16.67	05 min 28 s
20	0.01	16	87.81	87.78	05 min 30 s
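
For readers unfamiliar with the two metrics in Table 6, the Python sketch below computes accuracy and a class-averaged precision from a confusion matrix. The text does not state how precision was averaged over the six classes, so the macro average used here is an assumption, as are the function names. The 3-class matrix is a hypothetical example, not data from the study.

```python
def accuracy(confusion):
    """Fraction of correct predictions; confusion[i][j] counts true class i predicted as j."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total

def macro_precision(confusion):
    """Mean over classes of TP / (TP + FP); classes that are never predicted are skipped."""
    n = len(confusion)
    per_class = []
    for j in range(n):
        predicted_j = sum(confusion[i][j] for i in range(n))
        if predicted_j > 0:
            per_class.append(confusion[j][j] / predicted_j)
    return sum(per_class) / len(per_class)

# Hypothetical 3-class confusion matrix (rows: true class, columns: predicted class).
example = [
    [20, 1, 0],
    [0, 19, 2],
    [0, 0, 21],
]
print(f"accuracy = {accuracy(example):.4f}")
print(f"macro precision = {macro_precision(example):.4f}")
```

When reading Table 6, it helps to keep the test-set size in mind: if the test set holds about 123 spectra (20% of the data, an assumption), a single misclassified spectrum changes the accuracy by roughly 0.8 percentage points, and 122/123 corresponds to 99.19%.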

The results show that InitialLearningRate was the hyperparameter with the greatest influence on the model’s accuracy and precision. When this parameter was increased from 0.001 to 0.01, accuracy fell from 100% to 87.81%. With a rate of 0.1, accuracy dropped to 17.07% and precision to 16.67%, close to the chance level for six classes (1/6 ≈ 16.7%), since excessively high learning rates cause large weight adjustments that prevent the model from adequately capturing data patterns. Thus, the optimal learning rate for the developed model is 0.001.

Adjusting the mini-batch size mainly influenced the training time, which fell from about 24 minutes with a size of 64 to about 5.5 minutes with a size of 16. With a mini-batch size of 16, the model also achieved 100% accuracy and precision.

The MaxEpoch hyperparameter did not behave as anticipated. Increasing the number of epochs from 20 to 30 resulted in a slight decrease in performance (accuracy fell from 100% to 99.19%), which may indicate that the model was overfitting to the training data. Decreasing the number of epochs from 20 to 10 produced the same accuracy and precision as 30 epochs (99.19% and 99.21%) while reducing the training time to under 3 minutes.

In summary, the hyperparameter combination that maximizes model performance while maintaining an efficient training time is a MaxEpoch of 20, an InitialLearningRate of 0.001, and a MiniBatchSize of 16.
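
The same selection can be reproduced from the values in Table 6. The Python sketch below is only an illustration: the ranking rule (highest accuracy, then highest precision, then shortest training time) and the function name are assumptions, not a procedure described by the authors.

```python
# Rows of Table 6:
# (MaxEpoch, InitialLearningRate, MiniBatchSize, accuracy %, precision %, training time in s)
TABLE_6 = [
    (20, 0.001, 64, 96.75, 96.79, 24 * 60 + 4),
    (20, 0.001, 32, 98.37, 98.71, 10 * 60 + 52),
    (20, 0.001, 16, 100.0, 100.0, 5 * 60 + 31),
    (30, 0.001, 16, 99.19, 99.21, 8 * 60 + 20),
    (10, 0.001, 16, 99.19, 99.21, 2 * 60 + 45),
    (20, 0.1, 16, 17.07, 16.67, 5 * 60 + 28),
    (20, 0.01, 16, 87.81, 87.78, 5 * 60 + 30),
]

def best_configuration(rows):
    """Rank by accuracy, then precision, then shortest training time (assumed rule)."""
    return max(rows, key=lambda r: (r[3], r[4], -r[5]))

best = best_configuration(TABLE_6)
print("MaxEpoch, InitialLearningRate, MiniBatchSize =", best[:3])
```

A different rule, for example one that accepts a small loss in accuracy in exchange for a shorter training time, would instead favor the 10-epoch configuration.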

These results highlight the potential of the developed learning model for Raman spectrum classification, evidencing its capability to identify patterns in these spectra.

5. Conclusions and Future Perspectives

This study demonstrated the potential of a simplified and portable Raman spectroscopy system for acquiring the spectra of liquid samples, including urine. The application of this system could simplify and enhance the accessibility of kidney disease diagnostics, even outside clinical contexts.

The system’s main challenge was the high noise level and the low intensity of the characteristic peaks in the obtained spectra. Overcoming this issue would require a higher-power laser, since the one used in this study had a relatively low power of 4.5 mW. Additionally, the fluorescence spectrum of urine shows low fluorescence intensity under irradiation with a 635 nm red laser, which suggests developing a Raman system around a higher-power laser at this emission wavelength.

The assembly, alignment and calibration of the Raman system were successfully achieved at this stage. In a future phase, the use of more sophisticated optical elements could be explored to amplify Raman scattering. Similarly, the acquisition parameters were optimized using ethanol spectra; their direct impact on urine spectra should be explored in subsequent phases.

The analysis of the spectra of the five urine samples acquired with the developed system demonstrates its consistency and its sensitivity to variations in compound concentrations in urine.

The implementation of a supervised learning model for spectra classification proved to be an effective tool, capable of recognizing patterns in Raman spectra and achieving accurate classification. The application of this kind of model can facilitate the differentiation between healthy and pathological spectra. Training this model with urine spectra to identify kidney disease biomarkers represents a promising opportunity to be explored.

Despite limitations imposed by instrumentation and budget, this work paves the way for the development of a Raman spectroscopy system that, combined with machine learning, creates an accessible, portable and effective tool for non-invasive diagnosis, particularly for kidney disease detection.

Author Contributions

Conceptualization, C.D. and A.F.; methodology, C.D. and M.F.; software, C.D. and A.F.; validation, A.F. and S.A.P.; investigation, C.D.; data curation, C.D.; resources, M.F.; writing – original draft preparation, C.D.; writing – review and editing, A.F.; supervision, A.F., M.F., and S.A.P.; project administration, A.F.; funding acquisition, A.F. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by Fundação para a Ciência e Tecnologia (FCT), Center of Technology and Systems (CTS) UIDB/00066/2020 / UIDP/00066/2020, and by project IPL/IDI&CA2023/LUMINA_ISEL by Instituto Politécnico de Lisboa (IPL/2023/IDI&CA).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The data used in this project as well as a version of the algorithm are available upon request.

Acknowledgments

The authors would like to thank Caterina Serafinelli for preparing the alcohol solutions used for training the supervised learning model. The collaboration of Paolo Di Giamberardino, under the Sapienza Visiting Professor Programme 2022, in designing the MATLAB algorithm is also acknowledged.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Figure A1. CPS532 laser spectrum with an operating temperature of 25 °C.

Figure A2. CPS532 laser spectrum with an operating temperature of 30 °C.

Figure A3. CPS532 laser spectrum with an operating temperature of 35 °C.

Figure A4. CPS532 laser spectrum with an operating temperature of 40 °C.

Figure A5. Code developed for the creation of the supervised learning model.


© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).