1. Introduction
Sunflower oil is produced in several countries, including Russia, Ukraine and China, from the seeds of Helianthus annuus L. [1]. In the European Union, it represents the second most important source of seed oil after rapeseed oil, and the area under sunflower crops is projected to increase up to 1.0 million ha by the end of 2031 [2]. The plant grows best in dry, temperate climates (between 20 and 25 ºC) with high solar radiation, low humidity and deep soils in which its roots can spread in search of nutrients and water. The dehulled seeds account for 80% of the total fruit weight and have a high oil content of up to 55% w/w [3,4]. Thus, these oils have a high lipid content, most of which consists of triglycerides composed mainly of long-chain unsaturated fatty acids with different degrees of unsaturation, linoleic acid (59%, polyunsaturated omega-6) being the main one [5,6]. Other fatty acids are also present, such as oleic acid (30%, monounsaturated omega-9), stearic acid (6%, saturated) and palmitic acid (5%, saturated).
As with Olive Oil (OO), sunflower oil is sold in several commercial categories, the main ones being: i) Refined Sunflower Oil (SFO), characterized by a high linoleic acid (LA) content; ii) High Oleic Sunflower Oil (HOSFO), with an oleic acid (OA) content of at least 75%, measured as a percentage of the total fatty acid content; and iii) Medium Oleic Sunflower Oil (MOSFO), with a seed OA content ranging between 50% and 75%. The last two commercial categories come from seeds genetically modified to naturally increase not only the ratio of oleic to linoleic acid but also the content of other monounsaturated fatty acids or vitamins such as vitamin E. These features give these oils, HOSFO in particular, an overall composition remarkably similar to that of OO and, in addition, greater resistance to oxidation and a wider range of possible uses [7,8].
Currently, the main analytical methods used to check the authenticity and evaluate possible adulteration of edible vegetable oils, especially OO, are chromatography-based and analyse the presence of triglycerides and fatty acids [9,10]. The most widely used are Thin-Layer Chromatography (TLC), High-Pressure Liquid Chromatography (HPLC) and Gas Chromatography (GC). The first, TLC, is used both qualitatively and quantitatively in official methods such as the analysis of the sterol and stanol fractions in oils [11,12]. HPLC is, among other uses, an International Olive Council (IOC) reference method commonly used in routine laboratories [13,14]. The last, GC, is used not only as GC-FID (Flame Ionization Detector) in the official method specified in the IOC guides, but also to characterise the triglyceride profile of vegetable oils and to separate and quantify fatty acids as their methyl esters (FAME) after a prior hydrolysis and methylation step [15,16,17].
As an alternative to chromatographic techniques, spectroscopic ones (Raman, MIR, FIR, NMR, etc.) provide simple, fast and reliable results, with the advantage that they can be applied directly to the sample without any pre-treatment stage. Moreover, they appear to be not only economical but also environmentally sustainable analytical methods [17,18,19]. NMR provides a quick way of measuring the oil content of oil seeds such as sunflower seeds, as well as of assessing their resistance to high temperatures by evaluating the degradation of their constituents, given that their main domestic use is food frying [20]. Some of these techniques, such as Raman or NIR spectroscopy, can easily be adapted to portable devices, allowing analytical data to be acquired in situ and in real time, although in many cases they have proven incapable of measuring through packaging. To overcome this drawback, some instrumental improvements have been made to Raman spectroscopy, as in the case of the Spatially Offset Raman Spectroscopy (SORS) technique [21]. Despite these advantages, spectroscopic techniques have a lower resolving power than chromatographic techniques, as they do not provide information on each individual compound present, but rather on the bonds that occur in the components of the sample as a whole.
Considering that the final output of an analytical device (benchtop, handheld or portable) is an analytical signal, using this signal to obtain information about the properties of a material related to its chemical composition (the so-called fingerprinting methodology) has proven to be a powerful tool to identify, discriminate and authenticate edible oils, among other commodities. In this sense, these signals may relate to the identity of a food, to its physico-chemical or other natural properties, or to the presence or quantity of compounds in its chemical composition [22,23]. However, instrumental fingerprints contain hidden information, the extraction of which normally requires chemometric tools in particular or machine learning methods in general [24,25,26].
In the food sector, the fingerprinting methodology has been widely used to solve different problems, such as the authentication and discrimination of olive oils, margarines and fat spreads, or the evaluation of the quality and production process of spirits, among other foodstuffs [27,28,29,30,31]. Regarding handheld or portable devices based on SORS, few studies have been carried out despite the great potential of the technique. Recent examples include the detection of possible adulteration in alcoholic beverages [32], the study of the evolution of the alcoholic fermentation process in white wine [33], the authenticity assessment of the animal origin of the milk used in the production of cheese, the characterisation of several commercial categories of cheese (Cheddar, Manchego and Pecorino Romano), the analysis of packaged margarines and fat spreads [34,35,36] and the study of the adulteration of extra virgin olive oil with other edible vegetable oils, as in the case of Varnasseri et al. [37].
Thus, the aim of this work is to propose a more environmentally friendly and sustainable analytical method that combines the Raman fingerprints obtained from a portable instrument with chemometrics/machine learning to build classification models that reliably differentiate sunflower oils from olive oils of different commercial categories.
2. Materials and Methods
2.1. Oil Samples
A sample bank consisting of 145 samples of different types of edible vegetable oils, purchased at local supermarkets, was used for this study. The oils were separated into two main groups: sunflower oil and olive oil. The first group consisted of 48 samples, 7 of which were labelled as having a high oleic acid content. The second group comprised a total of 97 samples, of which 65 were extra virgin olive oil (EVOO), 22 were virgin olive oil (VOO) and 10 were pomace olive oil (POO).
2.2. Spectroscopic Analysis
A Vaya Raman portable spectrometer (Agilent Technologies, Santa Clara, CA, USA) was used in this study. It was equipped with a class 3B diode laser operating at 830 nm with a maximum power of 450 mW. The exposure time of the sample to the laser varied between 0.5 and 2 s. The spectral resolution of the device was 12-20 cm⁻¹ over the wavenumber range 735-1540 cm⁻¹, using a cooled CCD (charge-coupled device) as the detection system. The offset distance from the incidence point was fixed by the equipment at 0.6 cm. For measurement (with a total measurement time ranging between 30 s and 2 min), edible oil samples were placed in 4 mL borosilicate glass vials with a wall thickness of 1 mm.
After the measurement, the equipment software performs: i) a correction of the acquired data to eliminate any possible influence of the container, ii) a baseline adjustment and, finally, iii) a normalization of the intensity values. Therefore, the result for each analysed sample is a Raman spectrum from 350 to 2000 cm⁻¹ in which the intensity values lie between 0 and 1. Thus, the final spectrum was a normalized spatially resolved Raman spectrum (NSR Raman spectrum).
2.3. Data Treatment
Each NSR Raman spectrum was collected in .csv format (comma-separated values) and exported to .mat format for the subsequent construction of the fingerprint matrix within the MATLAB environment (version 9.3, Mathworks Inc., Natick, MA, USA). As a result, a 145×1651 data matrix consisting of 145 NSR Raman fingerprints (rows), each composed of 1651 normalized intensity values (columns), was obtained as the raw data matrix and, once pre-processed, used for the different chemometric studies.
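As an illustration of how such a fingerprint matrix can be assembled, the following Python sketch stacks exported spectra row by row. It is not the MATLAB workflow used in this study: the two-column "wavenumber,intensity" CSV layout, the 1 cm-1 spacing (assumed because it yields 1651 points between 350 and 2000 cm-1) and the function names are hypothetical.

```python
import csv
import io

# Assumed axis: 350..2000 cm-1 in 1 cm-1 steps, i.e. 1651 points per spectrum
WAVENUMBERS = list(range(350, 2001))


def read_spectrum(csv_text):
    """Return the intensity column of a 'wavenumber,intensity' CSV (hypothetical layout)."""
    reader = csv.reader(io.StringIO(csv_text))
    return [float(intensity) for _wavenumber, intensity in reader]


def build_fingerprint_matrix(csv_texts):
    """Stack one spectrum per row, checking that every row has the expected length."""
    matrix = []
    for index, text in enumerate(csv_texts):
        intensities = read_spectrum(text)
        if len(intensities) != len(WAVENUMBERS):
            raise ValueError(f"spectrum {index} has {len(intensities)} points")
        matrix.append(intensities)
    return matrix


# Two synthetic spectra stand in for exported files (not data from the study)
fake_csv = "\n".join(f"{w},{(w % 7) / 7:.4f}" for w in WAVENUMBERS)
matrix = build_fingerprint_matrix([fake_csv, fake_csv])
print(len(matrix), len(matrix[0]))  # 2 1651
```

With the 145 spectra of the sample bank, the same stacking would give the 145×1651 raw matrix described above.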
The pre-processing stage included the selection of the interval of interest for each fingerprint, smoothing of the signals using a Savitzky-Golay filter (first derivative, second-order polynomial and a filter width of 21 points) and mean centring, yielding a reduced, pre-processed NSR Raman fingerprint matrix of 145 samples × 902 variables.
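The pre-processing steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the PLS_Toolbox routine used in the study: the filter weights are derived here from a local least-squares polynomial fit, the edge handling (repeating the end values so that the number of variables is preserved) is an assumption, and the input is random synthetic data.

```python
import numpy as np


def savgol_derivative_weights(window=21, polyorder=2):
    """First-derivative Savitzky-Golay weights from a local polynomial fit."""
    half = window // 2
    offsets = np.arange(-half, half + 1)
    # Columns 1, k, k^2, ... for each offset k within the window
    design = np.vander(offsets, polyorder + 1, increasing=True)
    # Row 1 of the pseudo-inverse maps the window onto the fitted slope at k = 0
    return np.linalg.pinv(design)[1]


def preprocess(X, window=21, polyorder=2):
    """Savitzky-Golay first derivative along each row, then column mean centring."""
    X = np.asarray(X, dtype=float)
    half = window // 2
    weights = savgol_derivative_weights(window, polyorder)
    # Assumption: repeat the edge values so that the number of variables is kept
    padded = np.pad(X, ((0, 0), (half, half)), mode="edge")
    derivative = np.apply_along_axis(
        lambda row: np.correlate(row, weights, mode="valid"), 1, padded
    )
    # Mean centring: subtract the mean of each variable (column)
    return derivative - derivative.mean(axis=0)


# Synthetic stand-in for the reduced fingerprint matrix (NOT the study's data)
rng = np.random.default_rng(0)
X_demo = rng.random((5, 902))
X_pre = preprocess(X_demo)
print(X_pre.shape)  # (5, 902)
```

For a symmetric 21-point window and a second-order polynomial, the derivative weights reduce to k/770 for offsets k = -10, ..., 10, so each output value is a weighted slope estimate over the window.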
Figure 1 shows the NSR Raman fingerprints of the edible vegetable oils analysed in this study before and after the pre-processing step. Unsupervised and supervised pattern recognition techniques were explored using PLS_Toolbox, version 8.6.1 (Eigenvector Research Inc., Manson, WA, USA) [38].
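To give an idea of the kind of unsupervised exploration involved, the sketch below computes a principal component analysis by singular value decomposition. It is an assumption-laden stand-in for the PLS_Toolbox analysis: the function name is hypothetical and the two groups are synthetic, differing only by one simulated band.

```python
import numpy as np


def pca(X, n_components=2):
    """PCA by singular value decomposition of the column-centred matrix."""
    X = np.asarray(X, dtype=float)
    centred = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(centred, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, Vt[:n_components], explained[:n_components]


# Synthetic two-group data (NOT the study's spectra): group B carries an extra band
rng = np.random.default_rng(0)
band = np.exp(-0.5 * ((np.arange(50) - 25) / 4.0) ** 2)
group_a = 0.05 * rng.standard_normal((10, 50))
group_b = band + 0.05 * rng.standard_normal((10, 50))

scores, loadings, explained = pca(np.vstack([group_a, group_b]))
print("PC1 scores, group A:", scores[:10, 0].round(2))
print("PC1 scores, group B:", scores[10:, 0].round(2))
```

In this toy case the two groups fall on opposite sides of the first principal component, which is the type of grouping one would look for in the score plots.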
Conclusions
The chemometric/machine learning study of the spectroscopic instrumental fingerprints of edible vegetable oils (sunflower and olive oils), obtained with a highly versatile portable analyser based on Spatially Offset Raman Spectroscopy (SORS), showed that the fingerprints can distinguish the analysed samples according to the raw material used to produce the oil.
In addition, the different zones of the normalized spatially resolved Raman fingerprints made it possible not only to group the samples according to the raw material but also to differentiate between types of oil within each group. Thus, the unsupervised techniques used (HCA and PCA) showed that the most influential attribute was the raw material, i.e. sunflower or olive, followed by the commercial category within each type of oil, both determined by the oleic acid content and the oleic/linoleic acid ratio of the analysed samples.
By applying supervised techniques, different models were obtained to discriminate (SVM and kNN) and to classify (SIMCA) the samples. Although the quality metrics of all the models were satisfactory, in the authors' view the best results were obtained with the SIMCA model. This soft classification model made it possible not only to classify the samples according to the raw material of the oil (seed or fruit) but also to flag the HOSFO samples as inconclusive, owing to their high oleic acid content, which is similar to that of olive oil.
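The idea of soft classification can be illustrated with a simplified, SIMCA-style sketch in Python. It is not the SIMCA implementation used in this work: each class is modelled by its own PCA, a sample is accepted by a class when its squared residual distance is below a limit, and a sample accepted by no class or by more than one is reported as inconclusive. The acceptance limit (a multiple of the largest training residual), the class and function names and the data are all hypothetical.

```python
import numpy as np


class ClassModel:
    """PCA model of one class with a residual-distance acceptance limit."""

    def __init__(self, X, n_components=1, margin=3.0):
        X = np.asarray(X, dtype=float)
        self.mean = X.mean(axis=0)
        _, _, Vt = np.linalg.svd(X - self.mean, full_matrices=False)
        self.loadings = Vt[:n_components]
        # Hypothetical limit: a multiple of the largest training residual
        self.limit = margin * self.residual(X).max()

    def residual(self, X):
        """Squared distance between each sample and its projection on the class model."""
        centred = np.atleast_2d(X) - self.mean
        reconstructed = centred @ self.loadings.T @ self.loadings
        return np.sum((centred - reconstructed) ** 2, axis=1)

    def accepts(self, x):
        return bool(self.residual(x)[0] <= self.limit)


def soft_classify(x, models):
    """Return the single accepting class, otherwise 'inconclusive'."""
    hits = [name for name, model in models.items() if model.accepts(x)]
    return hits[0] if len(hits) == 1 else "inconclusive"


# Synthetic classes (NOT the study's spectra) built around two different profiles
rng = np.random.default_rng(0)
profile_a = np.linspace(0.0, 1.0, 50)
profile_b = profile_a[::-1]
train_a = profile_a + 0.05 * rng.standard_normal((20, 50))
train_b = profile_b + 0.05 * rng.standard_normal((20, 50))
models = {"sunflower": ClassModel(train_a), "olive": ClassModel(train_b)}

intermediate = (profile_a + profile_b) / 2  # lies between both class profiles
print(soft_classify(train_a[0], models))
print(soft_classify(intermediate, models))
```

Unlike a hard discriminant model, which always assigns one of the trained classes, this kind of rule leaves room for an inconclusive outcome when a sample does not fit exactly one class model.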
Author Contributions
Conceptualization, M.Gracia Bagur-González and Antonio González-Casado; Formal analysis, Fidel Ortega-Gavilán and M.Gracia Bagur-González; Funding acquisition, Antonio González-Casado; Investigation, Guillermo Jiménez-Hernández and Fidel Ortega-Gavilán; Methodology, Guillermo Jiménez-Hernández, Fidel Ortega-Gavilán and M.Gracia Bagur-González; Project administration, M.Gracia Bagur-González and Antonio González-Casado; Resources, Antonio González-Casado; Software, Guillermo Jiménez-Hernández, Fidel Ortega-Gavilán and M.Gracia Bagur-González; Supervision, Fidel Ortega-Gavilán, M.Gracia Bagur-González and Antonio González-Casado; Validation, Fidel Ortega-Gavilán and M.Gracia Bagur-González; Writing – original draft, Guillermo Jiménez-Hernández, Fidel Ortega-Gavilán and M.Gracia Bagur-González; Writing – review & editing, Fidel Ortega-Gavilán, M.Gracia Bagur-González and Antonio González-Casado.