3.1. Foliar Nitrogen Content
The descriptive analysis of the nitrogen content of common bean leaves subjected to four levels of nitrogen fertilization is shown in Figure 3. In general, the arithmetic mean of the leaf N content increased with the applied dose, reaching 37.02, 46.68, 51.72 and 58.9 g kg⁻¹ for T1 (0 kg ha⁻¹ N), T2 (50 kg ha⁻¹ N), T3 (100 kg ha⁻¹ N) and T4 (150 kg ha⁻¹ N), respectively.
The leaf N content obtained in this study agrees with the values reported by [34], who studied 16 varieties of common bean subjected to two doses of N (0 and 100 kg ha⁻¹) and found a mean leaf N content of 37.6 51.7 g kg⁻¹ in BRS amethyst and 36.1 51.7 g kg⁻¹ in Dama TAA in the absence of N. At the dose of 100 kg ha⁻¹, our mean result, 51.7 g kg⁻¹, was slightly higher than the value observed by the authors, 41.8 g kg⁻¹, for cultivar IAC Milênio.
3.2. Leaf Spectral Analysis
The mean reflectance curves of the cultivar BRS FC104 in the near-infrared (NIR) range (700 to 1300 nm), resulting from the application of the four nitrogen treatments (Figure 3), are shown in Figure 4. The spectral behavior of leaves with different N concentrations showed little variation, with reflectance between 0.45 and 0.55; the two lowest reflectance curves (green and blue lines) correspond to concentrations of 60.6 and 51.7 g kg⁻¹ of N, respectively.
Reflectance at near-infrared wavelengths (between 720 and 1300 nm) is governed by the scattering of light within the mesophyll, under the influence of internal leaf structures such as cell wall thickness, intercellular air spaces, and the amount of mesophyll per unit leaf area [35, 36]. Reflectances around 0.5 in the NIR region were measured by [37] in healthy bean plants, indicating satisfactory vegetative vigor of the crop at 25 DAS. When fertility is adequate, plants are more photosynthetically active, which results in greater absorption of electromagnetic energy in the visible region and greater reflectance in the red-edge and near-infrared regions [38]. Other plant species show similar spectral behavior. Assessing the hyperspectral response of the species Megathyrsus maximus, Pennisetum purpureum Schumach, Philodendron sp., Tradescantia pallida cv. purpurea and Cordyline fruticosa (L.) in the near-infrared region (700 - 1300 nm), [35] observed reflectance of approximately 50%, with progressive decreases up to 1058 nm.
Two spectral zones most correlated with leaf nitrogen content were identified using the VIP index across the NIR spectrum, as shown in Figure 5. The first is located in the range of 700 to 740 nm, with the highest VIP (4.1) observed at the 708 nm wavelength (blue dotted line). In the second, VIP scores higher than 1 occurred only at wavelengths 983, 994 and 995 nm, and the value verified at 988 nm was observed at wavelength 708 nm.
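As an illustration of how VIP scores of this kind can be obtained, the sketch below fits a single-response PLS model with the NIPALS algorithm and derives one VIP value per band. It is a minimal example under stated assumptions, not the procedure used in this study: the function name `pls1_vip`, the number of components and the synthetic data are all hypothetical, and plain Python is used only so the example runs without external libraries.

```python
import math
import random


def pls1_vip(X, y, n_components):
    """VIP scores from a single-response PLS model fitted with NIPALS."""
    n, p = len(X), len(X[0])
    # Mean-centre the predictors and the response.
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    Xr = [[row[j] - col_means[j] for j in range(p)] for row in X]
    y_mean = sum(y) / n
    yr = [v - y_mean for v in y]

    weights, explained = [], []
    for _ in range(n_components):
        # Weight vector: covariance of each variable with the response residual.
        w = [sum(Xr[i][j] * yr[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
        # Scores, X loadings and the regression coefficient of y on the scores.
        t = [sum(Xr[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        loadings = [sum(Xr[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(yr[i] * t[i] for i in range(n)) / tt
        # Deflate X and y before extracting the next component.
        Xr = [[Xr[i][j] - t[i] * loadings[j] for j in range(p)] for i in range(n)]
        yr = [yr[i] - q * t[i] for i in range(n)]
        weights.append(w)
        explained.append(q * q * tt)  # response sum of squares captured

    total = sum(explained)
    return [
        math.sqrt(p * sum(e * w[j] ** 2 for e, w in zip(explained, weights)) / total)
        for j in range(p)
    ]


# Synthetic "spectra" (hypothetical): 60 samples, 20 bands, only band 5 matters.
rng = random.Random(0)
X = [[rng.gauss(0.0, 1.0) for _ in range(20)] for _ in range(60)]
y = [3.0 * row[5] + rng.gauss(0.0, 0.1) for row in X]

vip = pls1_vip(X, y, n_components=2)
most_important = max(range(len(vip)), key=vip.__getitem__)
print(most_important, round(vip[most_important], 2))
```

Because the squared VIP scores average to 1 by construction, a score above 1 marks a band that contributes more than the average band, which is the usual motivation for the "VIP higher than 1" cut-off.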
Several studies have demonstrated the usefulness of the 700 - 740 nm interval for studying leaf N in several plant species. Evaluating the importance of spectral bands in the prediction of N in sugarcane crops, Silva et al. (2023) observed VIP ranging from 1 to 1.5. The accuracy of reflectance spectroscopy is further refined by selecting the most responsive wavelengths for analysis, as different plant features are more discernible at specific wavelengths [39, 40]. This selection process is critical to generating reliable and actionable data [40].
RF stood out as the most accurate model in the prediction of N when the entire NIR spectrum (700 - 1300 nm) was used, with a high coefficient of determination (R² = 0.84) and the lowest error (RMSE = 2.69) between observed and predicted values (Figure 6-A). The scatter plots for the KNN and M5Rules models show lower performance than RF but still efficient estimation of N, both with R² ≥ 0.7 and errors lower than 4 g kg⁻¹ (Figure 6-B and D). ANN was the least suitable model for estimating N from NIR reflectance, with the largest error.
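For reference, the two metrics used to compare the models can be computed as sketched below. This assumes R² is the coefficient of determination, defined as 1 minus the ratio of residual to total sum of squares; the text does not state whether this form or the squared correlation was used. The observed values are the treatment means reported in Section 3.1, while the predicted values are invented purely for illustration.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error, in the units of the observations."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


observed = [37.02, 46.68, 51.72, 58.90]   # g/kg, treatment means from Section 3.1
predicted = [39.0, 45.0, 53.0, 57.0]      # g/kg, hypothetical model output

print(f"RMSE = {rmse(observed, predicted):.2f} g/kg")
print(f"R2   = {r_squared(observed, predicted):.2f}")
```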
These results reflect the robustness of the RF model applied to the prediction of leaf N in the bean crop. One explanation is that this algorithm is composed of multiple trees trained through bagging and a random variable selection process, which makes it robust to noise and outliers in the data [41, 42]. RF also handles the nonlinearity inherent in the relationship between spectral variables and biophysical or biochemical parameters. Applied to predictions of nitrogen and leaf chlorophyll in maize crops with hyperspectral data as input, Random Forest outperformed the ANN, M5P, REPT, SVM and ZR algorithms [43]. Evaluating models based on in-situ hyperspectral data to predict nitrogen concentration in three legumes (soybean, tepary bean, moth bean) with four machine learning algorithms, Flynn et al. (2023) found RF (R² = 0.72) superior to KNN, PLS, and SVM.
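The two ingredients named above, bagging and random variable selection, can be illustrated with a deliberately small stand-in for Random Forest. The sketch below averages one-split regression trees (stumps), each fitted to a bootstrap sample and allowed to look at only a random subset of variables. It is a toy under stated assumptions, not the RF implementation used in this study: all function names, the ensemble size and the synthetic data are hypothetical, and real Random Forests grow deep trees and redraw the variable subset at every split.

```python
import math
import random
import statistics


def fit_stump(X, y, features):
    """Best single-split regression stump restricted to the given features."""
    mean = statistics.fmean(y)
    best = (float("inf"), 0, float("inf"), mean, mean)  # fallback: predict the mean
    for j in features:
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [t for row, t in zip(X, y) if row[j] <= thr]
            right = [t for row, t in zip(X, y) if row[j] > thr]
            lm, rm = statistics.fmean(left), statistics.fmean(right)
            sse = sum((t - lm) ** 2 for t in left) + sum((t - rm) ** 2 for t in right)
            if sse < best[0]:
                best = (sse, j, thr, lm, rm)
    return best[1:]


def fit_bagged_stumps(X, y, n_trees=50, n_features=3, seed=0):
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    ensemble = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # bootstrap sample (bagging)
        feats = rng.sample(range(p), n_features)     # random variable subset
        ensemble.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return ensemble


def predict(ensemble, row):
    """Average the predictions of all stumps."""
    return statistics.fmean(lm if row[j] <= thr else rm for j, thr, lm, rm in ensemble)


# Synthetic data (hypothetical): only variable 0 drives the response.
rng = random.Random(1)
X = [[rng.random() for _ in range(6)] for _ in range(80)]
y = [(10.0 if row[0] > 0.5 else 0.0) + rng.gauss(0.0, 1.0) for row in X]

ensemble = fit_bagged_stumps(X, y)
mean_y = statistics.fmean(y)
rmse_baseline = math.sqrt(statistics.fmean((t - mean_y) ** 2 for t in y))
rmse_ensemble = math.sqrt(
    statistics.fmean((t - predict(ensemble, row)) ** 2 for row, t in zip(X, y))
)
print(f"mean-only RMSE: {rmse_baseline:.2f}  bagged stumps RMSE: {rmse_ensemble:.2f}")
```

Averaging over many resampled trees is what dampens the influence of individual noisy or outlying samples, since each such sample appears in only some of the bootstrap sets.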
On the other hand, the ANN model has a high capacity for nonlinear approximation and excellent generalization [44]. However, in this study, ANN did not perform satisfactorily compared to the other models. This may have occurred because the network requires successive adjustments of its hyperparameters to optimize prediction [45].
The selection by VIP of the most significant wavelengths for N prediction, located in the spectral ranges 700 - 740 nm and 983 - 995 nm (Figure 5), increased the R² of the RF, KNN and M5Rules algorithms by 6%, 4% and 8%, respectively, improving their predictive capacity (Figure 7). On the other hand, the performance of ANN was reduced by 3% when the two spectral intervals obtained by VIP were used.
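The selection step itself amounts to keeping only the reflectance columns whose VIP exceeds the chosen threshold before refitting the models. A minimal sketch follows, assuming a threshold of 1. The helper names are hypothetical, and the wavelengths, VIP scores and reflectance values are invented for illustration, except for the VIP of 4.1 at 708 nm reported above.

```python
def select_by_vip(wavelengths, vip_scores, threshold=1.0):
    """Return the column indices and wavelengths whose VIP exceeds the threshold."""
    keep = [i for i, score in enumerate(vip_scores) if score > threshold]
    return keep, [wavelengths[i] for i in keep]


def subset_columns(spectra, keep):
    """Keep only the selected columns of a samples-by-bands reflectance table."""
    return [[row[i] for i in keep] for row in spectra]


wavelengths = [700, 708, 740, 850, 983, 1100]   # nm
vip_scores = [1.8, 4.1, 1.2, 0.4, 1.1, 0.3]     # invented, except 4.1 at 708 nm
spectra = [                                      # invented reflectance values
    [0.20, 0.25, 0.45, 0.50, 0.49, 0.47],
    [0.22, 0.28, 0.46, 0.52, 0.50, 0.48],
]

keep, selected = select_by_vip(wavelengths, vip_scores)
reduced = subset_columns(spectra, keep)
print(selected)
```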
These results are similar to those reported in the literature. Fiorio et al. (2024b) obtained more efficient predictions of leaf nitrogen content in sugarcane from hyperspectral reflectance data using PLSR when considering only the spectral ranges obtained by VIP. Azadnia et al. (2023) studied the prediction of N, phosphorus (P) and potassium (K) in apple trees with spectroscopy data and also demonstrated the performance gain of machine learning algorithms when the most effective intervals were chosen using VIP [46].
Comparing the performance of KNN on the raw spectral data and on only the wavelengths selected by VIP, a 25% reduction in error is observed, from 3.86 g kg⁻¹ to 2.89 g kg⁻¹. This considerable difference may have occurred because the KNN algorithm is more sensitive to data quality than other algorithms, and the choice of more relevant variables increases its capacity for generalization, interpretability, and computational efficiency [47, 48].
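One way to see why KNN benefits from variable selection is that its distance calculation weights every variable equally, so uninformative bands dilute the signal carried by the informative ones. The sketch below illustrates this on synthetic data only. The function names, the value of k and the data are hypothetical, and the resulting numbers are unrelated to the errors reported in this study.

```python
import math
import random


def knn_predict(train_X, train_y, row, columns, k=5):
    """Average the response of the k nearest training samples (Euclidean distance)."""
    query = [row[c] for c in columns]
    neighbours = sorted(
        (math.dist([r[c] for c in columns], query), t)
        for r, t in zip(train_X, train_y)
    )
    return sum(t for _, t in neighbours[:k]) / k


def rmse(train_X, train_y, test_X, test_y, columns):
    """Test-set RMSE of KNN using only the given columns."""
    errors = [
        (t - knn_predict(train_X, train_y, row, columns)) ** 2
        for row, t in zip(test_X, test_y)
    ]
    return math.sqrt(sum(errors) / len(errors))


# Synthetic data (hypothetical): column 0 carries the signal, columns 1-20 are noise.
rng = random.Random(0)
data = [[rng.gauss(0.0, 1.0) for _ in range(21)] for _ in range(200)]
target = [5.0 * row[0] for row in data]
train_X, test_X = data[:150], data[150:]
train_y, test_y = target[:150], target[150:]

rmse_all = rmse(train_X, train_y, test_X, test_y, columns=range(21))
rmse_selected = rmse(train_X, train_y, test_X, test_y, columns=[0])
print(f"all columns: {rmse_all:.2f}  selected column: {rmse_selected:.2f}")
```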