Abstract
Hyperspectral technology is increasingly important for monitoring soil heavy metal pollution, yet hyperspectral data contain substantial band redundancy, and band selection is typically limited to single algorithms or simple combinations, while multi-algorithm combinations remain underused. To address this gap, this study, conducted in Gejiu, Yunnan Province, China, proposes a multi-algorithm band selection method for rapid prediction of lead (Pb) contamination levels in soil. First, band selection methods including CARS (Competitive Adaptive Reweighted Sampling), GA (Genetic Algorithm), MI (Mutual Information), SPA (Successive Projections Algorithm), and WOA (Whale Optimization Algorithm) were each applied to select spectral bands for a preliminary Pb content prediction model; WOA achieved the highest modeling accuracy. Building on this result, WOA was combined with the other methods (e.g., WOA-CARS, WOA-GA, WOA-MI, and WOA-SPA), and the selected bands were further refined by MI in a multi-level scheme (e.g., WOA-GA-MI, WOA-CARS-MI, and WOA-SPA-MI). The WOA-GA-MI model performed best, achieving an average R² of 0.75, an improvement of 0.32, 0.11, and 0.02 over the full-spectrum model, the WOA-selected spectral model, and the WOA-GA model, respectively. Spectral response analysis additionally identified 22 common bands essential for Pb content inversion. The proposed multi-level combined model improves prediction accuracy and offers new insight into optimizing hyperspectral band selection, providing a scientific basis for assessing soil heavy metal contamination.
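To make the multi-level idea concrete, the following is a minimal sketch of a band selection cascade in which each stage narrows the candidate bands left by the previous one and a mutual-information filter performs the final refinement. It is an illustration under stated assumptions, not the study's implementation: the WOA and GA stages are not reproduced, and a simple absolute-correlation ranking stands in for the earlier wrapper stage. All function names (`mutual_information`, `abs_pearson`, `keep_top`, `cascade`), the bin count, the band counts kept per stage, and the synthetic data are hypothetical.

```python
import math
import random
from collections import Counter


def mutual_information(x, y, bins=4):
    """Histogram estimate of mutual information (in nats) between x and y."""
    def discretize(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # guard against constant input
        return [min(int((v - lo) / width), bins - 1) for v in values]

    dx, dy = discretize(x), discretize(y)
    n = len(x)
    px, py, pxy = Counter(dx), Counter(dy), Counter(zip(dx, dy))
    return sum(
        (c / n) * math.log((c / n) / ((px[a] / n) * (py[b] / n)))
        for (a, b), c in pxy.items()
    )


def abs_pearson(x, y):
    """Absolute Pearson correlation; a placeholder score, NOT WOA or GA."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return abs(sxy) / math.sqrt(sxx * syy) if sxx and syy else 0.0


def keep_top(score, k):
    """Build a stage that keeps the k candidate bands with the highest score."""
    def stage(X, y, bands):
        def column(j):
            return [row[j] for row in X]
        ranked = sorted(bands, key=lambda j: score(column(j), y), reverse=True)
        return sorted(ranked[:k])
    return stage


def cascade(X, y, stages):
    """Apply the stages in order; each one sees only the surviving bands."""
    bands = list(range(len(X[0])))
    for stage in stages:
        bands = stage(X, y, bands)
    return bands


# Synthetic "spectra": 300 samples, 30 bands; only bands 3 and 17 drive y.
rng = random.Random(0)
X = [[rng.random() for _ in range(30)] for _ in range(300)]
y = [row[3] + row[17] + 0.05 * rng.gauss(0, 1) for row in X]

selected = cascade(
    X, y,
    [keep_top(abs_pearson, 10),         # stand-in for the earlier stage(s)
     keep_top(mutual_information, 2)],  # final MI refinement
)
print(selected)
```

Because `cascade` accepts any list of stages, a three-level chain in the spirit of WOA-GA-MI would only require supplying real implementations of the first two stages with the same `(X, y, bands)` signature.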