1. Introduction
The ability to characterize and assess forest structure with a high level of accuracy is important not only for the development of a reliable forest plan but also for assessments that demonstrate the sustainable management of natural resources [1,2]. Within this context, an understanding of forest conditions, particularly growing stock (tree and stand volume), aboveground biomass, and basal area, is crucial for planners, forest managers, and landowners. These attributes directly influence the potential revenue and the potential habitat a forest can provide and facilitate opportunities for addressing other management objectives [2,3]. Furthermore, a regularly updated forest inventory is essential for monitoring the spatiotemporal dynamics of forest ecosystems over the length of a planning horizon. For example, an estimate of aboveground biomass can provide insights into the capacity of a forest to sequester carbon, which is considered a critical factor in addressing climate change. This issue may become more important in the future as the forestry sector faces increasing pressure to assess the capacity of forests to sequester carbon and the rate at which they do so [4,5,6,7]. Additionally, estimates of biomass can provide valuable information for assessing forest health related to outbreaks of southern pine beetle [8] and for assessing fire risk associated with fuel management [9].
Traditionally, forest inventories have relied heavily on labor-intensive and time-consuming field measurements. Timely field measurements may be limited in spatiotemporal coverage, may include sampling error, and may not be representative of large forested areas [10]. While measuring the diameter at breast height (dbh) of trees, identifying the tree species, and counting trees within a sample unit (e.g., plot or prism point) are relatively straightforward, estimating the height of each tree may require more effort and include more uncertainty [11], especially in natural, mountainous forest landscapes [7]. Obtaining a sufficient number of well-distributed sampling plots that properly represent an entire forest area remains challenging due to limited resources and accessibility. To account for these limitations, traditional field measurements have been complemented by products derived from remote sensing systems, which may help address spatial and temporal challenges in developing a forest inventory [3,12]. This approach is sometimes referred to as an enhanced forest inventory [13,14]. Within this framework, various remote sensing systems have been demonstrated to provide relatively accurate and cost-effective forest information, including the development of forest metrics [2,3,10,15,16] and greenhouse gas inventories [17].
Data acquired from Light Detection and Ranging (LiDAR) devices can assist in the development of forestry information. Unlike other remote sensing processes that only provide two-dimensional information, LiDAR can characterize three-dimensional forest structure to an extent that depends on the point density of the LiDAR data [5,16,18]. LiDAR derives the height of an object from the time interval between the emission of a pulse of energy by the sensor and the moment the reflected signal returns to the device [19]. This type of active remote sensing technology has advanced significantly in the last ten years, resulting in a diverse range of LiDAR systems that can be installed in satellites, mounted on airplanes and unmanned aerial vehicles (UAVs), or used as hand-held devices. Each type of system has advantages and disadvantages. One advantage of satellite-based laser scanning platforms such as NASA’s Global Ecosystem Dynamics Investigation (GEDI) and Ice, Cloud, and land Elevation Satellite-2 (ICESat-2) is that they can provide multi-temporal data for the entire Earth. For instance, Potapov et al. [20] were able to create a global canopy height model using GEDI data along with Landsat imagery. Additionally, Dubayah et al. [21] presented estimates of mean biomass density at 1 km resolution for every country covered by GEDI.
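The time-of-flight principle described above can be illustrated with a minimal sketch. It assumes a pulse travelling at the vacuum speed of light along a vertical (nadir) path and ignores atmospheric refraction and scan angle; the function names are hypothetical and are not taken from any LiDAR software.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # defined SI value in vacuum


def range_from_round_trip(seconds: float) -> float:
    """One-way sensor-to-target distance (m) for a round-trip travel time."""
    if seconds < 0:
        raise ValueError("travel time must be non-negative")
    # The pulse travels to the target and back, hence the division by two.
    return SPEED_OF_LIGHT_M_PER_S * seconds / 2.0


def height_between_returns(t_first: float, t_last: float) -> float:
    """Vertical separation (m) between two returns of one nadir pulse,
    e.g. a canopy-top return and a ground return (simplifying assumption)."""
    return range_from_round_trip(t_last) - range_from_round_trip(t_first)


# A 2 microsecond round trip corresponds to roughly 300 m of range.
example_range_m = range_from_round_trip(2e-6)
```

Under these assumptions, a difference of about 67 nanoseconds between the first and last return of a pulse corresponds to roughly 10 m of vertical separation.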
Terrestrial-based LiDAR platforms can be categorized as either static or mobile. Terrestrial laser scanners (TLS) capture point clouds by looking upward from ground level, so they are advantageous in capturing details of forest structure from under the canopy. In particular, static TLS platforms (consisting of a sensor, a tripod, and a GNSS receiver) produce the highest-quality point clouds among LiDAR systems [5,22]. Appropriately employed TLS can facilitate visualization of tree branches and leaves; therefore, these systems have enormous potential for assisting in the development of new allometric biomass and wood quality relationships [23]. Arseniou et al. [24], for example, were able to estimate woody aboveground biomass for urban and rural settings using a TLS platform to identify various tree parts other than the main stem. Nevertheless, TLS platforms have several limitations when employed for forest inventory and operational forest management purposes. One, the occlusion effect, is caused by features hidden or obstructed behind larger-diameter trees within the point cloud data, and it poses a challenge when a fixed-position, single-scan approach is used [25]. In addition, the weight and size of a TLS platform can make moving between field measurement plots challenging. These limitations, along with the cost of the platform, may influence opinions of whether TLS is a practical alternative for collecting forest inventory data [26,27].
Mobile laser scanning (MLS) platforms, which are mobile TLS platforms, capture point clouds by looking horizontally from the height near ground level at which they are held, making them advantageous for capturing details of forest structure from a perspective very similar to that of human-collected field measurements. MLS platforms can collect information while a person traverses sampling plots with the instrument held in the hand or carried in a backpack. Moreover, the development of point clouds that are georeferenced to a local coordinate reference system, using Simultaneous Localization and Mapping (SLAM) algorithms, reduces the need for GNSS. While numerous researchers have illustrated the applicability of MLS for diverse forest inventory tasks [7,10,28,29], these studies often focus on MLS data collected at the plot level. Among them, Vatandaşlar et al. [7] estimated several forest attributes (tree counts, dominant height, basal area, dbh, stand volume, and relative density) within plots located in a near-natural forest landscape. Employing a handheld MLS platform, Vatandaşlar et al. [7] mapped every stem within each field measurement plot and estimated these attributes with RMSEs ranging between 4.5% and 16.4%. However, as with TLS platforms, the number of sampling plots employed will likely influence the accuracy of forest or stand estimates [23]. As with any measurement system, the accuracy of forest attribute estimates can be positively correlated with the number of plots measured [30]. For instance, the diverse forests of the Talladega Division of the Talladega National Forest in the southeastern US cover approximately 93,694 ha [31], and thus the number of TLS or MLS plots needed to describe forest character relatively accurately may be substantial. Therefore, an alternative solution is needed to effectively and efficiently characterize the forest inventory of large, diverse areas such as this.
In this context, airborne laser scanning (ALS) systems have been considered a suitable choice for helping to describe the forest character of broad areas. Notably, the ALS data acquisition process is not constrained by the accessibility restrictions that affect TLS, including both static and mobile platforms [5]. The versatility of an ALS system allows the collection of information across diverse temporal and spatial scales [32]. Recent advancements in sensor technologies have further encouraged the adoption of ALS systems, allowing the development of regularly updated and increasingly dense point clouds. For instance, a LiDAR-based forest inventory effort conducted in Ontario two decades ago resulted in a point cloud dataset with about 0.5 points per m², yet today point cloud datasets with more than 40 points per m² can be obtained [33].
While forest attributes such as tree heights and canopy coverage can be directly estimated from ALS LiDAR data, other forest characteristics such as aboveground biomass, growing stock (tree volume), and basal area can be inferred from LiDAR-derived metrics [5]. The development of these estimates relies on modeling methods, which can range in complexity from regression to random forest models and other machine learning techniques [34]. Distinct models for estimating characteristics of different forest types (conifer, broadleaved, and mixed) might also be developed, rather than a general model that is applicable to an entire forested area. Further research in this domain is necessary, with a specific emphasis on investigating how additional spectral data can enhance the predictive capability of LiDAR point clouds. Additionally, as mathematical techniques and remotely sensed data evolve, the most effective combination of methods and data sources needs to be assessed.
A map of forest characteristics for an extensive forest area is an ideal outcome of remote sensing methods, yet this outcome is complicated by two underlying factors: the multitude of potential independent variables that can be derived from remotely sensed data, and the potential correlation among them, which can induce a multicollinearity problem [35,36]. Furthermore, a large number of independent variables within predictive models can challenge the application of these models for developing broad-scale GIS databases. Consequently, the selection of independent variables during the model development process is important. Tibshirani [37] suggested a method for developing linear models that enhances prediction accuracy by reducing the number of independent variables, and such models can be used to estimate forest conditions. Adhikari et al. [35] recommended the use of the adaptive least absolute shrinkage and selection operator (ALASSO), as it effectively eliminated highly correlated independent variables from prediction models.
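To make the variable-selection idea concrete, the following is a minimal sketch of an adaptive LASSO fit, not the implementation used in this study. It assumes penalty weights taken from an initial ordinary least squares fit, a fixed penalty `lam`, and a weight exponent `gamma`; all names, the synthetic data, and the tuning values are illustrative assumptions.

```python
import numpy as np


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the LASSO proximal step)."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)


def lasso_cd(X, y, lam, n_iter=200):
    """Minimise (1/2n)||y - Xb||^2 + lam * sum|b_j| by coordinate descent."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Residual with predictor j's current contribution removed.
            partial = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ partial / n
            beta[j] = soft_threshold(rho, lam) / col_sq[j]
    return beta


def adaptive_lasso(X, y, lam, gamma=1.0):
    """Weighted-L1 fit: weights come from an initial OLS fit (assumption)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    beta_init, *_ = np.linalg.lstsq(Xc, yc, rcond=None)
    weights = 1.0 / (np.abs(beta_init) ** gamma + 1e-8)
    # Rescaling columns by 1/weight turns the weighted problem into a plain LASSO.
    beta = lasso_cd(Xc / weights, yc, lam) / weights
    intercept = y_mean - x_mean @ beta
    return intercept, beta


# Synthetic demonstration: only predictors 0 and 2 truly matter.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 4))
y_demo = 1.0 + 3.0 * X_demo[:, 0] - 2.0 * X_demo[:, 2] + rng.normal(scale=0.1, size=200)
intercept, beta = adaptive_lasso(X_demo, y_demo, lam=0.1)
```

In this toy setting the weakly supported predictors receive large penalty weights and are shrunk to exactly zero, which is the behavior that makes the approach attractive when many correlated candidate metrics are available. In practice the penalty would be tuned, for example by cross-validation.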
The main goal of this study is to develop models to estimate forest conditions across a broad area from information provided by ALS and aerial imagery. This research effort seeks to: (i) evaluate the quality of predictive models developed using the ALASSO method, (ii) evaluate whether additional remotely sensed data (multispectral aerial imagery) can enhance the quality of predictive models that rely on LiDAR point cloud data, and (iii) determine the suitability of general versus species group-specific models for characterizing mixed coniferous and deciduous forests located in the southern United States. We use the Talladega Division of the Talladega National Forest as our case study area because it represents typical characteristics of natural, pine-dominated forests of the southeastern USA. National forest managers do not have comprehensive inventories of the full extent under their management, which limits the decision space for at-risk resources (e.g., wildland fire, forest health, and endangered species habitat condition) and makes it difficult to assess where management is needed or where it has achieved desired future conditions. Therefore, models, maps, and other outcomes derived from predictive models that are based on LiDAR (and other) data may provide forest managers, researchers, and policymakers with valuable insight to monitor and manage forests throughout the southeastern USA.
4. Discussion
Estimating forest attributes such as basal area, volume, and aboveground biomass and updating this information regularly are critical activities for forest management and planning efforts. This study attempted to improve the performance of forest attribute estimations for large areas using LiDAR point clouds and high-resolution, multispectral remotely sensed data. We investigated the effect of different combinations of remotely sensed data (LiDAR-only or LiDAR + NAIP) on the quality of regression models. We also evaluated how the quality of regression models depended on the classification of sampling plots according to species mixture. To avoid the overfitting that can result from multicollinearity between independent variables, the ALASSO method was employed during the modeling process, as its performance was supported by earlier studies [35,43]. In total, 12 models were developed for three forest attributes (basal area, volume, aboveground biomass) based on the species mixture (all species and pine) and the data source (LiDAR-only, LiDAR + NAIP). When the general models are applied to the case study landscape, broad-scale maps of these resource conditions can be visualized (Figure 4, Figure 5 and Figure 6).
The R²adj. values of the developed models ranged from 0.71 to 0.84, which were comparable to other ALS study results [3,34,44]. While many researchers have developed regression models for estimating forest attributes over relatively small forested areas [10,15,19,34,36], there have been few attempts to work in large areas such as U.S. National Forests. For instance, Leboeuf et al. [44] mapped the merchantable wood volume of a very large area (440,000 km²) using airborne LiDAR. However, that study had a few limitations, including significant temporal differences between the field measurement (2003-2018) and LiDAR data (2011-2020) collection periods, limited spatial resolution, and poor representativeness of sampling plots. In our study, by contrast, field and remote sensing datasets were collected during a relatively similar period of time (2020-2022). The distribution of sample plots is also important in developing robust regression models for forest attribute estimation. To enhance the quality of regression models, we used a stratified pseudo-random sampling design, classifying the entire operable study area with consideration of recent management activities, to improve the balance and representation of the heterogeneity of the forest. Additionally, we surveyed all tree species (both merchantable and non-merchantable, > 7.62 cm in dbh) within sampling plots to enhance the accuracy of regression models, as suggested by Brown et al. [45].
The R²adj. and R² values of the pine models were higher than those of the general models regardless of forest attribute. The overall quality metrics also indicated that the pine models performed better than the general models. Bouvier et al. [3] also confirmed that separate models may result in higher accuracy when compared to general models. Regarding forest attributes, the highest R²adj. values were observed in tree volume models and the lowest R²adj. values were observed in basal area models regardless of data sources and sampling plots (all plots or pine plots). This trend has also been observed in other studies using ALS systems [3,34,46,47]. Sumnall et al. [48] suggested that basal area models have relatively lower R² values than models estimating tree height and biomass. This is likely because, unlike tree height, dbh cannot be directly measured with ALS systems. Only dbh is used to estimate basal area, whereas both dbh and tree height are often needed to estimate volume and aboveground biomass [3]; therefore, it may be logical to observe that the basal area estimation models would underperform the volume and biomass models.
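The dependence of basal area on dbh alone can be made explicit with a short sketch. It uses the standard geometric definition of basal area as the cross-sectional area of a stem at breast height; the plot area argument and the function names are hypothetical, while the 7.62 cm threshold follows the survey description above.

```python
import math

MIN_DBH_CM = 7.62  # survey threshold stated in the text (trees > 7.62 cm dbh)


def tree_basal_area_m2(dbh_cm: float) -> float:
    """Cross-sectional stem area (m^2) for a diameter given in centimetres."""
    radius_m = dbh_cm / 200.0  # cm to m, then diameter to radius
    return math.pi * radius_m ** 2


def plot_basal_area_m2_per_ha(dbh_values_cm, plot_area_ha, min_dbh_cm=MIN_DBH_CM):
    """Sum tree basal areas on a plot and scale to a per-hectare value.

    plot_area_ha is a hypothetical parameter; the plot size used in the
    study is not given in this section.
    """
    if plot_area_ha <= 0:
        raise ValueError("plot area must be positive")
    total = sum(tree_basal_area_m2(d) for d in dbh_values_cm if d > min_dbh_cm)
    return total / plot_area_ha


# Three stems on a hypothetical 0.04 ha plot; the 5 cm stem is below threshold.
example_ba = plot_basal_area_m2_per_ha([20.0, 30.0, 5.0], plot_area_ha=0.04)
```

Because no height term appears anywhere in this calculation, a sensor that measures height well but cannot observe dbh directly has less leverage on basal area than on volume or biomass.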
There have been many recent attempts to enhance the performance of regression models for estimating forest conditions. They generally involve the development of LiDAR metrics [3,49], and perhaps supplementary remotely sensed data [34,45]. In this study, the vegetation indices derived from NAIP imagery were included as supplemental data. As the additional NAIP imagery had higher spatial resolution (0.3 m), we expected that the vegetation indices might improve the quality of regression models to a considerable extent. However, the addition of NAIP-derived vegetation metrics was not very influential in improving the quality of prediction models. Although LiDAR-only models had slightly larger RMSE values and slightly lower R² values than the LiDAR + NAIP models, the increase in the number of independent variables with the addition of NAIP metrics led us to conclude that LiDAR-only models were more appropriate for broad-scale mapping efforts.
Of the 74 metrics derived from LiDAR and NAIP data sources, 45 were selected as independent variables in at least one regression model. Further, five LiDAR-based metrics were selected for every regression model, and two NAIP-based metrics were selected for every regression model that included NAIP data (Table 7). These were, specifically, the LiDAR-based metrics pzabove2, zq95, iskew, ikurt, and p2th, and the NAIP-based metrics NDVIMIN and EVIPCT90. The pzabove2 and p2th metrics help eliminate the effect of understory vegetation that is not measured in a typical forest inventory survey. The zq95 metric is widely used to represent forest canopy height. In addition to height-related LiDAR metrics, intensity-related metrics, such as iskew and ikurt, were also crucial in developing regression models as they provide information related to stand density. The inclusion of NAIP-based NDVIMIN and EVIPCT90 can be explained by the concentration of active chlorophyll in pine tree crowns. Ozkan et al. [34] also confirmed that metrics such as these may be significantly correlated with forest attributes.
Although inclusion of NAIP-based metrics slightly improved the accuracy of the regression models we developed, we suggest that LiDAR-derived metrics may be sufficient for developing robust regression models used for operational forest management purposes. Interestingly, the maps of basal area, volume, and aboveground biomass show a similar general spatial pattern, which is due to the high correlation among these forest attributes (Figure 4, Figure 5 and Figure 6). Since the aboveground biomass of a forest is typically derived from growing stock, volume estimates often rely on dbh measurements, and basal area is directly related to dbh, these relationships may help in developing simple ratios between forest conditions without the need for additional, elaborate regression models. While this may potentially decrease the reliability of some of the outcomes, the practicality of the overall effort may be worth investigating further.
Figure 2.
The observed forest attributes versus the predicted forest attributes of the general models (n=254). (LiDAR+NAIP: A, C, E; LiDAR: B, D, F).
Figure 3.
The observed forest attributes versus the predicted forest attributes of the pine models (n=149). (LiDAR+NAIP: A, C, E; LiDAR: B, D, F).
Figure 4.
Estimated basal area for a portion of the study area based on general model (Sources: Esri. “World Topographic Map” [basemap]. January 31, 2024).
Figure 5.
Estimated volume per hectare for a portion of the study area based on general model (Sources: Esri. “World Topographic Map” [basemap]. January 31, 2024).
Figure 6.
Estimated aboveground biomass for a portion of the study area based on general model (Sources: Esri. “World Topographic Map” [basemap]. January 31, 2024).
Table 1.
Summary of vegetation indices calculated from the NAIP images.
Vegetation index | Equation | Calculated statistics and their abbreviations
Greenness | | Minimum of greenness (GMIN); Maximum of greenness (GMAX); Range of greenness (GRANGE); Mean of greenness (GMEAN); Standard deviation of greenness (GSTD); Sum of greenness (GSUM); Median of greenness (GMEDIAN); 90th percentile of greenness (GPCT90)
Normalized Difference Vegetation Index, NDVI | | Minimum of NDVI (NDVIMIN); Maximum of NDVI (NDVIMAX); Range of NDVI (NDVIRANGE); Mean of NDVI (NDVIMEAN); Standard deviation of NDVI (NDVISTD); Sum of NDVI (NDVISUM); Median of NDVI (NDVIMEDIAN); 90th percentile of NDVI (NDVIPCT90)
Enhanced Vegetation Index, EVI | | Minimum of EVI (EVIMIN); Maximum of EVI (EVIMAX); Range of EVI (EVIRANGE); Mean of EVI (EVIMEAN); Standard deviation of EVI (EVISTD); Sum of EVI (EVISUM); Median of EVI (EVIMEDIAN); 90th percentile of EVI (EVIPCT90)
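As an illustration of how the per-plot statistics in Table 1 could be derived, the sketch below computes NDVI with its widely used definition, (NIR - Red) / (NIR + Red), and then summarizes a set of pixel values. The percentile interpolation rule, the use of the population standard deviation, and the function names are assumptions; the equations actually used for greenness and EVI are not reproduced here.

```python
from statistics import mean, median, pstdev


def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index for one pixel."""
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total


def percentile(values, q):
    """q-th percentile using linear interpolation between order statistics."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (position - lower)


def summarize(values, prefix):
    """Return the eight Table 1 statistics, keyed by their abbreviations."""
    return {
        f"{prefix}MIN": min(values),
        f"{prefix}MAX": max(values),
        f"{prefix}RANGE": max(values) - min(values),
        f"{prefix}MEAN": mean(values),
        f"{prefix}STD": pstdev(values),  # population SD is an assumption
        f"{prefix}SUM": sum(values),
        f"{prefix}MEDIAN": median(values),
        f"{prefix}PCT90": percentile(values, 90),
    }


# Hypothetical pixel values for one plot.
plot_stats = summarize([0.1, 0.2, 0.3, 0.4, 0.5], "NDVI")
```

The same `summarize` helper could be applied to greenness or EVI pixel values by changing the prefix.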
Table 2.
Summary of the independent variables derived from LiDAR point cloud.
Metrics | Descriptions | Metrics | Descriptions
zmean | Mean height | zpcumx (x from 1st to 9th) | Cumulative percentage of returns in the xth layer
zsd | Standard deviation of height distribution | isd | Standard deviation of intensity
zskew | Skewness of height distribution | iskew | Skewness of intensity distribution
zkurt | Kurtosis of height distribution | ikurt | Kurtosis of intensity distribution
zentropy | Entropy of height distribution | ipground | Percentage of intensity returned by points classified as "ground"
pzabovezmean | Percentage of returns above zmean | ipcumzqx (x = 10th, 30th, 50th, 70th, and 90th) | Percentage of intensity returned below the xth percentile of height
pzabove2 | Percentage of returns above 2 m | pxth (x = 1, 2, 3, 4, and 5) | Percentage of xth returns
zqx (x from 5th to 95th) | xth percentile (quantile) of height distribution | pground | Percentage of returns classified as "ground"
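A few of the Table 2 metrics can be sketched directly from their descriptions. The code below is a simplified illustration for one plot, given a list of return heights and return numbers; the percentile interpolation rule and the function names are assumptions, and the software used in the study may define these quantities slightly differently.

```python
def percentile(values, q):
    """q-th percentile using linear interpolation between order statistics."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (position - lower)


def height_metrics(heights_m, return_numbers):
    """Compute zmean, zq95, pzabove2, pzabovezmean, and p2th for one plot."""
    n = len(heights_m)
    if n == 0 or n != len(return_numbers):
        raise ValueError("inputs must be non-empty and of equal length")
    zmean = sum(heights_m) / n
    return {
        "zmean": zmean,
        "zq95": percentile(heights_m, 95),
        # Share of returns higher than 2 m, which screens out low understory.
        "pzabove2": 100.0 * sum(z > 2.0 for z in heights_m) / n,
        "pzabovezmean": 100.0 * sum(z > zmean for z in heights_m) / n,
        # Share of all returns that are second returns.
        "p2th": 100.0 * sum(r == 2 for r in return_numbers) / n,
    }


# Hypothetical returns for one plot.
demo_metrics = height_metrics([0.5, 1.0, 3.0, 10.0, 20.0], [1, 1, 2, 1, 2])
```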
Table 3.
Descriptive statistics of forest attributes based on field measurements.
 | Diameter at breast height (cm) | Basal area (m² ha⁻¹) | Volume (m³ ha⁻¹) | Aboveground biomass (Mg ha⁻¹)
All plots (n = 254) | | | |
Average | 22.39 | 23.43 | 180.85 | 40.13
Standard deviation | 7.39 | 10.95 | 105.48 | 23.07
Minimum | 8.65 | 0.33 | 0.77 | 0.13
Maximum | 54.36 | 53.29 | 569.93 | 119.03
Pine plots (n = 149) | | | |
Average | 22.42 | 22.28 | 168.97 | 34.62
Standard deviation | 8.40 | 16.92 | 108.26 | 21.94
Minimum | 8.66 | 0.33 | 0.77 | 0.13
Maximum | 54.36 | 51.19 | 543.27 | 110.02
Table 4.
The best equations for the general (all plots, n=254) and the pine (pine plots, n=149) models by forest attribute and data sources.
Forest variables | Data sources | Equation
General models | |
Basal area | LiDAR + NAIP | -1.828 + 0.017 * pzabove2 + 0.015 * zq25 + 0.017 * zq95 - 7.83e-4 * zpcum5 - 0.002 * zpcum6 + 2e-5 * isd + 0.24 * iskew - 0.091 * ikurt + 0.024 * ipcumzq90 + 0.043 * p2th + 0.116 * GMIN + 0.52 * NDVIMIN + 0.428 * NDVIMEDIAN + 3.58e-5 * EVIMAX - 0.012 * EVIPCT90
Basal area | LiDAR | -1.197 + 0.018 * pzabove2 + 0.017 * zq25 + 1.13e-4 * zq30 + 0.015 * zq95 - 3.79e-4 * zpcum5 - 0.003 * zpcum6 + 1.83e-5 * isd + 0.196 * iskew - 0.105 * ikurt + 0.017 * ipcumzq90 + 0.046 * p2th
Volume | LiDAR + NAIP | -0.148 + 0.018 * pzabove2 + 0.004 * zq25 + 0.056 * zq95 - 7.91e-4 * zpcum5 - 0.00566 * zpcum6 + 2.51e-5 * isd + 0.237 * iskew - 0.128 * ikurt - 0.018 * ipcumzq10 - 0.002 * ipcumzq30 + 0.029 * ipcumzq90 - 0.002 * p1th + 0.029 * p2th + 0.091 * GMIN - 0.165 * GRANGE + 0.522 * NDVIMIN + 0.229 * NDVIMEAN + 1.45e-5 * NDVISUM + 0.158 * NDVIMEDIAN + 1.96e-4 * EVIMAX - 0.010 * EVIPCT90
Volume | LiDAR | 0.380 + 0.020 * pzabove2 + 0.007 * zq25 + 0.053 * zq95 - 0.006 * zpcum6 + 3.088e-5 * isd + 0.209 * iskew - 0.121 * ikurt - 0.020 * ipcumzq10 - 0.004 * ipcumzq30 + 0.020 * ipcumzq90 - 4.642e-5 * p1th + 0.037 * p2th
Aboveground biomass | LiDAR + NAIP | 0.033 + 0.021 * pzabove2 + 0.062 * zq95 - 0.004 * zpcum6 + 3.93e-5 * isd + 0.132 * iskew - 0.184 * ikurt - 0.011 * ipcumzq10 + 0.083 * ipcumzq90 - 0.006 * p1th + 0.031 * p2th - 1.51e-4 * pground + 0.385 * GMIN + 0.505 * NDVIMIN + 0.218 * NDVIMEAN + 9.36e-6 * NDVISUM - 0.005 * EVIPCT90
Aboveground biomass | LiDAR | 0.516 + 0.022 * pzabove2 + 0.238 * zq5 + 0.002 * zq25 + 0.059 * zq95 - 0.005 * zpcum6 + 3.83e-5 * isd + 0.103 * iskew - 0.198 * ikurt - 0.012 * ipcumzq10 + 0.078 * ipcumzq90 - 0.006 * p1th + 0.034 * p2th
Pine models | |
Basal area | LiDAR + NAIP | 0.657 + 0.009 * pzabove2 + 3.600 * zq5 + 0.021 * zq25 + 0.001 * zq40 + 0.007 * zq95 - 0.009 * zpcum5 + 0.173 * iskew - 0.067 * ikurt + 0.057 * p2th + 0.426 * NDVIMIN + 0.985 * NDVIMEDIAN - 8.6e-4 * EVIPCT90
Basal area | LiDAR | 5.195 + 0.011 * pzabove2 + 2.873 * zq5 + 0.021 * zq25 + 1.681e-4 * zq40 + 0.003 * zq95 - 0.008 * zpcum5 - 1.96e-4 * zpcum6 + 0.168 * iskew - 0.061 * ikurt - 0.004 * ipcumzq30 - 0.0475 * ipcumzq90 + 0.057 * p2th
Volume | LiDAR + NAIP | 1.593 - 0.002 * zkurt + 0.021 * pzabove2 + 8.41 * zq5 - 1.91 * zq15 + 0.019 * zq25 + 0.006 * zq40 - 0.008 * zq65 + 0.016 * zq75 - 0.071 * zq80 + 0.096 * zq95 + 0.010 * zpcum1 - 0.010 * zpcum5 - 0.005 * zpcum6 + 1.29e-5 * zpcum8 + 0.002 * zpcum9 + 3.04e-5 * isd + 0.241 * iskew - 0.16 * ikurt + 0.016 * ipcumzq10 - 0.008 * ipcumzq30 + 0.042 * p2th - 0.039 * p5th + 0.486 * GMIN - 0.132 * GRANGE + 0.475 * NDVIMIN + 1.28 * NDVIMEDIAN + 2.7e-4 * EVIMAX - 0.004 * EVISTD - 0.019 * EVIMEDIAN - 0.015 * EVIPCT90
Volume | LiDAR | 3.592 + 0.008 * pzabove2 + 3.401 * zq5 + 0.004 * zq25 + 0.043 * zq95 - 0.013 * zpcum5 + 0.222 * iskew - 0.094 * ikurt - 2.769e-4 * ipcumzq10 - 0.033 * ipcumzq30 + 0.047 * p2th + 0.013 * p3th
Aboveground biomass | LiDAR + NAIP | 6.466 - 0.005 * zkurt + 0.41 * zentropy + 0.025 * pzabove2 + 8.66 * zq5 - 0.037 * zq10 - 2.18 * zq15 + 0.017 * zq25 - 7.48e-4 * zq30 + 0.004 * zq40 - 0.001 * zq65 + 0.010 * zq75 - 0.079 * zq80 + 0.11 * zq95 + 0.013 * zpcum1 - 0.009 * zpcum5 - 0.005 * zpcum6 + 0.003 * zpcum9 + 4.75e-5 * isd + 0.078 * iskew - 0.175 * ikurt + 0.020 * ipcumzq10 - 0.008 * ipcumzq90 + 0.042 * p2th - 0.016 * p5th + 0.673 * GMIN - 0.094 * GRANGE + 0.171 * GSTD - 0.362 * GMEDIAN + 0.364 * NDVIMIN - 0.327 * NDVISTD + 1.51 * NDVIMEDIAN + 1.52e-4 * EVIMAX - 0.002 * EVISTD - 0.033 * EVIPCT90
Aboveground biomass | LiDAR | 10.699 + 0.013 * pzabove2 + 2.062 * zq5 + 0.052 * zq95 - 0.011 * zpcum5 + 0.010 * iskew - 0.129 * ikurt - 0.010 * ipcumzq30 - 0.024 * ipcumzq90 - 0.009 * p1th + 0.045 * p2th
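To show how one of the Table 4 equations might be applied to a raster cell or plot, the sketch below evaluates the general LiDAR-only basal area equation as a linear predictor. The coefficients are transcribed from Table 4, reading entries such as "1.13 * e-4" as 1.13e-4, which is an assumption. Whether the response is on the original or a transformed scale is not stated in this section, so no back-transformation is attempted, and the helper name is hypothetical.

```python
# General model, basal area, LiDAR-only (coefficients from Table 4).
BA_LIDAR_INTERCEPT = -1.197
BA_LIDAR_COEFFICIENTS = {
    "pzabove2": 0.018,
    "zq25": 0.017,
    "zq30": 1.13e-4,
    "zq95": 0.015,
    "zpcum5": -3.79e-4,
    "zpcum6": -0.003,
    "isd": 1.83e-5,
    "iskew": 0.196,
    "ikurt": -0.105,
    "ipcumzq90": 0.017,
    "p2th": 0.046,
}


def linear_predictor(metrics, intercept=BA_LIDAR_INTERCEPT,
                     coefficients=BA_LIDAR_COEFFICIENTS):
    """Intercept plus the coefficient-weighted sum of the required metrics.

    Raises KeyError if a metric the equation needs is missing, so that an
    incomplete metric set is not silently treated as zeros.
    """
    return intercept + sum(coef * metrics[name]
                           for name, coef in coefficients.items())


# With every metric at zero, the predictor reduces to the intercept.
zero_metrics = dict.fromkeys(BA_LIDAR_COEFFICIENTS, 0.0)
baseline = linear_predictor(zero_metrics)
```

Keeping the coefficients in a dictionary keyed by metric name makes it straightforward to swap in any of the other eleven equations.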
Table 5.
Summary of statistics for the general models.
Quality metrics | Basal area (m² ha⁻¹) | | Total volume (m³ ha⁻¹) | | Total aboveground biomass (Mg ha⁻¹) |
 | LiDAR + NAIP | LiDAR | LiDAR + NAIP | LiDAR | LiDAR + NAIP | LiDAR
R²adj. | 0.72 | 0.71 | 0.77 | 0.77 | 0.73 | 0.72
# of variables | 15 | 11 | 21 | 12 | 16 | 12
RMSE | 5.58 | 5.73 | 48.44 | 49.34 | 11.68 | 11.84
R² | 0.74 | 0.72 | 0.79 | 0.78 | 0.74 | 0.74
Bias | -0.78 | -0.80 | -6.27 | -6.54 | -1.62 | -1.64
Bias (%) | -3.33 | -3.40 | -3.45 | -3.61 | -4.03 | -4.08
AIC | 0.08 | -76.05 | -118.05 | -137.76 | -137.43 | -145.17
BIC | -68.19 | -38.28 | -47.95 | -96.66 | -83.13 | -104.02
CP | -17.21 | 0.08 | 0.11 | 0.12 | 0.12 | 0.13
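The quality metrics in Table 5 can be computed along the following lines. This is a sketch using common textbook definitions; the sign convention for bias (predicted minus observed) and the function names are assumptions. The definitions do appear consistent with the reported values: a bias of -0.78 against a mean basal area of 23.43 gives about -3.33%, and an R² of 0.74 with 254 plots and 15 predictors gives an adjusted R² of about 0.72.

```python
import math


def rmse(observed, predicted):
    """Root mean square error."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)


def bias(observed, predicted):
    """Mean of predicted minus observed (sign convention is an assumption)."""
    return sum(p - o for o, p in zip(observed, predicted)) / len(observed)


def bias_percent(observed, predicted):
    """Bias expressed as a percentage of the observed mean."""
    return 100.0 * bias(observed, predicted) / (sum(observed) / len(observed))


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n_obs, n_predictors):
    """Penalize R^2 for the number of predictors in the model."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Hypothetical observed and predicted values for four plots.
obs = [10.0, 20.0, 30.0, 40.0]
pred = [12.0, 18.0, 33.0, 37.0]
demo_rmse = rmse(obs, pred)
demo_r2 = r_squared(obs, pred)
```

The adjusted R² penalty explains why the LiDAR + NAIP models, which carry more predictors, gain less in R²adj. than in raw R².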
Table 6.
Analysis of 10-fold cross-validation for the general regression models.
Quality metrics | Basal area (m² ha⁻¹) | | Total volume (m³ ha⁻¹) | | Total aboveground biomass (Mg ha⁻¹) |
 | LiDAR + NAIP | LiDAR | LiDAR + NAIP | LiDAR | LiDAR + NAIP | LiDAR
R²adj. | 0.69 | 0.71 | 0.67 | 0.73 | 0.64 | 0.65
R² | 0.72 | 0.69 | 0.72 | 0.75 | 0.69 | 0.68
RMSE | 5.90 | 5.91 | 55.87 | 53.10 | 13.05 | 13.09
Bias | -0.76 | -0.75 | -4.06 | -7.56 | -1.12 | -1.21
Bias (%) | -3.20 | -3.20 | -2.26 | -4.12 | -2.63 | -2.85
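The 10-fold cross-validation summarized in Table 6 can be sketched as follows. Ordinary least squares is used here purely as a stand-in for the fitted models, the data are synthetic, and the function names, random seeds, and fold assignment are assumptions; the study's exact procedure is not described in this section.

```python
import numpy as np


def k_fold_indices(n_obs, k=10, seed=0):
    """Shuffle the observation indices and split them into k near-equal folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_obs), k)


def cross_validated_predictions(X, y, k=10, seed=0):
    """Predict every observation from a model fitted without its own fold."""
    n_obs = len(y)
    design = np.column_stack([np.ones(n_obs), X])  # add an intercept column
    predictions = np.empty(n_obs)
    for test_idx in k_fold_indices(n_obs, k, seed):
        train_mask = np.ones(n_obs, dtype=bool)
        train_mask[test_idx] = False
        # OLS stand-in: fit on the nine training folds only.
        coef, *_ = np.linalg.lstsq(design[train_mask], y[train_mask], rcond=None)
        predictions[test_idx] = design[test_idx] @ coef
    return predictions


# Synthetic, noise-free example with 254 observations (the number of plots).
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(254, 3))
y_demo = 2.0 + X_demo @ np.array([1.5, -0.5, 0.25])
cv_pred = cross_validated_predictions(X_demo, y_demo)
```

The held-out predictions can then be scored with the same RMSE, bias, and R² definitions used for the fitted models, which is what allows Tables 5 and 6 to be compared directly.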
Table 7.
The most important independent variables of the best regression models.
 | LiDAR metrics | NAIP metrics
General & pine model | pzabove2, zq95, iskew, ikurt, p2th | NDVIMIN, EVIPCT90
General model | zpcum6, isd, ipcumzq90 | GMIN
Pine model | zq5, zpcum5 | NDVIMEDIAN
Table 8.
Summary of statistics for the pine regression models.
Quality metrics | Basal area (m² ha⁻¹) | | Total volume (m³ ha⁻¹) | | Total aboveground biomass (Mg ha⁻¹) |
 | LiDAR + NAIP | LiDAR | LiDAR + NAIP | LiDAR | LiDAR + NAIP | LiDAR
R²adj. | 0.81 | 0.80 | 0.84 | 0.82 | 0.83 | 0.82
# of variables | 12 | 12 | 30 | 11 | 34 | 10
RMSE | 4.80 | 5.10 | 37.86 | 43.45 | 7.89 | 8.93
R² | 0.83 | 0.81 | 0.87 | 0.84 | 0.87 | 0.83
Bias | -0.72 | -0.75 | -3.65 | -6.71 | -0.68 | -1.36
Bias (%) | -3.23 | -3.38 | -2.12 | -3.92 | -1.93 | -3.89
AIC | -5.84 | -54.77 | -56.60 | -108.83 | -49.23 | -113.69
BIC | -22.09 | -21.01 | 17.08 | -77.80 | 31.20 | -85.33
CP | 0.07 | 0.08 | 0.06 | 0.11 | 0.06 | 0.10
Table 9.
Analysis of 10-fold cross-validation for the pine regression models.
Quality metrics | Basal area (m² ha⁻¹) | | Total volume (m³ ha⁻¹) | | Total aboveground biomass (Mg ha⁻¹) |
 | LiDAR + NAIP | LiDAR | LiDAR + NAIP | LiDAR | LiDAR + NAIP | LiDAR
R²adj. | 0.79 | 0.75 | 0.78 | 0.79 | 0.80 | 0.78
R² | 0.82 | 0.78 | 0.82 | 0.81 | 0.84 | 0.81
RMSE | 5.21 | 5.71 | 47.44 | 48.94 | 9.42 | 10.24
Bias | -0.72 | -0.99 | -4.41 | -8.43 | -0.97 | -0.92
Bias (%) | -3.17 | -4.33 | -2.48 | -4.92 | -2.21 | -2.71