1. Introduction
Malaysia is considered one of the most forest-carbon-rich countries in the world owing to its extensive forested areas and the carbon-rich nature of its forests. Several factors contribute to this status: vast tropical rainforests, high plant diversity, extensive peat swamp forests, and extensive mangrove ecosystems along its coastlines. Although Malaysia’s forests are rich in carbon, they have faced challenges such as deforestation, habitat loss, and land-use change driven by factors such as palm oil production and logging [1]. Efforts to balance economic development with forest conservation are ongoing, and the preservation of these carbon-rich ecosystems is of global importance in the fight against climate change. Malaysia has implemented various conservation measures and forest management practices to protect its forests and their carbon stocks [2]. These include establishing protected areas and national parks. Given these circumstances, forests in Malaysia are highly diverse in terms of stand conditions and thus biomass carbon.
Malaysia, like many other countries, recognizes the importance of its forests in mitigating climate change. The country has made commitments under international agreements like the United Nations Framework Convention on Climate Change (UNFCCC) to reduce emissions from deforestation and forest degradation (REDD+) [3]. Malaysia has also been involved in carbon offset projects, where the country can earn carbon credits by reducing deforestation and forest degradation, as well as implementing reforestation and afforestation initiatives.
In addition to being an essential part of forest ecosystems, forest biomass is also important for mitigating climate change, storing carbon, and preserving biodiversity [4]. The integration of statistical data with geospatial information boosts the power of the data, resulting in a much greater understanding of social, economic, and environmental issues than viewing the statistical or geospatial information in isolation [5]. Accurate assessment and forecasting of forest biomass are essential for understanding the effects of climate change, managing forests, and accounting for carbon emissions. Remote sensing technologies, such as the Landsat satellites, have revolutionised the prediction of forest biomass by providing crucial insights into the characteristics of forests and changes in land cover.
Landsat satellites, launched by NASA and the U.S. Geological Survey, have been providing high-resolution, multispectral imagery of the Earth’s surface since 1972 [6]. Landsat data have been widely employed for various environmental and land-use applications owing to their long-term archive, consistent data quality, and global coverage [7]. Landsat satellites capture data in different spectral bands, allowing researchers to analyse land cover, vegetation, and biomass across diverse landscapes. The entire historical Landsat archive has been open for public access since 2008 [8]. As such, the Landsat archive has become one of the most valuable and cost-effective remotely sensed data sources supporting worldwide land and forest research and monitoring activities.
Among the advantages of using Landsat for biomass estimation are [9]: (i) large coverage, from specific landscapes and regional scales to the global scale; (ii) temporal consistency, which allows long-term monitoring of biomass change in time series; and (iii) sensitivity to environmental changes, as Landsat data can capture changes in forest biomass caused by disturbances (e.g., forest fires and logging) and climate-related stressors. This sensitivity enables a better understanding of the impacts of these changes on forest ecosystems.
Although Landsat data are a valuable resource for monitoring and estimating forest biomass, they pose some challenges for biomass estimation in tropical regions, especially Malaysia. Among the biggest challenges is cloud cover [10]. Estimating forest biomass from optical satellite data is also difficult for several reasons. One of the main reasons is that optical sensors are sensitive to the amount of light reflected by the vegetation, which is influenced by the structure and density of the forest canopy. However, the relationship between the amount of light reflected and the biomass is not straightforward, as it can be affected by factors such as the species composition, age, and health of the trees [11]. Moreover, clouds and atmospheric conditions can interfere with optical data acquisition, which can lead to incomplete or inconsistent data [12]. Another important limiting factor for direct biomass carbon modelling is the lack of repeated and coincident field reference data at different times [13].
Several attempts have been made to overcome these limitations, and the approaches taken fall into two categories: (i) diversifying the use of spectral and vegetation indices [14] and (ii) applying machine learning and statistical models [15]. The indices are used as predictor variables to estimate forest biomass indirectly. Machine learning techniques [16,17], including Random Forest [18], Support Vector Machines (SVM), and artificial neural networks (ANN) [19], as well as regression models, have been combined with Landsat data to predict forest biomass. These models use spectral information, vegetation indices, and other environmental variables to establish relationships between the data and biomass estimates [20]. These techniques have demonstrated their efficiency in predicting forest biomass at various scales, from local to regional. Another popular solution is to combine Landsat-based data with datasets from other sensors [14], both optical and synthetic aperture radar (SAR) [21,22,23,24], and even to integrate them with light detection and ranging (LiDAR)-based data [25,26]. Each approach, however, comes with its own difficulties and challenges.
This study aimed to produce reliable aboveground carbon (AGC) estimates at the national scale that are pixel-based and wall-to-wall, at an acceptable spatial resolution, and derived from a single satellite with consistent observations able to represent the forest types and physical conditions of the forests over time. The Google Earth Engine (GEE) platform was used to derive the Aboveground Carbon Density Indicator (ACDI), to conduct the correlation analysis, and to produce seamless mosaic images over Malaysia. The estimated AGC is mapped at 30-m pixel resolution for all forests across Malaysia. The map helps in quantifying the carbon stored as biomass at any location. It also helps pinpoint areas with low AGC or degraded areas, which are becoming increasingly important for developing baselines for carbon-related, nature-based solutions under climate change mitigation initiatives such as the nationally determined contribution (NDC) under the Paris Agreement and carbon offsetting for industrial sectors.
2. Materials and Methods
2.1. The Study Area
This study was conducted over the entire forests in Malaysia. Malaysia is a country in Southeast Asia, located just north of the Equator. It is composed of two non-contiguous regions: Peninsular Malaysia and East Malaysia. The country has a total area of about 330,803 km². Malaysia currently has about 18 million ha of forests [27]. These forests are rich in diverse flora and fauna species. Major forest types in Malaysia are lowland dipterocarp forest, hill dipterocarp forest, upper hill dipterocarp forest, oak-laurel forest, montane ericaceous forest, peat swamp forest and mangrove forest. In addition, there are also smaller areas of freshwater swamp forest, Melaleuca swamp forest, heath forest, transitional forest, forest on limestone and forest on quartz ridges. Considering the composition of these forests, they can be generalised into three types: dry inland, peat swamp and mangrove forests.
Timber is also one of Malaysia’s commodities, and State Governments depend greatly on forest resources to generate and sustain their economies [28]. Malaysia practises sustainable forest management (SFM) to balance timber production with conservation efforts. This approach aims to maintain forest carbon stocks while allowing responsible logging: only merchantable timber is harvested, subject to controlled cutting limits. There are also forest plantations, established with selected timber tree species, developed to support timber supplies and meet industrial demands.
2.2. Methodology
The framework of methodology was developed based on six major pillars, which are (i) collection of field datasets at sample plots, (ii) derivation of ACDI, (iii) correlation analysis, (iv) production of seamless mosaic images, (v) forest delineation and forest types classification, and (vi) map production. The first challenge was to match the field data collection date with the derived ACDI from the Landsat images. Google Earth Engine was used to execute this calculation. Cloud cover was another issue to deal with when working with Landsat data, as it can obscure the land surface and affect the quality of image analysis. To address the cloud cover problem in Landsat data over Malaysia, GEE was again used. GEE provides a powerful tool for mapping and analysing geospatial data, including the use of regression to identify trends in data and create ACDI. In brief, steps ii and iv above were performed on the GEE platform, while the remaining processes were conducted separately by using image processing and GIS software, i.e., ERDAS Imagine®, Exelis ENVI Software, and Esri’s ArcGIS Desktop.
Figure 1.
Flowchart of the methodology adopted in the study.
2.3. Collection of Field Inventory Data
Sampling work began in 2012 at several locations, focusing on lowland and hill dipterocarp forests in Peninsular Malaysia [29]. The work was carried out intermittently, depending on the research projects available from then until 2023, and eventually covered all forest types in Malaysia (Table 1). The forest inventory followed a stratified random design, in which sampling plots were distributed according to forest type and covered all stand conditions (i.e., virgin forest, totally protected areas, logged forests, secondary forest, and degraded areas). This ensured that the full variation in biomass carbon was captured in the samples. The locations of the sample plots are depicted in Figure 8.
The sampling design in this study was modified from the standard operating procedure (SOP) developed by Winrock International [30], which follows the IPCC standards [31]. The design that produced the highest accuracy for the forestry parameters was then modified and adapted to forest stand conditions suited to Malaysia’s environment and management practices [32,33]. Three sampling designs were used, corresponding to dry inland forest, peat swamp forest and mangrove forest. The sampling plots were arranged in clusters. In cluster sampling, the population is first split into groups according to forest type and stratum, and a random sample of clusters is then chosen. Cluster sampling is a probability sampling method used when the population is large and geographically dispersed.
2.3.1. Design for dry inland forest
A cluster comprises four sampling plots, and the distance between plots is 100 m, as shown in Figure 2. Each plot is circular, with smaller nests inside. The biggest nest measures 20 m in radius, followed by smaller nests measuring 12 m and 4 m (Figure 3). Trees are measured according to the nest sizes, as summarised in Table 2. Because the size class measured depends on the nest, not all trees are measured in a single plot. In addition to these nests, there is another small nest measuring 2 m in radius, which is used to count saplings (i.e., trees measuring < 10 cm in diameter at breast height (dbh) and ≥ 1.3 m in height). Clustering multiple plots at one sampling unit allows field crews to sample a larger area per sampling point. The sampling system is designed to make data collection easier, faster, more reliable and representative of a forest stratum. The distance of each tree from the plot centre is controlled by using Distance Measurement Equipment (DME), which uses sonar waves to communicate with a transponder installed at the centre of the plot. Therefore, the nests with their particular radii are not physically marked on the ground.
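The nest radii above determine how tree-level measurements scale to a per-hectare basis. The following is a minimal sketch of the usual expansion-factor calculation for circular nests; the function and dictionary names are hypothetical, the dbh classes assigned to each nest (Table 2) are not reproduced, and the exact scaling used by the authors is not stated in the text.

```python
import math

# Nest radii (m) as described in the text. The dbh class measured in each
# nest comes from Table 2 and is not reproduced here.
NEST_RADII_M = {"large": 20.0, "medium": 12.0, "small": 4.0, "sapling": 2.0}


def expansion_factor(radius_m):
    """Number of circular nests of this radius that make up one hectare."""
    nest_area_m2 = math.pi * radius_m ** 2
    return 10_000.0 / nest_area_m2


def per_hectare(tree_values, radius_m):
    """Scale the sum of tree-level values measured in one nest to a per-hectare value."""
    return sum(tree_values) * expansion_factor(radius_m)


factors = {name: expansion_factor(r) for name, r in NEST_RADII_M.items()}
```

Under this assumption, a tree measured in the 4 m nest represents roughly 25 times more trees per hectare than one measured in the 20 m nest, which is why smaller trees can be sampled in smaller nests.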
2.3.2. Design for peat swamp forest
Peat swamp forests are terrestrial wetland ecosystems with low nutrient levels and highly acidic soil (pH less than 4.0) [34]. Ecologically, peat swamp forests have organic soil horizons, or peat, that can receive water and nutrients exclusively from flooding and groundwater or from rainfall. In the tropics, peat formation is influenced by high rainfall rates, minimal drainage, and high temperatures with little seasonal change. According to [35], peat swamp forests are typically submerged during the rainy season, which encourages anaerobic conditions that influence the rates and pathways of decomposition and accumulation. Peat soils in tropical ecosystems are described as being at least 50 cm thick and having an organic matter content greater than 65% [Rieley & Page]. The peat swamp forest ecosystem is distinctly different from inland and mangrove forests. Therefore, the sampling design for peat swamp forests differs from that of other forests, although the approach and concept for field data collection and sampling are similar. The sampling technique for peat swamp forest is adopted from [36], and the layout of sampling plots is depicted in Figure 5. Trees are measured according to the nest sizes, as summarised in Table 3.
Figure 4.
Layout of a cluster for peat swamp forests.
2.3.3. Design for mangrove forest
Mangroves are defined as an association of halophytic trees, shrubs and other plants growing in brackish to saline tidal waters of tropical and subtropical coastlines [37]. Mangroves are generally restricted to the tidal zone. As such, mangroves in fringe areas will be inundated by practically all high tides, while those at the higher topographic boundaries may be flooded only during the highest of tides (spring tides) or during storm surges. Mangroves are typically found along tropical and subtropical coastlines between about 25° N and 25° S.
Mangrove is another forest ecosystem that is entirely different from inland and peat swamp forests. Mangrove forest has its own habitat, which is unique in terms of ecology, stand structure and species composition. Therefore, the sampling method is designed specifically for mangroves, although the approach and concept of field data collection are similar to those for peat swamp forests. The sampling is organised in clusters, each comprising 6 plots (Figure 6). The sampling technique for mangrove forest is adopted from [38], and the layout of sampling plots is depicted in Figure 7. Trees were measured according to the nest sizes, as summarised in Table 4.
Biomass carbon was estimated using published allometric equations from the literature suited to the corresponding forest types in Malaysia. The aboveground biomass (AGB) of the sampled trees in each plot was estimated first and then converted to AGC. The tree-level AGB estimates were scaled to the plot level and reported as mass per unit area, in megagrams (Mg, i.e., metric tonnes) per hectare (Mg ha⁻¹). This estimate was then converted to AGC by multiplying the AGB by 0.47, the constant carbon fraction [31], and reported in Mg C ha⁻¹.
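The unit conversion described above can be sketched as follows. The tree values and plot area are hypothetical and serve only to illustrate the arithmetic; only the 0.47 carbon fraction comes from the text.

```python
CARBON_FRACTION = 0.47  # constant carbon fraction stated in the text


def plot_agb_mg_per_ha(tree_agb_kg, plot_area_ha):
    """Sum tree-level AGB (kg) and express it in Mg per hectare."""
    total_mg = sum(tree_agb_kg) / 1000.0  # 1 Mg = 1000 kg
    return total_mg / plot_area_ha


def agc_from_agb(agb_mg_per_ha):
    """Convert AGB (Mg/ha) to AGC (Mg C/ha) with the constant carbon fraction."""
    return agb_mg_per_ha * CARBON_FRACTION


# Hypothetical example: three trees measured in a 0.1 ha plot
agb = plot_agb_mg_per_ha([1200.0, 850.0, 430.0], plot_area_ha=0.1)
agc = agc_from_agb(agb)
```

With these illustrative numbers the plot holds 2480 kg of biomass on 0.1 ha, i.e., 24.8 Mg ha⁻¹ of AGB, which corresponds to about 11.66 Mg C ha⁻¹.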
The AGB of dry inland forest was estimated using an allometric equation developed by [39] for inland forest. The allometric equation is expressed as follows:
where AGB denotes the estimated biomass of a tree (kg tree⁻¹), D is the diameter at breast height (dbh) of each tree (cm), ρ is the wood specific gravity or wood density (the typical average value for all Southeast Asian tree species is 0.57 g cm⁻³ [40]), and E is a bioclimatic variable, which is available at http://chave.upstlse.fr/pantropicalallometry.htm
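The equation body for inland forest is not reproduced in this text. Assuming that [39] refers to the pantropical model that uses the bioclimatic variable E distributed at the address above (an assumption, since the reference list is not shown here), that model is commonly written as:

```latex
\mathrm{AGB} = \exp\!\left[-1.803 - 0.976\,E + 0.976\,\ln(\rho) + 2.673\,\ln(D) - 0.0299\,\bigl(\ln D\bigr)^{2}\right]
```

Here AGB, D, ρ and E carry the meanings and units given above. The coefficients should be checked against [39] before use.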
The allometric equation for the estimation of AGB in peat swamp forest is given in [36] and is expressed as
and the allometric equation adopted for the calculation of AGB in mangrove forest [38] is expressed as
where ρ is the wood specific gravity or wood density (the average value for all mangrove tree species is 0.752 g cm⁻³).
2.4. Production of Seamless Mosaics, Cloud-Free Images over Malaysia
The production of cloud-free images at the national level requires a substantial amount of time and resources. While conventional methods offer flexibility and control over processing, they are often time-consuming and may be impractical for large-scale projects. Google Earth Engine streamlines the entire process, making it efficient, scalable, and accessible for a wide range of users. The cloud-free images over Malaysia were therefore produced using GEE. In this study, a Top-of-Atmosphere (ToA) cloud-free mosaic image of Malaysia for the year 2023 was generated using Landsat 8 and Landsat 9 imagery obtained from the "LANDSAT/LC08/C02/T1_TOA" and "LANDSAT/LC09/C02/T1_TOA" collections. Imagery from both Landsat-8 and -9 was used to reduce the effect of cloud cover, since Malaysia is located in the equatorial region and is covered by clouds for much of the time.
The first step in generating cloud-free images over Malaysia was to select, from the Landsat-8 and -9 image collections, the images acquired over Malaysia in 2023. This step ensures that only relevant imagery over the study area for the year 2023 is considered. Cloud masking was then applied to the selected images using the "QA_PIXEL" band. This band was used to mask pixels containing dilated clouds, cirrus clouds, and cloud shadows. This cloud masking process was crucial for excluding cloudy or obscured pixels, resulting in a cleaner and more accurate composite image. The composite image was generated by taking the median value of each pixel across the selected cloud-masked images. The median composite method was chosen because it is simple to calculate, robust against outliers, and able to reduce the influence of noise and artefacts in the final image. Finally, the cloud-free mosaic image for Malaysia in the year 2023 was created by mosaicking the individual median composite images.
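The masking and compositing logic can be illustrated outside GEE for a single pixel location. The sketch below is plain Python rather than the GEE script used in the study. It assumes the Landsat Collection 2 QA_PIXEL bit layout (bit 1 dilated cloud, bit 2 cirrus, bit 4 cloud shadow) and masks only the three flags named in the text. The observation values are hypothetical.

```python
from statistics import median

# Bit positions assumed from the Landsat Collection 2 QA_PIXEL layout.
DILATED_CLOUD_BIT = 1
CIRRUS_BIT = 2
CLOUD_SHADOW_BIT = 4

# Only the three flags named in the text are masked here.
MASK_BITS = (1 << DILATED_CLOUD_BIT) | (1 << CIRRUS_BIT) | (1 << CLOUD_SHADOW_BIT)


def is_clear(qa_pixel):
    """True when none of the masked QA flags is set."""
    return (qa_pixel & MASK_BITS) == 0


def median_composite(observations):
    """Median of the clear reflectances for one pixel location.

    observations: list of (reflectance, qa_pixel) pairs, one per acquisition date.
    Returns None when every observation is masked.
    """
    clear = [refl for refl, qa in observations if is_clear(qa)]
    return median(clear) if clear else None


# Hypothetical time series for one pixel: two of five dates are contaminated.
obs = [(0.31, 0b00000), (0.80, 0b00010), (0.29, 0b00000),
       (0.05, 0b10000), (0.33, 0b00000)]
composite = median_composite(obs)
```

The bright dilated-cloud value (0.80) and the dark shadow value (0.05) are dropped before the median is taken, so the composite reflects only the three clear observations.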
2.5. Forest Cover and Types Classifications
Forest is defined as “a portion of land larger than 0.5 ha and has trees with a height of more than five (5) metres and has a tree canopy cover of more than 10 percent or with trees that can meet these criteria”. This definition is based on the UN Food and Agriculture Organization’s (FAO) definition of a forest, which is adopted by the Malaysian government under the Laws of Malaysia - National Forestry Act 1984 (Amended, 2006). However, there are different types of forests in Malaysia, such as inland mixed dipterocarp forest, peat swamp forest, and mangrove forest, which have different characteristics and functions. Therefore, the definition of a forest may vary depending on the context and the purpose of the classification. Inland mixed dipterocarp forest, which is divided into several zones according to land elevation, i.e., lowland dipterocarp forest (< 300 m), hill dipterocarp forest (300–750 m), upper-hill dipterocarp forest (750–1200 m), oak-laurel forest (1200–1500 m), and montane ericaceous forest (> 1500 m), is dominant in Malaysia [41]. All dryland forests are included in this category, including all primary and secondary forests that meet the defined threshold. It thus also includes the dwarf montane and sub-montane forests growing on the thin soils of mountain summits and ridges in the interior of the peninsula. The dry inland forest in Malaysia is mostly dominated by trees from the Dipterocarpaceae family, hence the term ‘dipterocarp’ forest. The dipterocarp forest occurs on dry land from just above sea level to an altitude of about 900 m. The family is so called because its fruits have seeds with two wings (di = two; ptero = wing; carp = seed) [42]. This forest is also generally referred to as inland forest.
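The elevation zones listed above translate directly into a lookup rule, for example when labelling sample plots from a digital elevation model. This is a minimal sketch and the function name is hypothetical. The text does not say which zone a boundary value (300, 750, 1200 or 1500 m) belongs to, so the treatment of boundaries below is an assumption.

```python
def inland_forest_zone(elevation_m):
    """Return the inland forest zone for an elevation in metres.

    Thresholds follow the zones listed in the text. Assigning each boundary
    value to the higher zone (and 1500 m to oak-laurel) is an assumption.
    """
    if elevation_m < 300:
        return "lowland dipterocarp forest"
    if elevation_m < 750:
        return "hill dipterocarp forest"
    if elevation_m < 1200:
        return "upper-hill dipterocarp forest"
    if elevation_m <= 1500:
        return "oak-laurel forest"
    return "montane ericaceous forest"


# Hypothetical plot elevations
zones = [inland_forest_zone(e) for e in (120, 480, 900, 1350, 1800)]
```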
Peat swamp forest refers to tropical and subtropical forest areas located landward of the swampy coastal forest, where peatlands and less saline soils are present. This tropical swamp forest is a unique wetland ecosystem that combines the characteristics of peat swamp and tropical rainforest and has developed over thousands of years. Mangrove, on the other hand, refers to coastal and estuarine areas where the forest is influenced by the tides. It is a tidal forest in which the genera Rhizophora, Bruguiera and Avicennia are most common. Mangrove trees grow in swampy areas at river mouths, where freshwater and seawater meet.
Smaller sections of casuarina/beach forest, freshwater swamp forest, Melaleuca swamp forest, heath forest, limestone forest, and quartz ridge forest are also present. In Sabah, there is another vegetation zone, known as sub-alpine vegetation, which occurs only at elevations of > 3500 m a.s.l., near the peak of Mount Kinabalu [32].
In this study, forests are divided into three major ecosystem types: inland mixed dipterocarp forest, peat swamp forest, and mangrove forest. Before interpreting and classifying forests on the Landsat images, it is important to understand the situation and management practices of Malaysia’s forest sector. Having a variety of secondary data on hand is advantageous and can speed up the classification process. To ensure that the classification is done correctly, spatial information such as Permanent Reserve Forest (PRF) boundaries, management regimes, and locations of various ecosystems are necessary. In this case, the image classification was performed to delineate forests from other land features. Image classification was executed on the seamless mosaic image to delineate these forest types. The training areas were manually created based on visual interpretation aided by the sampling plots information. Maximum likelihood image classification algorithm was utilised to execute the classification.
The most difficult aspect of image classification was dealing with large amounts of data and producing classification results with minimum uncertainty [43]. The raster classification results were converted to shapefile vector format (.shp) for further analysis and post-classification processing. The shapefile was then edited and refined manually to ensure that the classification results were clean and covered only the forested areas.
2.6. Development of ACDI
The ACDI is a metric developed on the premise that there is a direct correlation between the density of a forest’s canopy, i.e., the amount of foliage and branches in its upper layers, and the quantity of carbon stored in the forest’s biomass. This relationship is rooted in the principle that a denser canopy typically implies a more extensive and carbon-rich vegetation structure. The ACDI is used to estimate the amount of carbon stored in a forest, which is important for evaluating forest carbon sink capacities. As such, the ACDI serves as a valuable tool for estimating the amount of carbon sequestered in a forest ecosystem by analysing its AGC. The development of the ACDI is based on the Forest Canopy Density (FCD) model that was established by [44] and modified by [45,46,47]. An inspection of this model found that ambiguities exist in grassland and shrubland, especially in burn-scar areas, where the FCD has higher values than in forested areas [48]. This effect needs to be eliminated, and the only solution is to suppress the values to a level that is representative of the actual physical condition on the ground. Therefore, the model is further modified in this study to develop the ACDI, which can be expressed as
where each image variable is summarised in Table 5. The calculation was conducted by using Top of Atmosphere (ToA) reflectance values.
The vegetation indices used in the ACDI were chosen with care to highlight the forest areas, distinguish them from other features, and show how the forests vary under different circumstances. The Normalised Difference Vegetation Index (NDVI) is a widely-used metric for quantifying the health and density of vegetation using sensor data. The Shadow Index (SI) is used to derive information about various landscape phenomena, including vegetation health and land classifications. However, the specific purpose or application of the Shadow Index is for detecting and correcting for shadows in optical satellite imagery. On the other hand, the Normalised Burn Ratio (NBR) is a radiometric measure of burn severity that was originally developed using Landsat Thematic Mapper data. The NBR is a widely used index for monitoring environmental changes, particularly those related to fire intensity and burn severity.
The Soil-Adjusted Vegetation Index (SAVI) is a vegetation index designed to minimise the influence of soil brightness on the vegetation signal. It is particularly useful in areas where vegetative cover is low. In contrast, the iron oxide (IO) ratio can be used to estimate the presence of iron oxide in various landscapes, such as wetlands. The IO ratio is also used as a geological index for identifying rock features that have experienced oxidation of iron-bearing sulphides. In this case, however, the IO was included in the equation to differentiate forest cover, especially in wetland areas [56]. The Modified Normalised Difference Water Index (MNDWI), on the other hand, is a spectral index used for several purposes, such as the enhancement of open water features; it is particularly useful in built-up areas because it can reduce or even remove built-up land. It is also used to analyse water bodies such as rivers, lakes, and dams. In this case, the MNDWI was included to diminish built-up area features that are often confused with open water in other indices. Finally, the Enhanced Vegetation Index (EVI) was included in the equation as one of the multiplicative indicators in the denominator. This "optimised" vegetation index aims to improve vegetation monitoring by decoupling the canopy background signal and minimising atmospheric impacts, hence increasing the sensitivity of the vegetation signal in high-biomass regions. It thus enhances the signal of vegetation health and density.
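For reference, several of the indices named above have widely used standard band-ratio definitions, sketched here for a single pixel of ToA reflectance. The band roles follow common Landsat usage, and the reflectance values are hypothetical. The exact formulations in Table 5, the SI and IO definitions, and the way the indices are combined into the ACDI are not reproduced in this text, so this is an illustration rather than the authors' implementation.

```python
def ndvi(nir, red):
    """Normalised Difference Vegetation Index."""
    return (nir - red) / (nir + red)


def nbr(nir, swir2):
    """Normalised Burn Ratio."""
    return (nir - swir2) / (nir + swir2)


def mndwi(green, swir1):
    """Modified Normalised Difference Water Index."""
    return (green - swir1) / (green + swir1)


def savi(nir, red, soil_factor=0.5):
    """Soil-Adjusted Vegetation Index with the commonly used factor L = 0.5."""
    return (1.0 + soil_factor) * (nir - red) / (nir + red + soil_factor)


def evi(nir, red, blue):
    """Enhanced Vegetation Index with the standard coefficients."""
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)


# Hypothetical ToA reflectances for a dense forest pixel
px = {"blue": 0.03, "green": 0.05, "red": 0.04,
      "nir": 0.40, "swir1": 0.15, "swir2": 0.06}

indices = {
    "NDVI": ndvi(px["nir"], px["red"]),
    "NBR": nbr(px["nir"], px["swir2"]),
    "MNDWI": mndwi(px["green"], px["swir1"]),
    "SAVI": savi(px["nir"], px["red"]),
    "EVI": evi(px["nir"], px["red"], px["blue"]),
}
```

For this illustrative forest pixel, NDVI, NBR, SAVI and EVI are all strongly positive while MNDWI is negative, which is the pattern that separates dense vegetation from water and built-up surfaces.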
The ACDI equation was then applied to the Landsat-8 Operational Land Imager (OLI) imagery for the year 2023. This process is similar to the production of a seamless mosaic of Landsat images over Malaysia as described earlier, with an additional step that applies the ACDI formula to the image. This process was also conducted on the GEE platform.
2.7. Development of AGC Estimation Models
The linear relationship between AGC and the ACDI is a fundamental connection in the assessment of carbon content in terrestrial ecosystems. AGC represents the total carbon stored in the aboveground biomass of trees. ACDI, on the other hand, is a metric used to express this carbon content relative to a unit of area, typically per hectare or square metre. The extraction process was conducted on the GEE platform where a specific program code was created to extract the ACDI values from Landsat data that match the date (or year) of the field inventory data. This is to ensure that the value of AGC is true at the specific time, because the forest can change over time.
The linear relationship between AGC and ACDI is straightforward: as the aboveground carbon content increases in a given area, the ACDI value for that area also increases proportionally. Simple linear regression is a statistical method used to estimate the relationship between two quantitative variables. It is preferred over other regression models for measuring the strength of the relationship between AGC and ACDI, and it is also appropriate when only one independent variable (i.e., ACDI) is available. In this case the ACDI is the predictor of AGC, and the linear relationship between the two variables can be expressed as
$$y = m\,x$$
where y denotes AGC, m is the slope, and x is the ACDI. The intercept is fixed at 0, which means that the line passes through the origin (0, 0) of the plane: the ACDI is 0 when AGC is 0, i.e., where there is no vegetation (cleared land and water bodies).
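For a line constrained to pass through the origin, the least-squares slope has the closed form m = Σxy / Σx². The sketch below applies it to hypothetical plot data; the values are illustrative only, and the text does not state which software the authors used for the fit.

```python
def slope_through_origin(x, y):
    """Least-squares slope m for the no-intercept model y = m * x."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / sxx


def predict_agc(acdi_value, m):
    """Predicted AGC (Mg C/ha) for a given ACDI value."""
    return m * acdi_value


# Hypothetical calibration plots: ACDI values and field-measured AGC (Mg C/ha)
acdi = [10.0, 20.0, 30.0, 40.0]
agc = [31.0, 58.0, 92.0, 119.0]

m = slope_through_origin(acdi, agc)
agc_at_25 = predict_agc(25.0, m)
```

Because the intercept is fixed at zero, a pixel with an ACDI of 0 is always predicted to hold no aboveground carbon, matching the cleared-land and water case described above.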
2.8. Models Validation
Some of the sample plot data were set aside for validation (Table 1). The validation plots are those measured recently, in 2023, to match the AGC map produced for that year. To check the accuracy of the estimates, the root mean square error (RMSE) was calculated. Here, accuracy is a measure of the error between the AGC predicted from the ACDI and the actual AGC measured on the ground. The calculation can be expressed as follows:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(AGC_{p,i} - AGC_{r,i}\right)^{2}}$$
where RMSE is the root mean square error of the estimated AGC (± Mg C ha⁻¹), AGCp and AGCr are the predicted and reference AGC, respectively, and n is the sample size (i.e., number of validation plots).
In addition to the RMSE, the accuracy of the estimates was also measured in terms of the symmetric mean absolute percentage error (SMAPE). SMAPE is a commonly used metric for measuring the percentage accuracy between forecasted and actual values. It is used in particular to assess the performance of a forecasting model, and it treats over- and under-estimates symmetrically. The adjusted SMAPE values typically range from 0% to 100% [57]. A lower SMAPE indicates better forecast accuracy, while a higher SMAPE indicates a less accurate forecast. SMAPE is calculated as follows:
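The two validation metrics can be sketched as below. The exact adjusted SMAPE form used in [57] is not shown in this text; the sketch assumes the common variant bounded between 0% and 100%, in which each absolute error is divided by the sum of the absolute predicted and reference values. The validation values are hypothetical.

```python
import math


def rmse(predicted, reference):
    """Root mean square error between predicted and reference AGC."""
    n = len(reference)
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(predicted, reference)) / n)


def smape_adjusted(predicted, reference):
    """Adjusted SMAPE in percent, bounded between 0 and 100 (assumed variant).

    Pairs where both values are zero would divide by zero and must be
    excluded beforehand.
    """
    n = len(reference)
    total = sum(abs(p - r) / (abs(p) + abs(r)) for p, r in zip(predicted, reference))
    return 100.0 * total / n


# Hypothetical validation plots (Mg C/ha)
agc_p = [150.0, 100.0, 90.0]   # predicted from the ACDI
agc_r = [170.0, 110.0, 80.0]   # measured in the field

error_rmse = rmse(agc_p, agc_r)
error_smape = smape_adjusted(agc_p, agc_r)
```

With these illustrative numbers the RMSE is about 14.1 Mg C ha⁻¹ and the adjusted SMAPE is about 5.6%.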
2.9. Thematic Map Production
The empirical equations derived from the regression analysis were applied to the ACDI images to produce estimated AGC according to forest type. Because a separate model was developed for each forest type, the models were applied three times, once each for dry inland forest, peat swamp forest, and mangrove forest. Each resulting AGC image was then clipped to the extent of its forest type. The three images were then mosaicked into a single image containing the AGC values for all forest types. The mosaicked product was a single-layer image with pixel values representing AGC at 30-m resolution. This image constitutes a wall-to-wall map of AGC throughout Malaysia. By using this map, AGC at any location can be determined and statistics of AGC within any polygon can be extracted.
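The apply, clip and mosaic sequence, and the extraction of a carbon total from the resulting map, can be sketched on a toy list of pixels. The slopes below are hypothetical placeholders, not the fitted models from this study. The only fixed quantity is the pixel area, since a 30 m × 30 m pixel covers 0.09 ha.

```python
PIXEL_AREA_HA = 30 * 30 / 10_000  # a 30 m pixel covers 0.09 ha

# Hypothetical slopes of the per-forest-type models (AGC = m * ACDI).
SLOPES = {"inland": 3.0, "peat": 2.0, "mangrove": 1.5}


def agc_map(acdi_pixels, type_pixels):
    """Apply the model matching each pixel's forest type.

    Pixels outside the three forest classes become None (no data).
    """
    out = []
    for acdi, forest_type in zip(acdi_pixels, type_pixels):
        if forest_type in SLOPES:
            out.append(SLOPES[forest_type] * acdi)
        else:
            out.append(None)
    return out


def total_carbon_mg(agc_pixels):
    """Total carbon (Mg C) over the valid pixels of an AGC map in Mg C/ha."""
    return sum(v * PIXEL_AREA_HA for v in agc_pixels if v is not None)


# Toy 4-pixel scene
acdi = [50.0, 40.0, 60.0, 10.0]
types = ["inland", "peat", "mangrove", "non-forest"]

agc = agc_map(acdi, types)
total = total_carbon_mg(agc)
```

Selecting the model per pixel from the forest-type layer is equivalent to computing three full images, clipping each to its class, and mosaicking the pieces. The same per-pixel sum, restricted to the pixels inside a polygon, gives the zonal statistics mentioned above.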
4. Conclusions
Based on the estimates, a 30-metre resolution, wall-to-wall map of AGC across the entire forested region of Malaysia has been produced from Landsat satellite imagery. The ACDI was calibrated and validated by using 12 years of inventory data. Forest types were divided into three classes: dry inland, peat swamp and mangrove forests. The total AGC in all types of forests in Malaysia was estimated at 3.0 billion Mg C. The accuracy of the estimates was assessed, and the overall accuracy attained was about 80%. The AGC statistics for all forest types were presented for all regions of Malaysia. These estimates were also broken down and reported at the state level. The image classification carried out to delineate forest cover produced a map showing that the forest cover in Malaysia was about 18 million ha in 2023. The average AGC values estimated for dry inland, peat swamp and mangrove forests are 171.45 ± 67.00 Mg C ha⁻¹, 109.51 ± 60.78 Mg C ha⁻¹, and 91.50 ± 76.18 Mg C ha⁻¹, respectively. It was also found that the ACDI responds differently to AGC across forest types.
Landsat data have proven to be a valuable resource for forest biomass prediction, offering insights into forest ecosystems and their response to environmental changes. Combining Landsat data with advanced modelling techniques, cloud-based platforms such as GEE, and other advanced technologies has enhanced the ability to estimate biomass accurately. As technology and methodologies continue to evolve, Landsat data will likely remain a pivotal tool for monitoring and managing forest resources in the context of climate change and environmental conservation. Further research is needed to address challenges, refine methodologies, and improve the accuracy of forest biomass predictions using Landsat data.
In recent years, scrutiny of carbon projects in the international voluntary markets has created demand for more accurate and rigorous assessment of data to (i) support evidence of additionality through documented forest loss or degradation; (ii) support robust quantification of GHG emissions where data are used to estimate deforestation or degradation rates at the project, subnational, and national levels; (iii) assess non-permanence risks, including site susceptibility to natural hazards; and (iv) support evidence of co-benefits, where in some cases geospatial data are used for biodiversity profiles.
The use of remote sensing and GIS analysis allows nature-based carbon project developers to assess the feasibility of their projects more cost-effectively. Landsat data will allow project developers to identify degraded areas and design remedial measures more effectively. This study can be expanded to generate time-series assessments at intervals of at least 2 years [71,72]. Such data will also facilitate the subsequent carbon verification process and help ensure the validity and accountability of emissions data and the success of emissions reduction projects by confirming that the emissions reductions are permanent and genuine.
This study can potentially be used for national and subnational mitigation efforts, including REDD+ implementation. REDD+ is built on the principle of additionality against a baseline or reference emission level (FRL/FREL), with no displacement of emissions to neighbouring areas (leakage). A consistent monitoring and reporting system that works across scales is therefore important for operationalizing REDD+, both to ensure that emissions are not displaced and to avoid potential double counting. The generation of subnational or jurisdictional FRLs and FRELs will enable the Government to develop more effective mitigation measures for achieving the Malaysian Nationally Determined Contribution and offers the potential to scale up emissions reductions more rapidly with greater environmental integrity. More than 73 countries have implemented a carbon pricing instrument, CPI (an emission trading scheme and/or a carbon tax), as a means of bringing down emissions and driving investment into cleaner options [73]. Allocation under these instruments is based on the historical emission intensity of the targeted sectors. This study may potentially serve as a baseline study for determining allocation for the forestry sector if a CPI is implemented in Malaysia.
Although the study has successfully provided estimates of AGC for the whole of Malaysia, some limitations remain that could be addressed in future work. The spatial resolution of Landsat data, currently 30 m, can affect the accuracy of biomass predictions, particularly in heterogeneous landscapes. Combining Landsat data with data from other remote sensing platforms (e.g., LiDAR, SAR) can improve the accuracy of biomass predictions. Continuous calibration and validation of biomass prediction models are also crucial to ensure their accuracy and reliability, and these processes are expected to become a requirement in the future, especially for carbon projects at the state or project level.
In conclusion, the availability of comprehensive inventory data is instrumental in revealing the correlation patterns between aboveground carbon and the image variables extracted from Landsat data [74]. This complementary relationship between ground-based measurements and remote sensing imagery enables better comprehension of the dynamics of terrestrial carbon sequestration. With a wealth of inventory data available, a more holistic understanding can be gained of how various ecological and environmental factors influence aboveground carbon stocks. This knowledge not only enriches our understanding of the planet's carbon balance but also supports informed decisions for sustainable land management and climate change mitigation.
Figure 2. Layout of a cluster for inland forest.
Figure 3. Layout of a sampling plot for inland forest.
Figure 5. Layout of a sampling plot for peat swamp forests.
Figure 6. Layout of a cluster for mangroves.
Figure 7. Layout of a sampling plot for mangrove forest.
Figure 9. Boxplots summarising the sample plots data.
Figure 10. Seamless, cloud-free Landsat mosaic image over Malaysia for the year 2023.
Figure 11. Histogram of ACDI distribution over Malaysia.
Figure 12. Map showing spatial distribution of ACDI over Malaysia, derived from the Landsat mosaic images.
Figure 13. Scatterplots of correlations between AGC and ACDI for all forest types.
Figure 14. Histogram of AGC distribution over Malaysia.
Figure 15. Map showing spatial distribution of AGC over Malaysia for the year 2023.
Figure 16. Summary of AGC in dry inland forest within all states in Malaysia.
Figure 17. Summary of AGC in mangrove forest within particular states in Malaysia.
Figure 18. Summary of AGC in peat swamp forest within particular states in Malaysia.
Figure 19. Map showing locations of the selected areas.
Figure 20. Maps showing spatial distribution of AGC over selected dry inland forest landscapes.
Figure 21. Maps showing spatial distribution of AGC over selected mangrove forest landscapes.
Figure 22. Maps showing spatial distribution of AGC over selected peat swamp forest landscapes.
Figure 23. Validation scatterplots for the assessment of models’ performance.
Table 1. Summary of the total number of sample plots.
| Forest type | No. of sample plots used for modelling | No. of sample plots used for validation | Total |
|---|---|---|---|
| Dry inland forest | 2,970 | 350 | 3,320 |
| Peat swamp forest | 1,125 | 75 | 1,200 |
| Mangrove forest | 1,750 | 50 | 1,800 |
| Total | 5,845 | 475 | 6,320 |
Table 2. Summary of living tree measurements in a plot in inland forest.
| Nest radius (m) | Size | Tree size, dbh (cm) |
|---|---|---|
| 2 | Sapling | < 5 cm (and ≥ 1.3 m in height) |
| 4 | Small | 5–14.9 cm |
| 12 | Medium | 15–29.9 cm |
| 20 | Large | ≥ 30 cm |
Table 3. Summary of living tree measurements in a plot in peat swamp forests.
| Nest radius (m) | Size | Tree size, dbh (cm) |
|---|---|---|
| 2 | Sapling | < 5 cm (and ≥ 1.3 m in height) |
| 4 | Small to medium | 5–9.9 cm |
| 10 | Large | ≥ 10 cm |
Table 4. Summary of living tree measurements in a plot in mangrove forest.
| Nest radius (m) | Size | Tree size, dbh (cm) |
|---|---|---|
| 2 | Sapling | < 5 cm (and ≥ 1.3 m in height) |
| 7 | Small to large | ≥ 5 cm |
Table 5. Image variables that were used to develop ACDI.
| Image variable | Full name | Formula | Reference |
|---|---|---|---|
| NDVI | Normalised Difference Vegetation Index | (NIR − R)/(NIR + R) | [49] |
| NBR | Normalised Burn Ratio | (NIR − SWIR)/(NIR + SWIR) | [50] |
| SI | Shadow Index | [(1 − B)(1 − G)(1 − R)]^(1/3) | [51] |
| SAVI | Soil-Adjusted Vegetation Index | [(NIR − R)/(NIR + R + L)] × (1 + L) | [52] |
| IO | Iron Oxide Index | R/B | [53] |
| MNDWI | Modified Normalised Difference Water Index | (G − SWIR)/(G + SWIR) | [54] |
| EVI | Enhanced Vegetation Index | GF × [(NIR − R)/(NIR + C1 × R − C2 × B + L)] | [55] |
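The formulas in Table 5 translate directly into per-pixel functions. The sketch below is illustrative only and makes several assumptions that are not stated in the table: band values are surface reflectance scaled to 0–1, the SAVI soil factor is L = 0.5, the EVI constants are GF = 2.5, C1 = 6, C2 = 7.5, and L = 1 (commonly used defaults), and the choice of SWIR band for NBR and MNDWI is left to the caller. It does not reproduce how the variables are combined into the ACDI.

```python
def ndvi(nir, r):
    """Normalised Difference Vegetation Index."""
    return (nir - r) / (nir + r)


def nbr(nir, swir):
    """Normalised Burn Ratio."""
    return (nir - swir) / (nir + swir)


def shadow_index(b, g, r):
    """Shadow Index: cube root of the product of (1 - visible reflectances)."""
    return ((1 - b) * (1 - g) * (1 - r)) ** (1 / 3)


def savi(nir, r, l=0.5):
    """Soil-Adjusted Vegetation Index; l = 0.5 is an assumed default."""
    return (nir - r) / (nir + r + l) * (1 + l)


def iron_oxide(r, b):
    """Iron Oxide Index (red to blue ratio)."""
    return r / b


def mndwi(g, swir):
    """Modified Normalised Difference Water Index."""
    return (g - swir) / (g + swir)


def evi(nir, r, b, gf=2.5, c1=6.0, c2=7.5, l=1.0):
    """Enhanced Vegetation Index; constants are assumed defaults."""
    return gf * (nir - r) / (nir + c1 * r - c2 * b + l)


# Toy reflectance values for a single vegetated pixel (hypothetical).
pixel = {"b": 0.03, "g": 0.05, "r": 0.04, "nir": 0.40, "swir": 0.15}
indices = {
    "NDVI": ndvi(pixel["nir"], pixel["r"]),
    "NBR": nbr(pixel["nir"], pixel["swir"]),
    "SI": shadow_index(pixel["b"], pixel["g"], pixel["r"]),
    "SAVI": savi(pixel["nir"], pixel["r"]),
    "IO": iron_oxide(pixel["r"], pixel["b"]),
    "MNDWI": mndwi(pixel["g"], pixel["swir"]),
    "EVI": evi(pixel["nir"], pixel["r"], pixel["b"]),
}
print(indices)
```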
Table 6. Basic statistics of the sample plots data (AGC in Mg C ha−1).
| Forest type | No. of samples (n) | Min | Lower quartile | Median | Mean | Upper quartile | Max | Outliers |
|---|---|---|---|---|---|---|---|---|
| Inland forest | 2,970 | 0.0 | 56.3 | 92.9 | 115.4 | 158.2 | 310.5 | 554.1 |
| Peat swamp forest | 1,125 | 0.0 | 30.2 | 65.1 | 80.3 | 107.7 | 222.9 | 525.7 |
| Mangrove forest | 1,750 | 0.0 | 18.8 | 43.8 | 60.0 | 85.5 | 184.6 | 360.3 |
Table 7. Extents of forests in Malaysia produced from image classification (2023).
| Forest type | Extent (ha) | Percentage (%) |
|---|---|---|
| Dry inland forest | 16,859,417 | 93.3 |
| Mangrove forest | 547,564 | 3.0 |
| Peat swamp forest | 655,422 | 3.6 |
| Total | 18,062,403 | 100.0 |
Table 8. Basic statistics of ACDI values over Malaysia for the year 2023.
| Min | Max | Mean | Median | Mode | Std. Dev. |
|---|---|---|---|---|---|
| 0.00 | 198.18 | 25.34 | 22.46 | 19.36 | 14.77 |
Table 9. Summary of AGC estimation models derived from the regression analysis.
| Forest type | Empirical equation* | Coefficient of determination (r²) |
|---|---|---|
| Overall forest types | AGC = 2.1187 × ACDI | 0.4897 |
| Dry inland forest | AGC = 3.3763 × ACDI | 0.6275 |
| Peat swamp forest | AGC = 2.3133 × ACDI | 0.5787 |
| Mangrove forest | AGC = 1.0815 × ACDI | 0.6230 |
Table 10. Basic statistics of AGC values (Mg C ha−1) throughout Malaysia for the year 2023.
| Min | Max | Mean | Median | Mode | Std. Dev. |
|---|---|---|---|---|---|
| 0.00 | 448.79 | 126.72 | 151.35 | 59.83 | 61.98 |
Table 11. Summary of AGC in all states in Malaysia for the year 2023.
Table 12. Summary of AGC in selected areas, representing various conditions and types of forests in Malaysia.
Table 13. Accuracies of the AGC predictions.
| Forest type | RMSE (± Mg C ha−1) | SMAPE (%) | Absolute accuracy (%) | Overall performance |
|---|---|---|---|---|
| Dry inland forest | 87.54 | 22.66 | 77.34 | Underestimate |
| Mangrove forest | 53.15 | 22.86 | 77.14 | Overestimate |
| Peat swamp forest | 22.51 | 15.15 | 84.85 | Underestimate |
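The accuracy measures in Table 13 can be illustrated with a short sketch. In the table, absolute accuracy equals 100 minus SMAPE for every forest type. The exact SMAPE variant used in the study is not given here, so the common form with the mean of observed and predicted values in the denominator is assumed. Labelling the overall performance by the sign of the mean error is likewise an assumption, and the validation values are toy numbers, not data from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error, in the units of the inputs (Mg C/ha)."""
    squared = [(p - o) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def smape(observed, predicted):
    """Symmetric mean absolute percentage error (%), assumed common variant."""
    terms = [
        abs(p - o) / ((abs(o) + abs(p)) / 2)
        for o, p in zip(observed, predicted)
        if o != 0 or p != 0  # skip pairs where the ratio is undefined
    ]
    return 100 * sum(terms) / len(terms)


def overall_performance(observed, predicted):
    """Label the model by the sign of its mean error (assumed convention)."""
    bias = sum(p - o for o, p in zip(observed, predicted)) / len(observed)
    return "Overestimate" if bias > 0 else "Underestimate"


# Toy validation plots (Mg C/ha): field-measured versus map-predicted AGC.
observed = [100.0, 150.0, 200.0]
predicted = [90.0, 160.0, 180.0]

error = rmse(observed, predicted)
s = smape(observed, predicted)
absolute_accuracy = 100 - s  # relation visible in Table 13
label = overall_performance(observed, predicted)
print(error, s, absolute_accuracy, label)
```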