An initial classification of urban vegetation is followed by a 3-D aggregation and an abstraction of individual tree crowns. The abstracted shapes finally provide compact information about the location, size, and spatial distribution of urban vegetation. In the following, we describe how each individual task has been realized.
2.1. Classification of the Urban Vegetation
The classification of urban vegetation works in two steps. Initially, outliers are detected and removed, and the remaining points are separated into ground points and non-ground points. Subsequently, trees are classified by fusing the RGB components with the local maximum criteria [49].
Outliers are present in LiDAR data due to external factors such as birds, suspended particles in the atmosphere, shiny metals, and highly reflective surfaces, among others. Removing them is an important task because it shortens the overall processing time. Outliers can be detected and removed by analyzing each point’s neighborhood. For a given point pi, the distances to its closest neighbors are calculated. The aim is to estimate the average of these distances and their sample standard deviation [50].
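The neighborhood analysis described above can be illustrated with a minimal sketch of a statistical outlier filter. It is not the implementation of [50]: the function name `remove_outliers`, the neighbor count `k`, and the multiplier `n_sigma` are hypothetical, and the brute-force neighbor search is only suitable for small examples.

```python
import math
from statistics import mean, stdev


def remove_outliers(points, k=3, n_sigma=1.0):
    """Drop points whose mean distance to their k nearest neighbors is unusually large.

    `k` and `n_sigma` are illustrative assumptions, not values from the paper.
    """
    # Mean distance from each point to its k closest neighbors (brute force).
    mean_dists = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        mean_dists.append(mean(dists[:k]))

    # Global average and sample standard deviation of those mean distances.
    mu = mean(mean_dists)
    sigma = stdev(mean_dists)

    # Keep only the points that stay within the tolerated band.
    limit = mu + n_sigma * sigma
    return [p for p, d in zip(points, mean_dists) if d <= limit]


# A small planar grid of returns plus one isolated return far above it.
grid = [(float(x), float(y), 0.0) for x in range(3) for y in range(3)]
cloud = grid + [(50.0, 50.0, 50.0)]
cleaned = remove_outliers(cloud, k=3, n_sigma=1.0)
```

In this toy cloud the isolated return is the only point whose mean neighbor distance exceeds the limit, so `cleaned` holds the nine grid points.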
By filtering ground points from the LiDAR data, the complexity of the urban scene is reduced. Many ground filtering approaches are available, and they are categorized as slope-based methods [51,52], mathematical morphology-based methods [53,54,55], and surface-based methods [56,57,58,59]. The slope-based techniques are not robust to complex terrain, while the performance of the mathematical morphology-based methods depends on the design of elaborate local operators. Surface-based methods can approximate the ground terrain robustly and without tedious parameter tuning. Here, the ground points are removed from the LiDAR data N by simulating the gravitational action of a piece of cloth C that covers an inverted N [60]. A point is classified as a ground point if its distance to C is less than a predefined threshold. Otherwise, it is classified as a non-ground point Xi (Figure 3b,d).
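The final thresholding step of the cloth-based filter can be sketched as follows. The cloth simulation itself is not reproduced here: `cloth_height` is a hypothetical callable assumed to return the height of the already simulated cloth C, mapped back to the original height reference, and the threshold of 0.5 m is only a placeholder.

```python
def split_ground(points, cloth_height, threshold=0.5):
    """Separate ground from non-ground points by their vertical distance to the cloth.

    `cloth_height(x, y)` and `threshold` are assumptions made for illustration.
    """
    ground, non_ground = [], []
    for x, y, z in points:
        # A point close to the cloth surface is taken as ground.
        if abs(z - cloth_height(x, y)) < threshold:
            ground.append((x, y, z))
        else:
            non_ground.append((x, y, z))
    return ground, non_ground


# Toy example: a flat cloth at height 0 m under two low points and one tree return.
sample = [(0.0, 0.0, 0.1), (1.0, 0.0, 0.3), (1.0, 1.0, 8.0)]
ground, non_ground = split_ground(sample, lambda x, y: 0.0, threshold=0.5)
```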
Complementary to the filtering task, an initial classification is executed on the point cloud to address the problem of false positives and false negatives among the vegetation objects. After filtering the LiDAR data, the point cloud mainly contains trees, buildings, and smaller human-made objects (Figure 4a,c). Triangulated irregular networks (TIN) can be used to search for points with height values below 5 m and 30 m [61]. They can also help assign values to non-ground points and null values to points erroneously filtered as non-ground (Figure 4b,d).
The flatness and ruggedness of the objects are estimated using tolerance measures of 0.1 and 0.3, respectively. Vegetation objects are associated with higher reflectance values in the green G component, while higher reflectance values in the red R component are associated with building objects. Points labeled as vegetation whose higher values are in the R and blue B components can be considered false negatives of buildings and false positives of vegetation. The radiometric resolution of the corresponding RGB image provides important information to distinguish trees from buildings.
In this work, the ALS data were texturized using the available orthoimage. Beforehand, the corresponding RGB orthoimage was radiometrically transformed from 8 to 16 bits. Consequently, the range of the RGB coordinates was extended from {0, …, 255} to {0, …, 65,535}, corresponding to $2^{16}$ levels. Points with low height variations relative to their neighbors were then discarded as vegetation points by checking for the existence of points with maximum height in all local neighborhoods, using a search window of size 0.5 m [62].
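The local maximum check can be sketched as below. This is one possible reading of the search window, namely a square of side 0.5 m centered on each point; the function name `local_maxima` and that interpretation are assumptions, and the procedure of [62] may differ.

```python
def local_maxima(points, window=0.5):
    """Return the points that are the highest within a square window centered on them.

    The square-window interpretation is an assumption made for this sketch.
    """
    half = window / 2.0
    maxima = []
    for x, y, z in points:
        # Heights of all points inside the window (the point itself included).
        heights = [
            pz for px, py, pz in points
            if abs(px - x) <= half and abs(py - y) <= half
        ]
        if z >= max(heights):
            maxima.append((x, y, z))
    return maxima


# Three returns on one crown and one isolated return farther away.
returns = [(0.0, 0.0, 5.0), (0.1, 0.1, 4.0), (0.2, 0.0, 3.0), (2.0, 2.0, 1.0)]
tops = local_maxima(returns, window=0.5)
```

Only the highest crown return and the isolated return survive; the two lower crown returns have a higher neighbor inside their window.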
All local maxima are validated as vegetation; otherwise, the points are treated as building false negatives. The criteria used for each labeled point Xi are summarized in Table 1, where XiR, XiG, and XiB represent the R, G, and B reflectance values for the point Xi, and b denotes the radiometric resolution of the image.
At the end of the process, the resulting point cloud U = {u1, u2, …, un} contains only true positives and false negatives of vegetation (Figure 5b,d).
2.3. Proposed 3D Aggregation with Abstraction Purposes
The aim of our 3-D aggregation approach is to decrease the spatial density of the trees while maintaining their original structure, with high computational efficiency. The approach is based on Gestalt measures, such as the spatial relations of proximity and similarity, which, combined with a deep learning shape abstraction method, enable us to compact information about the location, size, and spatial distribution of urban vegetation. The goal was to optimize the merging of pairs of adjacent trees, so that the spatial density of the trees decreases appropriately under the proposed Gestalt measures.
In this paper, the proposed 3-D aggregation operation was executed in two steps. Initially, we calculated the distances for all pairs of trees. Subsequently, the thresholds (i.e., height, length, and approximate area) were analyzed for the pairs that satisfy the proximity condition. We then computed a matrix in which the rows correspond to the trees resulting from the 3-D aggregation and the columns to the recalculated attributes. Given a pair of trees Vi and Vj, the horizontal distance d(Vi, Vj) between their centroids, and a predefined distance threshold td, our goal is to estimate the cluster of trees Gp that best describes the proximity criterion [d(Vi, Vj) ≤ td, (Vi, Vj) ∈ Gp].
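The proximity criterion can be illustrated with a short sketch that lists every pair of trees whose centroids are at most td apart in the horizontal plane. The function name `proximity_pairs`, the representation of trees as (x, y) centroids, and the threshold of 5 m in the example are assumptions made for illustration.

```python
import math
from itertools import combinations


def proximity_pairs(centroids, t_d):
    """Return index pairs (i, j) with horizontal centroid distance d(Vi, Vj) <= t_d."""
    pairs = []
    for i, j in combinations(range(len(centroids)), 2):
        (xi, yi), (xj, yj) = centroids[i], centroids[j]
        # Horizontal (planimetric) distance between the two centroids.
        if math.hypot(xi - xj, yi - yj) <= t_d:
            pairs.append((i, j))
    return pairs


# Three tree centroids; only the first two are within 5 m of each other.
centroids = [(0.0, 0.0), (3.0, 4.0), (20.0, 0.0)]
g_p = proximity_pairs(centroids, t_d=5.0)
```

The pairwise loop is quadratic in the number of trees; for large tree sets a spatial index would be the more usual choice.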
We also used a similarity criterion for the 3-D aggregation of the pairs of adjacent trees (Vi, Vj) ∈ Gs. Under this criterion, each of the following ratios between Vi and Vj must lie between a minimum and a maximum threshold value:
- the ratio between the relative tree heights;
- the ratios between the lengths of the trees in X coordinates (Cx) and in Y coordinates (Cy);
- the ratio between the approximate areas of the trees (Sapr).
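The similarity criterion amounts to checking that each attribute ratio falls inside its threshold interval. The sketch below assumes trees are stored as dictionaries with hypothetical keys `height`, `cx`, `cy`, and `area`, and strictly positive attribute values; the threshold intervals are placeholders and not the values used in the paper.

```python
def ratio_within(a, b, t_min, t_max):
    """True when the ratio a / b lies inside [t_min, t_max] (b assumed positive)."""
    return t_min <= a / b <= t_max


def is_similar(tree_i, tree_j, thresholds):
    """Apply the ratio test to every attribute listed in `thresholds`."""
    return all(
        ratio_within(tree_i[key], tree_j[key], t_min, t_max)
        for key, (t_min, t_max) in thresholds.items()
    )


# Placeholder intervals (assumed for this example only).
thresholds = {
    "height": (0.5, 2.0),
    "cx": (0.5, 2.0),
    "cy": (0.5, 2.0),
    "area": (0.5, 2.0),
}

tree_a = {"height": 10.0, "cx": 5.0, "cy": 5.0, "area": 20.0}
tree_b = {"height": 8.0, "cx": 4.0, "cy": 5.0, "area": 18.0}
tree_c = {"height": 2.0, "cx": 4.0, "cy": 5.0, "area": 18.0}

similar_ab = is_similar(tree_a, tree_b, thresholds)
similar_ac = is_similar(tree_a, tree_c, thresholds)
```

Here `tree_a` and `tree_b` pass all four tests, while `tree_a` and `tree_c` fail on the height ratio.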
Note that Gs is a cluster obtained from the similarity criterion. It is composed of pairs of adjacent trees with similar heights and approximate surface areas, such that Gs ⊂ Gp. In the particular case where all neighboring trees are identical, Gs = Gp. One of the key challenges when aggregating such objects is related to the limits of the vegetation cover area, which range between 5 m × 5 m (upper limit for LoD 3) and 50 m × 50 m (lower limit for LoD 1). Furthermore, erroneous aggregations can result from the preliminary suppression of the existing buildings.
To overcome this problem, we used different threshold values, assuming the CityGML 3.0 specifications for LoD 2 point clouds. We used two strategies to assign threshold values to the Cx, Cy, and Sapr components. First, we considered the similarity between the two horizontal spacing limits adopted in the previous segmentation (e.g., 5 m and 7 m). Second, we used the value of 1.5 m as the upper limit for the ratio of surface areas, as in [42], whose approach focused on the footprint areas of buildings to be aggregated. Since LoD 2 has no specifications for the height dimensions of point cloud objects, we adopted separate minimum and maximum values for the ratio of relative heights. The minimum and maximum threshold values used for all parameters are presented in Figure 7.
As a result, we have a set of aggregated trees and a set of non-aggregated trees. From the structural attributes of each tree, we calculated two metrics that quantify the degree of fragmentation GF of the existing tree set and the average internal distance GD between the trees, as follows:

$$G_F = \frac{A_D}{S_T}, \qquad G_D = \frac{\sum d(V_i, V_j)}{N_P}$$

where:
- AD represents the number of trees delimited within the area of study;
- $S_T$ is the total surface area of the existing tree canopy;
- $\sum d(V_i, V_j)$ denotes the sum of all horizontal distances between adjacent trees;
- $N_P$ is the total number of existing pairs of adjacent trees.
Note that GF is measured in trees per square meter of green area, and GD is measured in meters. The 3-D data aggregation reduces fragmentation (RF) due to the clustering and increases tree spreading (ADA), as the smallest internal distances are suppressed in the clustering operation. Therefore, we calculated the variation of GF and GD using the expressions:

$$R_F = \frac{G_{Fo} - G_{Fg}}{G_{Fo}} \times 100, \qquad A_{DA} = \frac{G_{Dg} - G_{Do}}{G_{Do}} \times 100$$

where:
- GFg denotes the degree of fragmentation of the set of trees resulting from the 3-D aggregation;
- GFo is the degree of fragmentation of the set of trees before the 3-D aggregation;
- GDg represents the degree of dispersion of the set of trees resulting from the 3-D aggregation;
- GDo denotes the degree of dispersion of the set of trees before the 3-D aggregation;
- RF is the percent reduction in the degree of fragmentation obtained after the 3-D aggregation;
- ADA denotes the percentage increase in the degree of dispersion obtained after the 3-D aggregation.
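A small worked example may help to read these quantities. It assumes, from the stated units and definitions, that GF is the number of trees divided by the total canopy area, that GD is the mean horizontal distance over adjacent pairs, and that RF and ADA are relative changes expressed in percent. The function names and all input numbers are hypothetical.

```python
def fragmentation(n_trees, canopy_area_m2):
    """GF: trees per square meter of green area (assumed form)."""
    return n_trees / canopy_area_m2


def dispersion(pair_distances_m):
    """GD: average horizontal distance between adjacent trees, in meters (assumed form)."""
    return sum(pair_distances_m) / len(pair_distances_m)


def reduction_percent(before, after):
    """RF: percent reduction from `before` to `after`."""
    return 100.0 * (before - after) / before


def increase_percent(before, after):
    """ADA: percent increase from `before` to `after`."""
    return 100.0 * (after - before) / before


# Hypothetical tree set before (o) and after (g) the 3-D aggregation.
gf_o = fragmentation(10, 500.0)      # 10 trees on 500 m^2 of canopy
gf_g = fragmentation(6, 500.0)       # 6 trees remain after merging
gd_o = dispersion([4.0, 6.0, 8.0])   # distances between adjacent pairs, before
gd_g = dispersion([8.0, 10.0])       # the smallest distances were suppressed

rf = reduction_percent(gf_o, gf_g)
ada = increase_percent(gd_o, gd_g)
```

With these made-up inputs, fragmentation drops by about 40% and dispersion grows by 50%, which matches the direction of change described in the text.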
Other challenges when storing, analyzing, and visualizing LiDAR data are related to memory requirements. However, even without powerful computers, one can still abstract the aggregated trees into a compact, structured point cloud. An abstraction task is needed to reduce memory requirements and computing times. Toward this goal, we adopted the deep learning compression method of [48]. Learning a set of local feature descriptors [43] and reconstructing the original aggregated trees from the embedding provides an abstracted point cloud, as shown in Figure 8.