1. Introduction
Polygonal shape fitting is a key problem in computer vision and computer graphics with several applications including object recognition, computational cartography, signal summarization, and compression [
1,
2,
3,
4]. When curves are modeled, straight-line segments are usually preferred due to their simplicity. The polygonal fitting process saves memory space, reduces rendering time on graphics applications, and gives a more compact representation of the original shape.
According to the classical polygonal approximation problem of a 2D shape, the goal is to compute an N-vertex polygonal curve
P that approximates the boundary
B of the original shape
S according to a predefined error criterion. In the classical polygonal approximation, the vertices of
P are an ordered sub-sequence of the boundary points
B [
1]. The classical polygonal shape fitting provides poor results, especially when the number of vertices is low and the shape complexity is high. In this work, we study a general version of polygonal fitting called Unconstrained Polygonal Fitting (UPF) problem to provide better solutions with the same number of vertices. According to UPF, the N-vertices of
P can be placed anywhere in the 2D space. Therefore, the search space of UPF problem is a superset of the classical polygonal approximation (PA) problem, where the vertices are constrained to belong in the boundary of the given 2D shape. This means that the resulting solutions of UPF may better approximate the given curve than the PA problem solutions. This theoretically interesting computer vision problem has compact and easy-to-grasp description, but a very high algorithmic complexity due to the large search space. Even if just a triangle (
) is used, there does not exist any trivial method to compute the optimal solution of UPF problem.
Different error criteria have been proposed for shape fitting - polygonal approximation problems. In this research, the Intersection over Union (IoU) metric between S and P is maximized given a fixed number of polygonal vertices N, without any assumption or prior knowledge of the object structure. IoU metric can be easily replaced by any other segmentation metric, for example, accuracy and dice coefficient () without any change in the proposed methodology. Another problem instance, that we experimentally study is to define the problem under the Equal Area constraint which states that the area covered by the polygonal curve P should be equal to the area of the given shape S.
Figure 1 shows two example outputs of the classical polygonal approximation using the Douglas-Peucker algorithm [
5] (left) and the unconstrained polygonal fitting using the proposed method (right) for
vertices. In both cases, it holds that the solution of the UPF problem provides higher IoU, DICE values compared with the results of Douglas-Peucker algorithm. In the first example with an apple, the Douglas-Peucker algorithm yields
,
(
Figure 1a), while the proposed method yields
,
(
Figure 1b). In the second example with a building, the Douglas-Peucker algorithm yields
,
(
Figure 1c), while the proposed method yields
,
(
Figure 1d). There exists several examples where there exists significant differences in performance between the optimal solutions of the two problems, especially for lower values of
N where the optimal solutions of UPF space may be far from the shape boundary.
In summary, the major contributions of our work are the following: To the best of our knowledge, this is the first work to define and solve the UPF problem. The proposed method solves the UPF problem via Particle Swarm Optimization (PSO) [
6] that quickly explore the search space. The proposed framework suggests solutions that outperform baselines in quantitative terms, even if are not necessarily optimal due to the PSO. Especially in a low-dimensional search space, that is true for low number of vertices, the proposed method clearly outperforms baselines of classical PA, since even the optimal solutions of the classical PA problem usually fail to obtain high performance results due to the limited search space (shape boundary) of PA problem.
2. Related Work
The problem of shape fitting - polygonal approximation has been studied extensively during the last five decades [
1,
4,
5,
7]. The methods that have been developed solve the problem by approximating the original shape
S by a polygonal curve
P under the constraint that the
P vertex sequence is an ordered sub-sequence of the vertices along shape boundary. The polygonal approximation problem can be formulated in two ways [
1,
7]:
The problem of better fitting, where the approximation error is minimized or the fitting accuracy is maximized given the number of vertices N.
The problem of minimum number of vertices, where the approximation error or the fitting accuracy is bounded and the goal is to find the minimum number of vertices that satisfy the given bound.
The polygonal approximations algorithms can be classified as (1) optimal or non-optimal and (2) supervised or unsupervised algorithms [
7]. The non-optimal algorithms do not guarantee any kind of optimum as optimal algorithms but can find reasonable polygonal approximations faster than the optimal ones. The supervised algorithms take into account parameters to generate polygonal approximations, while the unsupervised algorithms generate the polygonal approximations without parameters. Yin [
8] presents a polygonal approximation approach based on the discrete particle swarm optimization (PSO) algorithm. Each particle represented as a binary vector corresponds to a candidate solution to the polygonal approximation problem. A swarm of particles is initiated and fly through the solution space for targeting the optimal solution. In [
9], the authors proposes a non-optimal and unsupervised algorithm for generation of polygonal approximations based on the convex hull of the 2D closed curves or contours. The significance levels of the contour points are computed using a symmetric version of the well-known Douglas-Peucker algorithm and, finally, a thresholding process is applied to obtain the vertices or dominant points of the polygonal approximation.
In the case of 2D shape fitting, a 2D binary image is given with foreground points representing the shape to be modeled. This image can be the result of any object detection or image segmentation method (e.g., [
10,
11,
12]). Several models can be used to solve the 2D shape fitting problem. In the literature, there exists several approaches that fit a set of ellipses to the given 2D shape based on (a) the Hough Transform,(b) Genetic and Optimization Algorithms and (c) edge-following [
12,
13]. In [
12], a framework has been proposed to represent a given 2D shape with an automatically determined number of ellipses based on expectation maximization criterion, so that the total area covered by the ellipses is equal to the area of the original shape. The method has been successfully applied to cell [
14] and tree detection [
15] problems. In [
4], an algorithm is proposed to extract and vectorize objects in images with low-complexity polygons based on local merging and splitting of cells. Departing from a polygonal partition that oversegments an image into convex cells, the algorithm refines the geometry of the partition while labeling its cells by a semantic class. The authors applied their method to a variety of scenes, from organic shapes to man-made objects through floor maps and line-drawing sketches.
Concerning the metrics that have been used to measure the performance on shape fitting problem can be classified into :
Boundary based metrics that compares the distance between the boundary of given shape and the approximated curve. Some of the most most metrics are the root mean square error (
) between the two curves and the maximum error between the boundary
S and their corresponding subcurves of
P. These metrics have the advantage that can also be applied on given curves (non closed contours) but are sensitive to boundary noise [
1,
7].
Region based metrics that compares the region of the approximated shape and the given shape. Region-based metrics are more tolerant to noise since they do not restrict themselves to the boundaries of shapes but rather take into account all shape points [
12]. Common metrics of this category are the Intersection over Union (IoU) and the Dice coefficient (DICE), which is defined by twice the area of the shapes’ intersection divided by the sum of the areas of each shape. Both of them have been also used to measure the performance of image segmentation methods [
16].
4. Experimental Evaluation
The evaluation of the proposed approach was based on two standard datasets from the literature. More specifically, we employ:
MPEG-7 [
19], which consists of 1,400 binary shapes organised in 70 categories with 20 shapes per category. This dataset has been extensively used in shape tasks [
12,
14,
20].
A subset of LEMS [
21], that is, 1,462 shapes that come from the following categories of the original database: Buildings, Containers, Fish, Fruit and vegetables, Misc Animal, People, Robots, Toddlers, and Turtles [
14].
Figure 2 shows twelve sample images form the MPEG-7 dataset and LEMS dataset.
We compared the proposed Unconstrained Polygonal Fitting based on Particle Swarm Optimization Algorithm (
UPF-PSO) and
UPF-PSO under equal area principle (
UPF-PSO-EA) methods
2 with the Douglas-Peucker algorithm (
DP) [
5] and Douglas-Peucker algorithm under equal area principle (
DP-EA) [
5]. Additionally, we have implemented a sequential baseline method of classical Polygonal Approximation that maximizes IoU. This method uses an initial polygonal curve that is given by the execution of the Douglas-Peucker algorithm (
DP) with high number of vertices (e.g.
). In each step, it sequentially removes the vertex of the polygonal curve so that the IoU of the resulting polygonal curve is maximized. The method terminates when the number of vertices are reduced to
N. This baseline method is called Sequential Maximization of IoU (
SM). After the execution of
SM, the equal area principle can be applied yielding the
SM-EA method.
In our experiments, we have evaluated the methods for different values of N. Since, our framework makes sense to be applied for low values of N, where the difference in performance between the solutions of UPF and PA usually is significant, we have fitted polygonal curves with (eight cases). According to the formulation of the UPF problem, we have compared the performance of UPF-PSO, UPF-PSO-EA, DP, DP-EA, SM and SM-EA on the IoU metric. For a given dataset, we also compute , where m is a method in {UPF-PSO, UPF-PSO-EA, DP, DP-EA, SM, SM-EA}. is quantified as the percentage of shapes of the datasets where the method m clearly outperforms the two others under the IoU. This also means that the value gives the percentage of images for which there is no clear winner method.
Table 1 and
Table 2 present the average IoU, DICE and Pr(m/IOU) metrics computed on all images from the MPEG7 and LEMS datasets, respectively. It holds that
UPF-PSO and
UPF-PSO-EA clearly outperform the rest methods in any dataset. UPF-PSO slightly outperforms
UPF-PSO-EA.
UPF-PSO or
UPF-PSO-EA outperform all the rest methods in about
of shapes under MPEG-7 and LEMS datasets.
SM or
SM-EA outperform all the rest methods in about
of shapes under MPEG-7 dataset and LEMS dataset.
DP or
DP-EA outperform all the rest methods in about
of shapes under MPEG-7 and LEMS datasets.
Figure 3 depicts the average IoU of
UPF-PSO,
UPF-PSO-EA,
DP,
DP-EA,
SM and
SM-EA computed for different values of
in the two datasets used.
UPF-PSO and
UPF-PSO-EA clearly outperform the rest methods under any dataset and
. As expected, the lower
N, the better outer-performance of the proposed methods compared to the classical PA methods. For high values of
N (
), the problem search space increases, so it is more difficult for the PSO to find near-optimal solutions. At the same time, when
N obtains high values, the solutions of classical PA significantly better approximate the given shape, reducing the gap from the UPF methods. Theoretically, it holds that as
N tends to infinity, the optimal solution of both problems (classical PA and UPF) converges to the initial shape having IoU equal to one (
). When equal area principle is applied it holds that the results are similar under any method. Usually, the equal area principle provides better results for low values of
N where the fitting polygon is less complex. The results under DICE metric depicted in
Figure 4 agree with the results of
Figure 3.
Figure 5 illustrates the
Pr(m/IoU),
{
UPF-PSO,
UPF-PSO-EA,
DP,
DP-EA,
SM,
SM-EA} over the 1,400 images of MPEG-7 dataset (
Figure 5(a)) the 1,462 of LEMS dataset
Figure 5(b). The results agree with the results of
Figure 3. If we sum the cases where
UPF-PSO or
UPF-PSO-EA outperforms all the rest methods, it holds that for
is more than
and
under MPEG-7 dataset and LEMS dataset. As it was expected, when
,
UPF-PSO or
UPF-PSO-EA clearly outperform the rest method on more than
of shapes and any dataset.
Finally,
Figure 6,
Figure 7 and
Figure 8 depict several output examples of the proposed methods and baselines to further study their behaviour. The DICE-IoU metrics and the used method-
N are depicted in the title and caption of each result, respectively. These figures depict results for
vertices on a turtle, bat and horse shapes from MPEG 7 and LEMS datasets. The results of
,
and
, corresponds to low, middle and high values of
N according to the proposed framework. In most of the cases the proposed methods outperform the baselines. As it was expected, the difference in performance between the proposed methods and baselines is higher for low value of N (
), due to the limited classical polygonal approximation problem - search space that is restricted on the shape boundary. PSO based methods yield high performance results, when it is more possible to find near-optimal solutions, which is true under low values of
N and less complex given shapes. On other cases (high values of
N and complex shapes), the PSO based methods may fail to provide high performance solutions due to the complicated and high dimensional search space. For example, in the case of
and horse shape (see
Figure 8) that is the most complex shape, the results of
UPF-PSO and
UPF-PSO-EA underperform the
SM method. However, it should be noticed that under this complex shape, when
or
it seems that the proposed methods outperform all baselines. Concerning, the other two simpler shapes, it holds that
UPF-PSO or
UPF-PSO-EA outperform any other method even when
, showing that the proposed methods are able to provide high performance results even for high values of
N.
Figure 1.
Example outputs of the proposed classical polygonal approximation (left) and unconstrained polygonal fitting (right) for vertices.
Figure 1.
Example outputs of the proposed classical polygonal approximation (left) and unconstrained polygonal fitting (right) for vertices.
Figure 2.
Twelve sample images form (a) the MPEG-7 dataset and (b) the LEMS dataset.
Figure 2.
Twelve sample images form (a) the MPEG-7 dataset and (b) the LEMS dataset.
Figure 3.
(a) The IoU metric of UPF-PSO, UPF-PSO-EA, DP, DP-EA, SM, SM-EA methods over (a) the 1,400 images of MPEG-7 dataset and (b) the 1,462 of LEMS dataset.
Figure 3.
(a) The IoU metric of UPF-PSO, UPF-PSO-EA, DP, DP-EA, SM, SM-EA methods over (a) the 1,400 images of MPEG-7 dataset and (b) the 1,462 of LEMS dataset.
Figure 4.
(a) The DICE metric of UPF-PSO, UPF-PSO-EA, DP, DP-EA, SM, SM-EA methods over (a) the 1,400 images of MPEG-7 dataset and (b) the 1,462 of LEMS dataset.
Figure 4.
(a) The DICE metric of UPF-PSO, UPF-PSO-EA, DP, DP-EA, SM, SM-EA methods over (a) the 1,400 images of MPEG-7 dataset and (b) the 1,462 of LEMS dataset.
Figure 5.
(a) The Pr(m/IoU), {UPF-PSO, UPF-PSO-EA, DP, DP-EA, SM, SM-EA} over the (a) 1,400 images of MPEG-7 dataset and (b) the 1,462 of LEMS dataset.
Figure 5.
(a) The Pr(m/IoU), {UPF-PSO, UPF-PSO-EA, DP, DP-EA, SM, SM-EA} over the (a) 1,400 images of MPEG-7 dataset and (b) the 1,462 of LEMS dataset.
Figure 6.
Results of the proposed methods and baselines for vertices on a bat shape from MPEG7 dataset.
Figure 6.
Results of the proposed methods and baselines for vertices on a bat shape from MPEG7 dataset.
Figure 7.
Results of the proposed methods and baselines for vertices on a turtle shape from LEMS dataset.
Figure 7.
Results of the proposed methods and baselines for vertices on a turtle shape from LEMS dataset.
Figure 8.
Results of the proposed methods and baselines for vertices on a horse shape from LEMS dataset.
Figure 8.
Results of the proposed methods and baselines for vertices on a horse shape from LEMS dataset.
Table 1.
The average IoU, DICE, Pr(m/IOU), UPF-PSO, UPF-PSO-EA, DP, DP-EA, SM, SM-EA methods computed on all images of MPEG7 dataset.
Table 1.
The average IoU, DICE, Pr(m/IOU), UPF-PSO, UPF-PSO-EA, DP, DP-EA, SM, SM-EA methods computed on all images of MPEG7 dataset.
Methods |
IoU |
DICE |
Pr(m/IOU) |
UPF-PSO |
0.776 |
0.866 |
0.477 |
UPF-PSO-EA |
0.770 |
0.861 |
0.322 |
DP |
0.645 |
0.764 |
0.005 |
DP-EA |
0.661 |
0.771 |
0.031 |
SM |
0.732 |
0.833 |
0.084 |
SM-EA |
0.731 |
0.831 |
0.080 |
Table 2.
The average IoU, DICE, Pr(m/IOU), of the methods UPF-PSO, UPF-PSO-EA, DP, DP-EA, SM, SM-EA methods computed on all images of LEMS dataset.
Table 2.
The average IoU, DICE, Pr(m/IOU), of the methods UPF-PSO, UPF-PSO-EA, DP, DP-EA, SM, SM-EA methods computed on all images of LEMS dataset.
Methods |
IoU |
DICE |
Pr(m/IOU) |
UPF-PSO |
0.801 |
0.885 |
0.404 |
UPF-PSO-EA |
0.803 |
0.886 |
0.382 |
DP |
0.679 |
0.793 |
0.008 |
DP-EA |
0.685 |
0.794 |
0.024 |
SM |
0.751 |
0.848 |
0.093 |
SM-EA |
0.751 |
0.847 |
0.089 |