1. Introduction
One of the new techniques developed for the analysis of large collections of information, known as Big Data, is Topological Data Analysis (TDA). In TDA, simplicial complexes associated with the data are constructed. These structures include the Vietoris-Rips complex, the Čech complex, and the piecewise linear lower star complex, among others. Of special interest to us is the generalized Čech complex structure. Whereas the standard Čech complex is formed by intersecting a collection of disks with a fixed radius, the generalized version allows varying radii. This flexibility enables us to emphasize specific data points by assigning them larger and/or more rapidly expanding balls, while de-emphasizing others by using smaller and/or slower growing balls. This approach proves valuable for handling noisy data sets, offering an alternative to discarding data that may not meet a specific significance threshold [1].
Understanding which disks intersect, and at what scale they begin to intersect, for a set of disks in $\mathbb{R}^d$, each with potentially different radii, is a fundamental problem. This leads to the exploration of the generalized Čech complex structure, which captures the intersection information of these disks, regardless of their radii. Rescaling all radii by the same factor, we obtain a filtered generalized Čech complex, where the associated simplicial complexes evolve as the scale parameter varies. In particular, in [
2], algorithms are provided to calculate the generalized Čech complex in
, and [
3] presents an algorithm to determine the Čech scale for a collection of disks in the plane.
To establish the necessary foundation for our study, Section 2 introduces the concepts and notation used throughout the article and analyzes the intersection of a disk system in $\mathbb{R}^d$. We start by investigating the intersection of two disks in Section 2.1 and then extend the analysis to a system of m disks in Section 2.2. By applying Helly’s Theorem, we prove that it suffices to examine the intersection of all subsystems consisting of $d+1$ disks in order to determine whether the system has a nonempty intersection.
In Section 3, we define Vietoris-Rips systems and Čech systems and present results regarding the Rips scale and the Čech scale, as well as their connections. In Section 3.1, we present an algorithm that determines whether the intersection of the system is empty or nonempty; it does so by computing only the poles of subsystems of disks (or spheres). Finally, in Section 3.2, we introduce an algorithm that approximates the Čech scale using the numerical bisection method.
Additionally, in Section 4 we incorporate the concept of a minimal axis-aligned bounding box (AABB) into our methodology. An AABB is a rectangular parallelepiped whose faces are perpendicular to the basis vectors. These bounding boxes frequently arise in spatial subdivision problems, such as ray tracing [4] and collision detection [5]. In this paper, we use AABBs to enclose the intersection of a finite collection of disks. This approach proves valuable for discerning whether the collection intersects in a single point. In that section, we also provide an algorithm for constructing the AABB of a disk system.
2. Intersection properties of sphere systems
Throughout this work, we will refer to a
d-disk system M, or simply a
disk system, as a finite collection of closed disks in
with positive and not necessarily equal radii, i.e.,
Moreover, in order to study the intersection properties of a disk system
M with the approach addressed in
Section 3 and
Section 4 of this work, we will conduct a study in this section of the intersection properties of the spheres corresponding to the boundaries of each disk in
M, which we call a
sphere system and denote by
,
where
∂ denotes the topological boundary operator.
Following the notation in [
6], we introduce the following generalization of the sphere.
Definition 1.
An i-sphere in is the intersection of a sphere with an affine subspace of dimension i.
Of course, the notions of a sphere (as a
-dimensional surface) and a d-sphere in
agree. However, an i-sphere in
can also be viewed as the intersection of d-spheres. For instance, the intersection of two spheres typically occurs in a hyperplane, forming a
-sphere in
. When another d-sphere intersects this configuration, the result may be a
-sphere, a
-sphere, a 0-sphere (a single point), or it might even be empty, all within the same hyperplane. For a disk system
composed of
m disks, where
is a set in general position in
, the maximum dimension of the affine subspace associated with the i-sphere, obtained from the intersection of all the spheres in
, is at most
, or equivalently,
. This conclusion is drawn from [
6, Theorem 2.1] and the fact that the affine hull of
is of dimension
. Consequently, the following result is proven.
Lemma 1.
Let be a disk system such that is a set in general position in . Then, the possibilities for the set are:
the empty set;
a single point;
a -sphere.
Remarkable points in i-spheres that will play a key role in the rest of the article are the poles. Let be the canonical projection on the i-th factor for , and let be the standard basis of .
Definition 2.
Let $e_q$ be the q-th vector of the canonical basis of $\mathbb{R}^d$. An $e_q$-north (south) pole of an i-sphere S in $\mathbb{R}^d$ is a point on S whose projection on the q-th coordinate is maximum (minimum). In other words, $p \in S$ is an $e_q$-north pole if $\pi_q(x) \le \pi_q(p)$ for all $x \in S$, where $\pi_q$ represents the projection onto the q-th coordinate.
We denote the -north pole of S by and the -south pole by .
An i-sphere can have a single -pole (north or south) or an infinite number of them, which occurs when a normal vector to the affine space containing the i-sphere is aligned with the vector . We are interested in finding the -poles of -spheres originating from disk systems , by taking the intersection . Such -spheres will be denoted by , to emphasize the disk system M, as well as its center and radius.
Lemma 2.
Let be a d-disk system such that , and let p be a point in such that (resp. ) for every x in . Then, there exists an i-sphere such that p is in S and p is the -south pole (resp. -north pole) of S.
Proof. Since , then , due to the closedness of the sets , for , and .
On the other hand, since , there exist indices such that for any ; let be a maximal subset of indices such that if and only if . We claim that is the -south pole of .
Indeed, let be an open neighborhood of p sufficiently small such that:
Every has as maximal set of indices a proper subset of ,
.
The first condition can be guaranteed by the finiteness of the disk system M, and the second condition is a consequence of the maximality of the set . Therefore, for every , which is equivalent to the fact that for every , in the case of i-spheres.
□
2.1. Sphere systems with two spheres
In the following two lemmas we provide the computations to determine the center, radius and poles for a -sphere given by the intersection of two disks in .
Lemma 3.
Let be a disk system with two d-disks such that is a -sphere with center c and radius r. Then,
where .
Proof. Let
be the hyperplane containing the
-sphere S, which is defined by the equation:
where
and
. Then the normal vector of the hyperplane
is given by
, and the center
c of
S is determined by the intersection point of the hyperplane
with the perpendicular line that passes through the center
of
. This line can be parameterized as
, such that
and
. We can compute the intersection point
of
and
, for any
, by substituting it in (
1),
And solving for
, we obtain that
. Hence, the center of S is given by:
Next, we will compute the radius r of S. This radius can be determined as the height
r of the triangle with base
formed by the points
,
, and a point on S. Thus, by Heron’s formula we have that
where
is the semi-perimeter of this triangle. □
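The computation in the proof of Lemma 3 can be sketched in code as follows. This is a minimal illustration, not the authors' implementation: the function name `two_sphere_intersection` is hypothetical, disks are passed as plain coordinate tuples, and the closed forms used (the parameter `t` along the line joining the centers, and the triangle height obtained from Heron's formula) are the standard ones, assumed here to agree with the expressions of the lemma.

```python
import math

def two_sphere_intersection(c1, r1, c2, r2):
    """Center and radius of the sphere in which two boundary spheres meet.

    Returns None when the boundaries do not meet in a sphere of positive
    radius (disjoint, nested, tangent, or concentric disks).
    """
    diff = [b - a for a, b in zip(c1, c2)]
    dist = math.sqrt(sum(x * x for x in diff))
    if dist == 0 or dist >= r1 + r2 or dist <= abs(r1 - r2):
        return None
    # Parameter along the line c1 + t (c2 - c1) at which the line crosses
    # the hyperplane containing the intersection.
    t = (dist**2 + r1**2 - r2**2) / (2 * dist**2)
    center = [a + t * x for a, x in zip(c1, diff)]
    # Heron's formula for the triangle with sides r1, r2, dist; the radius
    # is the height of that triangle over the base of length dist.
    s = (r1 + r2 + dist) / 2
    area = math.sqrt(s * (s - r1) * (s - r2) * (s - dist))
    radius = 2 * area / dist
    return center, radius
```

For two disks of radius 5 whose centers are 6 apart, the sketch returns the midpoint of the centers and radius 4, as expected from the 3-4-5 right triangle.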
We can proceed now to compute the poles of the -sphere .
Lemma 4.
Let and be two d-disks such that is a -sphere with center c and radius r. Then, the -poles of S are , where
Proof. For simplicity, we translate the hyperplane
, which contains the
-sphere
S, as well as the sphere itself, to the origin; in that case, the corresponding equations are given by,
where
and
for
. In the case that
, the normal vector
of the hyperplane
is orthogonal to the basis vector
. Therefore, the
-poles of
S are
, which agree with the formulae of the lemma.
On the other hand, suppose that
. To find the
-poles of S, we will use the Lagrange multiplier method. Consider the following function:
subject to the restriction:
Let
be the Lagrange multiplier, and define
For any
, consider the following system of equations:
Then
Solving this system of equations for
, we obtain that,
Comparing the last expression for two indices
, we have that,
Finally, for
, we can use the last expression to substitute it in (
2) and obtain the desired result. □
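As a numerical cross-check of Lemma 4, the poles can also be obtained without Lagrange multipliers, by projecting the basis vector onto the hyperplane that contains the sphere. The sketch below uses this swapped-in technique rather than the closed form of the lemma; the function name `sphere_poles`, the 0-based coordinate index `q`, and the tolerance `1e-12` are assumptions made for the illustration.

```python
import math

def sphere_poles(center, radius, normal, q):
    """South and north poles, in coordinate q, of the sphere with the given
    center and radius lying in the hyperplane through `center` that is
    orthogonal to `normal`. Returns (south, north), or None when the basis
    vector is parallel to `normal` (then every point is a pole)."""
    n_norm2 = sum(x * x for x in normal)
    # Orthogonal projection of the q-th basis vector onto the hyperplane
    # directions: u = e_q - (n_q / |n|^2) n.
    u = [-(normal[q] / n_norm2) * x for x in normal]
    u[q] += 1.0
    u_len = math.sqrt(sum(x * x for x in u))
    if u_len < 1e-12:
        return None
    step = [radius * x / u_len for x in u]
    south = [c - s for c, s in zip(center, step)]
    north = [c + s for c, s in zip(center, step)]
    return south, north
```

When the q-th component of the normal vector vanishes, the projection is the basis vector itself and the poles reduce to the center plus or minus the radius along that axis, which matches the first case treated in the proof.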
2.2. Sphere systems with more than two spheres
Now, let us proceed with the explicit calculation of the coefficients for the center
c of the
-sphere
. We can achieve this by considering the disk system translated to
, denoted as
, and by defining the
-sphere
. This sphere is positioned at the intersection of hyperplanes (for more details, refer to [
6]).
for all
. Utilizing the information that the center of
can be expressed as a combination of the centers
and substituting it into (
3), we obtain a linear system of equations with dimensions
:
for
. Solving the system of equations for
, we find the center of
S as follows:
The radius of the sphere S can be computed using the equation:
for any
.
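The linear system described above can be sketched as follows, under the assumption that the hyperplanes are the usual radical hyperplanes obtained by subtracting the equation of the first sphere from the others after translating its center to the origin. The names `solve_linear` and `sphere_system_center` are hypothetical, the normalization of the system may differ from the one used in the text, and a small Gaussian elimination is included only to keep the example self-contained.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def solve_linear(a, b):
    """Solve a x = b for a square matrix by Gaussian elimination."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("centers are not in general position")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            f = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= f * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def sphere_system_center(centers, radii):
    """Center and radius of the intersection of all boundary spheres,
    or None if the boundaries have no common point."""
    c1, r1 = centers[0], radii[0]
    # Translate the system so that the first center is the origin.
    a = [[x - y for x, y in zip(c, c1)] for c in centers[1:]]
    # Writing the center as sum_k lam_k a_k turns the hyperplane equations
    # 2 a_j . x = |a_j|^2 + r_1^2 - r_j^2 into an (m-1) x (m-1) system.
    gram = [[2 * dot(aj, ak) for ak in a] for aj in a]
    rhs = [dot(aj, aj) + r1**2 - rj**2 for aj, rj in zip(a, radii[1:])]
    lam = solve_linear(gram, rhs)
    offset = [sum(l * ak[i] for l, ak in zip(lam, a)) for i in range(len(c1))]
    r_squared = r1**2 - dot(offset, offset)
    if r_squared < 0:
        return None
    return [x + y for x, y in zip(c1, offset)], math.sqrt(r_squared)
```

For two spheres this reduces to the center and radius of Lemma 3, which gives a quick consistency check.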
Now that we have determined the center and radius of S, as well as the affine space that contains it, we can proceed to compute its -poles for each . These poles reside in the affine space that contains S and within a set that we define below.
Let
S be an i-sphere in
, and let
be orthogonal vectors to the affine space L that contains S. Consider the space
M generated by these vectors together with the vector
from the canonical basis of
. Let us denote
for each
. Then, we can define
, the set L translated to the origin, as follows:
The set
M is defined as:
Refer to
Figure 1 for a visual representation of the subspaces L and
.
As mentioned above, the
-poles of
S lie at the intersection of L and
, where
c is the center of
S. To simplify the calculations, we will utilize
and
M, and then translate them into
c. The intersection of
M and
can be expressed as follows:
Let us consider a disk system in
, denoted
, where
. The intersection of their boundaries forms a
-sphere
S. In this case, the subspace
M has dimension
m, or
if
. We choose the normal vectors for the affine space containing
S as
, where
. Then
By rewriting, we have
If
is the
-sphere with center in c and radius r, then the
-poles of S are the
-poles of
but translated by c. The poles of
are located in
. If
p is an
-pole of
, then it can be expressed as
for some
,
and the following conditions hold:
for each
and
Thus, if
is an
-pole of
, the following equations are satisfied for
:
for all
, with
r the radius of the
-sphere S. From (
4) we have the system
Let us denote
A as the matrix
and
B as the vector
. Then, we have
, where
. Solving for
, we obtain
for each
(where
denotes the entry
j of the
vector
). By substituting the value of
into (
5), we obtain the quadratic equation:
Let us define
for all
. Solving this equation, we find:
for
. Therefore, the
-poles of
S, for
, are:
3. Vietoris-Rips and Čech systems
In this section we define Vietoris-Rips systems and Čech systems for a disk system and present results that relate the two. These notions describe geometric and topological features of the underlying data.
Before giving the definitions, let us give a brief overview. Both systems are central to topological data analysis, where they are used to recover the geometric structure of point cloud data. The Vietoris-Rips complex and the Čech complex share the goal of capturing the topology of the underlying metric space, but they encode connections among data points in different ways. The Vietoris-Rips complex tends to be more efficient and scalable for large datasets, while the Čech complex can be more accurate but is computationally more expensive. The choice between the two depends on the nature of the dataset and the specific goals of the topological analysis.
Definition 3.
Let be a d-disk system. We say M is a Vietoris-Rips system if for each pair . Furthermore, if the d-disk system M has the nonempty intersection property , then M is called a Čech system.
For each , we define a collection of d-disks with the same centers as those in the d-disk system M, but with radii rescaled by . When , is a d-disk system again. is equal to M, and is the set of the centers of the d-disks in M.
The Rips scale and the Čech scale are the parameters that measure closeness and connectivity between data points in topological data analysis, and they offer different perspectives on the geometric relationships within point-cloud data. The Vietoris-Rips scale
, of a d-disk system
M is the smallest
such that
is a Vietoris-Rips system. Similarly, the Čech scale
, of
M is the smallest
such that
is a Čech system. That is,
Next, we present some easily observable properties for both scales. It can be easily seen that M is a Vietoris-Rips system if and only if (in particular ); similarly, M is a Čech system if and only if .
Note that for a given d-disk system
the Vietoris-Rips scale is
where
and
are the center and radius of
. An additional observation is that, in cases where the disk system consists of either one or two disks, the Vietoris-Rips scale coincides with the Čech scale. It is evident that every Čech system is also a Vietoris-Rips system; however, the reverse assertion, in general, is not true.
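The Vietoris-Rips scale can be computed directly from its definition: two disks with radii rescaled by a common factor intersect exactly when the distance between their centers is at most that factor times the sum of their radii, so the scale is the largest ratio of center distance to radius sum over all pairs. The following sketch illustrates this; the function name `rips_scale` and the representation of a disk as a `(center, radius)` pair are assumptions of the example.

```python
import itertools
import math

def rips_scale(disks):
    """Smallest factor for which every pair of rescaled disks intersects.

    `disks` is a list of (center, radius) pairs with positive radii.
    A system with a single disk has no pairs, so the scale is 0.
    """
    return max(
        (math.dist(c1, c2) / (r1 + r2)
         for (c1, r1), (c2, r2) in itertools.combinations(disks, 2)),
        default=0.0,
    )
```

A value of at most 1 then indicates that the original system is already a Vietoris-Rips system.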
In contrast, if the d-disk system contains at least three disks, determining the Čech scale becomes more complex. In the context of the Čech scale, the following remark is important and plays a key role in implementation (see [
3] for details).
Remark 1.
If is the Čech scale for M, then the -rescaled system , has only one point in the intersection .
As we have mentioned, a Čech system is also a Vietoris-Rips system, but the converse is not true. What we can affirm is that if a system is a Vietoris-Rips system, then the system rescaled by the factor
is also a Čech system. This is established by the following lemma, the proof of which can be found in [
3].
Lemma 5.
Let be a d-disk system in Euclidean space . If for every pair of disks in M, then
One of the implications of the previous result is that, for any given disk system M, we can bound the Čech scale using the Vietoris-Rips scale . This is stated by the following corollary.
Corollary 1.
If M is an arbitrary d-disk system and is its Vietoris-Rips scale, then its Čech scale satisfies . Therefore, for every d-disk system M, the rescaled disk system is always a Čech system. In particular, if then is a Čech system.
3.1. Algorithm for determining a Čech system
In the previous section, we determined the poles of the intersection of the boundaries of any number of disks. If any of these poles lies in all disks of the disk system, then the system is a Čech system. This reduces the test for a nonempty intersection to checking finitely many candidate points.
Given a system of m disks in where , it is enough to verify that every subsystem of $d+1$ disks is a Čech system to conclude that the entire system of disks has a nonempty intersection. This assertion is supported by Helly’s Theorem.
Now, we introduce an algorithm that determines whether a disk system is a Čech system. In simpler terms, if the disk system has a nonempty intersection, the algorithm outputs "TRUE"; otherwise, it outputs "FALSE". The algorithm searches for poles within the intersections of the disk boundaries, which, as we have observed, are i-spheres. It starts with the poles of the individual disks and then progresses to the pairwise intersections of the disk boundaries, continuing the process iteratively over larger subcollections. If a pole is found that lies in all the remaining disks, the system is classified as a Čech system.
Theorem 1.
Let be a d-disk system. Then, M is a Čech system if and only if Cech.system(M) = TRUE.
Proof. If Cech.system(M) = TRUE, then the Cech.system algorithm (Algorithm 1) found a pole contained in the intersection of all disks of M; hence this intersection is nonempty and M is a Čech system.
Conversely, if the intersection D of the disks of M is nonempty, let p be a point of D whose q-th coordinate is minimal over D. By Lemma 2, p belongs to an i-sphere and is a south pole of it. Therefore, by the exhaustive search of Algorithm 1 across all poles, its output is Cech.system(M) = TRUE. □
Algorithm 1: Cech.system
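The pole search described above can be illustrated in the plane, where the only candidates are the south poles of the individual disks and the points where two boundary circles cross. The sketch below is a simplified planar variant under these assumptions, not a transcription of Algorithm 1: the names `circle_intersections` and `cech_system_2d` and the tolerance `tol` are hypothetical.

```python
import itertools
import math

def circle_intersections(c1, r1, c2, r2):
    """Points where two circles in the plane meet (empty list if none)."""
    dx, dy = c2[0] - c1[0], c2[1] - c1[1]
    dist = math.hypot(dx, dy)
    if dist == 0 or dist > r1 + r2 or dist < abs(r1 - r2):
        return []
    a = (dist**2 + r1**2 - r2**2) / (2 * dist)
    h = math.sqrt(max(r1**2 - a**2, 0.0))
    mx, my = c1[0] + a * dx / dist, c1[1] + a * dy / dist
    return [(mx - h * dy / dist, my + h * dx / dist),
            (mx + h * dy / dist, my - h * dx / dist)]

def cech_system_2d(disks, tol=1e-9):
    """True if the planar disks, given as (center, radius) pairs, have a
    common point. The lowest point of a nonempty intersection is either the
    south pole of one disk or a crossing point of two boundary circles, so
    it suffices to test these candidates."""
    def in_all(p):
        return all(math.hypot(p[0] - c[0], p[1] - c[1]) <= r + tol
                   for c, r in disks)
    candidates = [(c[0], c[1] - r) for c, r in disks]
    for (c1, r1), (c2, r2) in itertools.combinations(disks, 2):
        candidates.extend(circle_intersections(c1, r1, c2, r2))
    return any(in_all(p) for p in candidates)
```

For three equal disks centered at the vertices of an equilateral triangle of side 2, the sketch reports an empty intersection for radius 1.1 and a nonempty one for radius 1.2, consistent with the circumradius of about 1.155.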
3.2. Algorithm to compute the Čech scale
Finding the minimum parameter for which the rescaled system of disks has a non-empty intersection is significant because it helps identify a critical threshold at which the disks come into contact. This parameter, known as the Čech scale, provides valuable information about the proximity or overlap of the disks, which can be crucial in various applications such as collision detection in computer graphics, spatial packing problems, and modeling physical phenomena. In this section, we introduce an algorithm to compute an approximation of the Čech scale for a system of
m disks in
.
Algorithm 2: Cech.scale
Algorithm 2 approximates the Čech scale of a disk system in Euclidean space using Algorithm 1 and a prescribed precision parameter. It initializes the scale factor to the Rips scale. If the system rescaled by this factor already satisfies Cech.system = TRUE, the Čech scale has been found. Otherwise, we enter a loop in which we evaluate Cech.system on rescaled systems. The Čech scale is known to lie between the Rips scale and the upper bound given by Corollary 1 (generalized Vietoris-Rips lemma). To approximate the Čech scale, we apply the bisection method as long as the interval enclosing the Čech scale is longer than the precision parameter. Finally, the algorithm returns an approximation of the Čech scale.
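The bisection loop described above can be sketched as follows. Several things here are assumptions made for the illustration: the function name `cech_scale`, the oracle `is_cech` (any callable that decides whether a list of `(center, radius)` disks has a common point, playing the role of Algorithm 1), the Rips scale being passed in as `rips`, and the upper bound, which is taken to be the classical Vietoris-Rips constant $\sqrt{2d/(d+1)}$ times the Rips scale.

```python
import math

def cech_scale(disks, is_cech, rips, d, eps=1e-6):
    """Approximate the Čech scale of `disks` by bisection.

    is_cech: callable deciding whether a rescaled system has a common point.
    rips:    Vietoris-Rips scale of the system (lower end of the bracket).
    d:       dimension of the ambient Euclidean space.
    eps:     stop when the bracket is no longer than this.
    """
    def rescale(lam):
        return [(c, lam * r) for c, r in disks]

    lo = rips
    if is_cech(rescale(lo)):
        return lo
    # Assumed upper end of the bracket: classical Vietoris-Rips constant.
    hi = math.sqrt(2 * d / (d + 1)) * rips
    while hi - lo > eps:
        mid = (lo + hi) / 2
        if is_cech(rescale(mid)):
            hi = mid
        else:
            lo = mid
    # The upper end is returned so that the rescaled system is Čech
    # according to the oracle.
    return hi
```

Returning the upper end of the bracket is a design choice: it overestimates the Čech scale by at most `eps`, but the system rescaled by the returned value always passes the oracle.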
Utilizing the previously described algorithm, we can construct the filtered generalized Čech complex for a disk system
M. Let
denote the set of Čech subsystems, and
the set of Čech subsystems for the rescaled disk system
. The Čech filtration of the
M system forms a maximal chain of Čech complexes
where each
represents the Čech scale of the system
. Since the Čech scale of a disk system indicates the factor by which we must rescale the system to make it Čech, defining a level of the filtration,
, simply requires determining the Čech scale of the system
.
4. Minimal axis-aligned bounding box
In this section, we introduce the concept of the minimal axis-aligned bounding box (AABB) for the intersection of d-disks and present methods for its computation. The AABB provides a simplified representation of the disk intersection, making it easier to obtain valuable information about the disks. This information could be useful for computing the Čech scale of a disk system.
Definition 4.
Let M be a disk system in . The minimal axis-aligned bounding box of M, denoted as , is defined as the smallest axis-aligned bounding box that contains the intersection , given by
where ranges over all axis-aligned bounding boxes that contain D.
Note that the AABB can be expressed as
where
is the canonical projection onto the
k-th factor, and
denotes the boundary of
D. In other words, the AABB of a disk system
M is given by the Cartesian product of intervals, where each interval is determined by the minimum and maximum values of the corresponding projection of the disk boundaries.
4.1. Minimal axis-aligned bounding box for two disks
Let’s consider the situation when the disk system
M is composed of two disks
and
in
. If
and
, the subset
can take one of three forms: an empty set, a single common point (when the disks are tangent), or a
-dimensional sphere. In the last case, we denote the
-dimensional sphere or the
-sphere
by
. To calculate
and
for each
, we can use:
Indeed, by Lemma 2, the extremes of the AABB are the projections of certain poles, either from the -sphere or some d-sphere. The d-spheres represent the boundaries of each disk, with poles given by , and the -sphere is , whose poles are computed using Lemma 4. It is worth noting that there are no further options for -spheres in the case of a two-disk system.
To simplify the notation, we will use
to denote the AABB of the intersection of disks
and
.
Figure 2 illustrates the AABB of the intersection of two disks in the plane.
According to (
6) and Lemma 3, we have a method to calculate the axis-aligned bounding box (AABB) for systems of two disks.
Knowing how to compute the AABB of two disks is not sufficient to determine the AABB for a disk system with more than two disks in . In the following examples, we demonstrate that the AABB of a disk system is not simply the intersection of all AABBs of two disks in .
Example 1. Let
be a Vietoris-Rips system in
with the following projection onto the
-plane:
By computing the boxes
for all
, we obtain:
Therefore,
. However, the disk system intersects at the point
, which means
. In other words,
is not equal to
.
We know that if , then . However, the converse is not always true. The following example illustrates this fact.
Example 2. Let
be a Vietoris-Rips system. The projection of disks
and
onto the
-plane is illustrated in
Figure 4.
We will now compute the intersections
for different pairs of disks:
The intersection of all pairwise intersections, , is given by . However, the intersection of disks and is a single point P, which is not contained in (by construction). Therefore, D is an empty set, but is not.
These examples clearly illustrate that when dealing with the AABB of three disks or more, knowing the AABB for pairs of disks is insufficient. In Example 1, we observe that the intersection of the pairwise AABBs is not equal to the AABB of the disk system M. Similarly, in Example 2, we find that the intersection of three disks is empty, yet the intersection of the pairwise AABBs contains points.
Therefore, the next crucial step is to determine how to calculate the AABB of a disk system consisting of more than two disks in .
4.2. Minimal axis-aligned bounding box for more than two disks
Given a system M consisting of m disks in , we can compute the -poles for any subcollection of disks. Using these -poles, we can determine the axis-aligned bounding box (AABB) of M.
If , we calculate the -poles of the -sphere , and with these poles, we define the AABB of M by taking , where p is the -south pole of the -sphere (similarly for ).
In the case where , we consider as a collection of minimal axis-aligned boxes for the disk system in . If the intersection of any of these sets is nonempty, then the intersection of the entire collection gives us the minimal axis-aligned box of the disk system M. This can be expressed as . The next theorem confirms this finding.
Theorem 2 (Helly’s theorem for minimal axis-aligned boxes).
Let M be a Rips system with disks in . Then, the minimal axis-aligned bounding box for the intersection set satisfies:
where .
Proof. We know that
. Now, our objective is to establish the reverse inclusion, that is,
. In order to derive a contradiction, suppose that the reverse inclusion is not true. By definition, we have:
and
Without loss of generality, let’s assume that:
where
satisfies
.
Let be the point in such that for each . Note that because is not in D. Now, let be the line segment that connects and p.
Choose
small enough such that the hyperplane
does not contain any
, and
for all
(such hyperplane
P exists because
for all
). Since
is in
, the hyperplane
P intersects every disk in
for each
j, and
intersects
P at a point that is in
(see
Figure 5). Furthermore,
is a
-dimensional disk. Therefore, we have a collection
of
disks, each of dimension
, such that every subset
A of
consisting of
d disks has the non-empty intersection property. By Helly’s Theorem, the intersection of all
-disks in
is not empty. Therefore, there exists a point
with
. However, this contradicts the fact that
is such that
.
Therefore, we conclude that .
□
Lemma 6.
Let M be a Vietoris-Rips system of disks in . If and for each , then consists only of intervals of the form with (inverted intervals).
Proof. Suppose that
contains an interval. Without loss of generality, let’s assume it is the interval
for some
where
. By definition of
, we have
for all
and
for all
. Now, consider a hyperplane
with
. The hyperplane
P intersects
for every
(since
P cuts through all the boxes
). We also know that
is a
-dimensional disk for each
i.
Therefore, we have a collection of d disks in a dimensional space, and by Helly’s Theorem, this collection must have a non-empty intersection. This implies that , which contradicts our assumption that D is empty. Hence, all intervals in must be inverted intervals of the form with . □
As an example, we provide the lower bounds of the AABB for a system of three disks in
, that is, we compute
for each
(analogous
) with
. Let
be a Rips system in
, then, for each
, the computation of the AABB for the disk system
M is given by:
where (*) denotes the case where
with
such that
The preceding calculations are a result of Lemma 2. It is known that the extremes of the AABB lie in the projections of specific poles, either from a d-sphere, -sphere, or the -sphere.
Given a disk system M, we can compute the minimal axis-aligned bounding box (AABB) of the intersection of all disks in M. If the AABB is a single point, then that point is the whole intersection of the disks. If the AABB of M is not a point, we can rescale M by a scale factor for which the AABB of the rescaled system becomes a point; this factor is the Čech scale of the system M (compare Remark 1).
4.3. Minimal axis-aligned bounding box algorithm
Now, we present an algorithm to calculate the minimal axis-aligned bounding box (AABB) of a system of m disks in . As previously explained, the AABB’s extremes are defined by projecting the poles of specific i-spheres. Thus, in computing the AABB, we will determine the poles for each i-sphere within the disk system.
The algorithm starts by initializing the set of poles, P, as an empty set. In each iteration of the first loop, we determine the spheres formed by the intersection of the boundaries of the disk subcollections in the system
M (for this, we require the center and radius, which are computed in
Section 2.2). Next, we identify the poles of each sphere and if any of them are present in all the disks of
M, we add them to the set P. If P is not empty, we proceed to calculate the extremes for each dimension of the AABB using the set of north poles for the upper bounds and the set of south poles for the lower bounds. In this case, the output is the product of the intervals defined by the computed extremes. If P is empty, it indicates that there is no intersection in the disk system.
Algorithm 3: AABB.minimal
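The procedure described above can be illustrated in the plane, where the poles are the four axis-extreme points of each disk together with the crossing points of pairs of boundary circles. The sketch below is a simplified planar variant under these assumptions, not a transcription of Algorithm 3: the names `circle_intersections` and `aabb_minimal_2d`, the tolerance `tol`, and the return format are hypothetical.

```python
import itertools
import math

def circle_intersections(c1, r1, c2, r2):
    """Points where two circles in the plane meet (empty list if none)."""
    dx, dy = c2[0] - c1[0], c2[1] - c1[1]
    dist = math.hypot(dx, dy)
    if dist == 0 or dist > r1 + r2 or dist < abs(r1 - r2):
        return []
    a = (dist**2 + r1**2 - r2**2) / (2 * dist)
    h = math.sqrt(max(r1**2 - a**2, 0.0))
    mx, my = c1[0] + a * dx / dist, c1[1] + a * dy / dist
    return [(mx - h * dy / dist, my + h * dx / dist),
            (mx + h * dy / dist, my - h * dx / dist)]

def aabb_minimal_2d(disks, tol=1e-9):
    """Minimal axis-aligned bounding box of the intersection of planar
    disks given as (center, radius) pairs.

    Returns ((x_min, x_max), (y_min, y_max)), or None if no pole lies in
    all disks, in which case the intersection is empty."""
    def in_all(p):
        return all(math.hypot(p[0] - c[0], p[1] - c[1]) <= r + tol
                   for c, r in disks)
    candidates = []
    for (cx, cy), r in disks:
        # North and south poles of the disk for both coordinate axes.
        candidates += [(cx - r, cy), (cx + r, cy), (cx, cy - r), (cx, cy + r)]
    for (c1, r1), (c2, r2) in itertools.combinations(disks, 2):
        candidates += circle_intersections(c1, r1, c2, r2)
    poles = [p for p in candidates if in_all(p)]
    if not poles:
        return None
    xs, ys = zip(*poles)
    return (min(xs), max(xs)), (min(ys), max(ys))
```

For two unit disks centered at (0, 0) and (1, 0), the retained poles are (0, 0), (1, 0) and the two crossing points, giving the box [0, 1] in x and [-√3/2, √3/2] in y.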