Preprint
Article

Clustering Coefficient of Lexicographic Products

This version is not peer-reviewed

Submitted:

10 May 2024

Posted:

10 May 2024


Abstract
Clustering coefficient measures are key complex network analysis tools. We examine local and global clustering coefficient measures with respect to the lexicographic graph product. As a preliminary condition, we analyze the $K_3$ subgraph structure focused on vertex inclusion with respect to the product graph. From this structure, we determine both the clustering coefficient for the product graph vertices and the average clustering coefficient for the product graph.
Keywords: 
Subject: Computer Science and Mathematics  -   Discrete Mathematics and Combinatorics

MSC:  primary 05C82; secondary 05C75

1. Introduction

The clustering coefficient is a widely used tool in complex network analysis, particularly for social and neural networks. The measure was introduced by Holland and Leinhardt (1971) [7], but Watts and Strogatz (1998) are credited with popularizing the average clustering coefficient of a graph as an indicator of the small-world property of a network [14]. Network analysis is a key area of study in applied graph theory that includes lexicographic networks (see [11] for an example).
As mentioned, the various clustering coefficients are used in network structure analysis, especially with respect to community structure and information flow. A group of network vertices with high local clustering coefficients indicates an area of high connectivity and thus wide information exchange, while low clustering reflects structural holes. In information networks, the holes can indicate the presence of “gatekeepers” who control the information flow. Mark Newman’s Networks [12] covers the various clustering coefficient measures in some detail. Although the number of triangles in a graph can be determined in polynomial time, this calculation conveys little about how clustering in the lexicographic product relates to the structure of the factor graphs. In this note, we take a mathematical approach to the clustering coefficient, so analysis of network structure is beyond its scope. However, the information given here aids in network model construction and clustering investigation.
In this paper, the clustering coefficient calculations, and the surrounding discussion, convey how the community structure in the product graph is formed from the factor graph clustering. Theorem 1 gives the number of triangles that contain a given vertex of a lexicographic product graph, provided that the factor graphs are finite and simple. Theorem 2 provides the clustering coefficient of a lexicographic product graph vertex, while Theorem 3 presents the average clustering coefficient for this product graph.
As far as we know, there are only two other papers, [3,4], addressing clustering coefficient measures with respect to product graphs, and neither considers the lexicographic product. Calculating the clustering coefficient of a vertex, and the average clustering coefficient of the graph containing that vertex, provides little information about the vertex $K_3$ subgraph inclusion structure of the chosen graph. Several papers have been written regarding the minimum cycle basis structure of the lexicographic product graph ([1,8,9,10] as stated in [9]), but none of them addresses the vertex $K_3$ subgraph inclusion given in this note.
In Section 2 we provide notation and background information regarding the lexicographic graph product and clustering coefficient measures. Section 3 contains the equation for the number of $K_3$ subgraphs that contain a given product graph vertex. The clustering coefficient for vertices, and the more global average clustering coefficient, are determined in Section 4.

2. Background

Define a finite graph $G$ by its vertex set $V(G)$ and its edge set $E(G)$, and let $|V(G)|$ and $|E(G)|$ denote the number of elements in these two sets, respectively. We assume general graph knowledge as found in [2]. Here, all graphs are finite and simple, thus disallowing both multiedges and loops. We give all graphs the same vertex labeling $\{0, 1, \ldots, n-1\}$, where the graph order is $n = |V(G)|$. Graph size is denoted by $m = |E(G)|$. When referring to a specific graph $G$, order and size are given by $n_G$ and $m_G$, respectively. The open neighborhood of vertex $x$ is $N(x)$, while the closed neighborhood is $N[x]$. For graph order $n$, path graphs are $P_n$, complete graphs are $K_n$, and empty graphs with $n$ isolated vertices are denoted by $D_n$.

2.1. Lexicographic Graph Product

Graph product operations act on two graphs $G$ and $H$, referred to as the factor graphs. In graph theory, a homomorphism $\phi : G \to H$ is a vertex set mapping such that $\{x, y\} \in E(G)$ implies $\{\phi(x), \phi(y)\} \in E(H)$ for $x, y \in V(G)$, thus preserving adjacency. A weak homomorphism is a vertex map $\psi : G \to H$ where an edge $\{x, y\}$ in $G$ implies either $\{\psi(x), \psi(y)\} \in E(H)$ or $\psi(x) = \psi(y)$ in $H$. For additional general information regarding graph products, see [5].
The lexicographic graph product $G \circ H$ has as its vertex set the ordered pairs $(g, h)$ produced by the Cartesian product $V(G) \times V(H)$, where $g \in V(G)$ and $h \in V(H)$. The pair $\{(g, h), (g', h')\}$ is an edge in $G \circ H$ if either $\{g, g'\} \in E(G)$, or $g = g'$ and $\{h, h'\} \in E(H)$. Note that the operator $g = g'$ and $\{h, h'\} \in E(H)$ is a weak homomorphism. $K_1$ is the unit for this product.
We sometimes indicate vertices in $G \circ H$ by $gh$ instead of $(g, h)$, as shown in Figure 1 displaying $P_2 \circ P_3$ and $P_3 \circ P_2$. In this figure, the edges from operator $\{g, g'\} \in E(G)$ are dashed and the edges from operator $g = g'$ and $\{h, h'\} \in E(H)$ are solid. As with other product graphs, $G \circ H$ has projections from the product graph to its factors (see [5] for more information). This results in the existence of “copies” of the two factors in the product graph. In Figure 1, note that the solid edges in the product are copies of $H$, referred to as $H$ layers, and some of the dashed lines are copies of $G$, analogously called $G$ layers.
We refer to the edges generated by operator $\{g, g'\} \in E(G)$ as “spider edges”. Notice that the spider edges join two adjacent $H$ layers. Based on the definition for this product, $G \circ H$ has order $n_G \cdot n_H$ and size $n_G m_H + m_G n_H^2$. When referencing a vertex in $G$, we use $g_G$; similarly, a vertex in $H$ is $h_H$ unless otherwise noted. The degree of a vertex in one of the factors is indicated by $\deg(g_G)$ (or $\deg(h_H)$), where $g_G$ (or $h_H$) is the specific vertex in $G$ (or in $H$).
For a specific vertex $x := (g, h)$, denote the $H$ layer that contains $x$ as $H_x$, and let $H'$ indicate the $H$ layers that do not contain $x$ but do contain vertices in $N(x)$.
The lexicographic product is associative but, with a few exceptions, it is not commutative. Figure 1 illustrates this by displaying both $P_2 \circ P_3$ and $P_3 \circ P_2$. Although the graph order is the same for both of these product graphs, $P_2 \circ P_3$ has 13 edges while the size of $P_3 \circ P_2$ is 11.
When $G$ is connected, $G \circ H$ is connected. When $G$ is disconnected, the product is disconnected, with the number of connected components dependent on the number of components in $G$ and in $H$.
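The edge rule and the size formula above can be checked on the two products shown in Figure 1. The following is a minimal sketch in plain Python, not part of the original note; the helper names `path_graph` and `lexicographic_product` are illustrative assumptions, and graphs are stored as a vertex list plus a set of two-element `frozenset` edges.

```python
from itertools import product

def path_graph(n):
    """Return (vertices, edges) of the path P_n on vertices 0..n-1 (hypothetical helper)."""
    vertices = list(range(n))
    edges = {frozenset((i, i + 1)) for i in range(n - 1)}
    return vertices, edges

def lexicographic_product(G, H):
    """Return (vertices, edges) of the lexicographic product of G and H."""
    (VG, EG), (VH, EH) = G, H
    vertices = list(product(VG, VH))
    edges = set()
    for (g, h), (g2, h2) in product(vertices, repeat=2):
        if (g, h) == (g2, h2):
            continue
        spider = frozenset((g, g2)) in EG              # {g, g'} in E(G)
        layer = g == g2 and frozenset((h, h2)) in EH   # g = g' and {h, h'} in E(H)
        if spider or layer:
            edges.add(frozenset(((g, h), (g2, h2))))
    return vertices, edges

P2, P3 = path_graph(2), path_graph(3)
V1, E1 = lexicographic_product(P2, P3)
V2, E2 = lexicographic_product(P3, P2)
print(len(V1), len(E1))  # 6 13
print(len(V2), len(E2))  # 6 11
```

Both counts agree with the size formula: $2 \cdot 2 + 1 \cdot 3^2 = 13$ and $3 \cdot 1 + 2 \cdot 2^2 = 11$.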

2.2. Clustering Coefficient and Average Clustering Coefficient

The phrase “clustering coefficient” needs clarification, as its usage varies in the literature. In this note, clustering coefficient refers specifically to the commonly used ratio focused on each vertex, as calculated in Sage. To emphasize the vertex focus, we sometimes refer to the clustering coefficient as the vertex clustering coefficient. Given a specific vertex $x$, the clustering coefficient of $x$ is:
$$cc(x) = \frac{\text{total number of triangles that include } x}{\text{maximum number of triangles that could include } x} \tag{1}$$
The average clustering coefficient of a graph $G$, $acc(G)$, is a global measure for $G$ that can be compared to the density of $K_3$ subgraphs in $G$. Using the vertex clustering coefficient $cc(x)$ over all $x \in V(G)$, the average clustering coefficient is:
$$acc(G) = \frac{\sum_{i=1}^{n} cc(x_i)}{n} \tag{2}$$
Based on vertex degree, the denominator of $cc(x)$ is $\binom{\deg(x)}{2}$. As a ratio, the clustering coefficient has a maximum of 1 and a minimum of 0. Bipartite graphs have $acc(G) = 0$ as they are triangle free, while $K_n$ with $n > 2$ have $cc(x) = acc(G) = 1$. Since $K_1$ is the unit for the lexicographic product, $acc(K_1 \circ G) = acc(G \circ K_1) = acc(G)$. In this note, we use “$K_3$ subgraph” and “triangle” interchangeably. For additional information regarding clustering coefficient measures, see Newman [12].
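The two definitions above can be sketched directly in code. This is an illustrative plain-Python version, not the Sage routine the note refers to; the helper names `adjacency`, `cc` and `acc` are assumptions, and a vertex of degree less than 2 is assigned coefficient 0, which is a convention assumed here to match the value $acc(K_2) = 0$ used later in the note.

```python
from itertools import combinations

def adjacency(n, edges):
    """Adjacency dict for a simple graph on vertices 0..n-1 (hypothetical helper)."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def cc(adj, x):
    """Vertex clustering coefficient: triangles at x over C(deg(x), 2)."""
    d = len(adj[x])
    if d < 2:
        return 0.0  # assumed convention: no pair of neighbors, so cc = 0
    triangles = sum(1 for u, v in combinations(adj[x], 2) if v in adj[u])
    return triangles / (d * (d - 1) // 2)

def acc(adj):
    """Average of cc(x) over all vertices."""
    return sum(cc(adj, x) for x in adj) / len(adj)

K4 = adjacency(4, combinations(range(4), 2))
P3 = adjacency(3, [(0, 1), (1, 2)])
paw = adjacency(4, [(0, 1), (0, 2), (1, 2), (0, 3)])  # triangle plus a pendant vertex

print(acc(K4))  # 1.0
print(acc(P3))  # 0.0
print(cc(paw, 0), acc(paw))  # vertex 0 has 1 adjacent pair among its 3 neighbor pairs
```

The last example shows that the vertex values can differ widely inside one graph: in the triangle with a pendant vertex, the coefficients are 1/3, 1, 1 and 0, for an average of 7/12.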

3. Preliminaries

This section covers two topics: vertex degree and the $K_3$ subgraph inclusion structure of the product graph vertices. Both topics are essential to the next section, which discusses the two product graph clustering coefficient measures.

3.1. Vertex Degree

The denominator of $cc((g, h))$ depends on the degree of a particular $(g, h)$ in $G \circ H$, where $\deg(g, h) = \deg(h_H) + \deg(g_G) \cdot n_H$ for any $(g, h)$.
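The degree formula can be confirmed by building a small product and comparing every vertex degree against it. The sketch below is illustrative only; `adjacency` and `lex_adj` are hypothetical helper names, and the two factor graphs are arbitrary small examples chosen here.

```python
from itertools import product

def adjacency(n, edges):
    """Adjacency dict for a simple graph on vertices 0..n-1 (hypothetical helper)."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def lex_adj(adjG, adjH):
    """Adjacency dict of the lexicographic product, from the edge rule of Section 2.1."""
    return {
        (g, h): {
            (g2, h2)
            for g2, h2 in product(adjG, adjH)
            if g2 in adjG[g] or (g2 == g and h2 in adjH[h])
        }
        for g, h in product(adjG, adjH)
    }

G = adjacency(3, [(0, 1), (1, 2)])                  # P_3
H = adjacency(4, [(0, 1), (0, 2), (1, 2), (0, 3)])  # triangle plus a pendant vertex
GH = lex_adj(G, H)
n_H = len(H)

# deg(g, h) should equal deg(h_H) + deg(g_G) * n_H at every product vertex.
degree_ok = all(
    len(GH[(g, h)]) == len(H[h]) + len(G[g]) * n_H
    for g, h in GH
)
print(degree_ok)  # True
```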

3.2. Vertex Inclusion in Triangle Subgraph Structure

This note focuses largely on determining the triangle inclusion structure with respect to any vertex in a lexicographic product graph. Let $k_3(x)$ be the $K_3$ inclusion number of a vertex $x \in V(G)$, where $k_3(x)$ is defined as the total number of $K_3$ subgraphs in $G$ that include $x$.
Let $n_1$ and $n_2$ be two graph orders that may, or may not, be equal. Given two complete graphs $K_{n_1}$ and $K_{n_2}$, we know that $K_{n_1} \circ K_{n_2}$ produces the complete graph $K_{n_1 n_2}$, where the triangle structure is already known. As the empty graph $D_n$ has no edges, it follows that $D_{n_1} \circ D_{n_2}$ also has no $K_3$ subgraphs.
Figure 2 displays, on the left, the incident edges for only vertex 11 in $G \circ H$, where $G$ is in gray and the disconnected $H$ is in solid. As shown in this figure, the concept of the four $K_3$ subgraph partitions is used in the proof of Theorem 1. Dotted edges represent spider edges in $N(11)$, while dashed edges are spider edges for vertex 11.
In Figure 2, there are four edge partitions. The partition in the upper right reflects $K_3$ subgraphs formed from a single $H'$ edge and two of 11’s spider edges. The lower left partition shows the triangle in $H_{11}$ formed from two $H_{11}$ edges and one $H_{11}$ edge from $N(11)$. This $K_3$ is not included in any of the other partitions. The bottom middle partition shows $K_3$ subgraphs formed from one $H_{11}$ edge, one 11 spider edge and one spider edge from $N(11)$. For greater clarity, this partition only displays two vertices in $N(11)$. Notice that as $H$ is disconnected, spider edges from vertices 03 and 33 map to 10, but an edge from 11 to 10 does not exist, so there is no triangle here. Lastly, the bottom right partition displays triangles that contain a single 11 spider edge and two spider edges for $20 \in N(11)$. As mentioned in the previous paragraph, the proof of Theorem 1 refers to this figure, thus providing additional explanation.
 Theorem 1.
Let $G$ and $H$ be connected simple graphs. Then the number of $K_3$ subgraphs in $G \circ H$ that include a specific vertex $(g, h)$ is:
$$k_3(g, h) = \deg(g_G) \left[ m_H + \deg(h_H) \cdot n_H \right] + k_3(h_H) + k_3(g_G) \cdot n_H^2 \tag{3}$$
 Proof. 
Suppose $G \circ H$ has finite and simple factor graphs $G$ and $H$. Let $(g, h)$ be any vertex in $G \circ H$. There exist three types of edges with respect to $(g, h)$: the set of $H$ layer edges (edges in the layers $H'$ that do not contain $(g, h)$, plus the $H_{(g,h)}$ edges incident to $(g, h)$), the set of spider edges incident to $(g, h)$, and the spider edges of vertices in $N((g, h))$ that are not incident to $(g, h)$ but are incident to other members of $N((g, h))$. After addressing disconnected factor graphs, we divide this proof into sections based on the $K_3$ subgraph partitions determined by the three edge types, as shown in Figure 2.
The count $k_3((g, h))$ is specific to each vertex $(g, h)$, whether $G \circ H$ is connected or not. In other words, $k_3((g, h))$ is relative to each vertex in each connected component. If $H$ is disconnected, then the number of possible triangles that contain $(g, h)$ is reduced by the absence of an edge, or edges, in each $H$ layer. The situation is similar for a disconnected $G$, which generates a disconnected $G \circ H$.
(1) One $H'$ edge and two $(g, h)$ spider edges:
As operator $\{g, g'\} \in E(G)$ generates edges between adjacent $H$ layers for any $(g, h)$, follow a spider edge $s$ of $(g, h)$ to a neighbor incident to an edge $e$ in an adjacent layer $H'$. Then the other endpoint of $e$ is also a neighbor of $(g, h)$, joined to it by another spider edge $s'$ of $(g, h)$. Denote this path by its edges $(s, e, s')$. Vertex $(g, h)$ has $\deg(g_G)$ adjacent layers, each containing $m_H$ edges, so one can find $\deg(g_G) \cdot m_H$ distinct $(s, e, s')$ paths in this partition. This gives $\deg(g_G) \cdot m_H$ triangles of this type that contain $(g, h)$.
(2) Two $H_{(g,h)}$ edges and one $H_{(g,h)}$ edge from $N((g, h))$:
For $(g, h)$ in the product graph, let $h_H$ be the vertex in $H$ that is in $(g, h)$, and suppose that $h_H$ shares a triangle in $H$ with vertices $y$ and $z$. Then $(g, h)$ shares a triangle in $G \circ H$ with $(g, y)$ and $(g, z)$. In other words, $(g, h)$, $(g, y)$ and $(g, z)$ are all in the same $H_{(g,h)}$ layer, and they are contained in a $K_3$ subgraph in that layer. The triangle on vertex set $\{(g, h), (g, y), (g, z)\}$ consists of two $H_{(g,h)}$ edges incident to $(g, h)$ and the $H_{(g,h)}$ edge $\{(g, y), (g, z)\}$. Thus this triangle is not counted by the first partition. This holds for any number of triangles in $H$ that contain $h_H$, and $k_3(h_H)$ counts all of them by its definition.
(3) One $H_{(g,h)}$ edge, one $(g, h)$ spider edge and one $N((g, h))$ spider edge:
For $(g, h)$, let $y$ be a neighbor of $(g, h)$ in $H_{(g,h)}$. For all vertices $z$ in $N((g, h))$ that are also adjacent to $y$ due to operator $\{g, g'\} \in E(G)$, there exist $((g, h), y, z, (g, h))$ paths. As these paths involve one $H_{(g,h)}$ edge, there are $\deg(h_H)$ choices for $y$, and each adjacent layer supplies $n_H$ choices for $z$. With $\deg(g_G)$ adjacent layers, this results in $\deg(g_G) \cdot \deg(h_H) \cdot n_H$ of these $K_3$ subgraphs that distinctly include $(g, h)$.
(4) Two $(g, h)$ spider edges and one $N((g, h))$ spider edge:
Now let $g_G$, $y$ and $z$ be vertices in $G$, where the set $\{g_G, y, z\}$ forms a triangle in $G$ and $g_G$ is in $(g, h)$. Then there exists a $K_3$ subgraph in $G \circ H$ that contains vertices $(g, h)$, $(y, h)$ and $(z, h)$. Moreover, vertex $(g, h)$ has spider edges not only to $(y, h)$ and $(z, h)$ but also to all vertices in layers $H_y$ and $H_z$. For any $(g, h)$ spider edge to a neighbor in $H_y$, that neighbor has a spider edge to a vertex in $H_z$ that is adjacent to $(g, h)$. Hence, a path exists for each vertex in $H_z$. Each path traces a $K_3$ subgraph formed from two $(g, h)$ spider edges plus one $(z, h_z)$ spider edge to $(y, h_y)$, so all $H$ layer edges are excluded. There are $n_H$ vertices $(z, h_z)$, each of which has $n_H$ spider edges to the vertices in $H_y$, and each is distinct from the others and distinct from the $(g, h)$ to $(z, h_z)$ spider edge. As all previous counts contained at least one $H$ layer edge, these $n_H^2$ triangles are not previously counted by the other terms in equation (3). For a specific vertex $g_G$ in $G$, if $g_G$ is in more than one triangle in $G$, then $k_3(g_G) \cdot n_H^2$ in equation (3) counts the additional triangles in the product graph.
Concerning any vertex $(g, h) \in V(G \circ H)$, there are $\deg(g_G) \cdot m_H$ edges in adjacent layers that lie in triangles with $(g, h)$, and these are counted in partition (1). Of the $\deg(g, h) = \deg(h_H) + \deg(g_G) \cdot n_H$ edges incident to $(g, h)$, the $\deg(g_G) \cdot n_H$ spider edges are all counted in partitions (1), (3) and (4). There are $\deg(g_G) \cdot \deg(h_H) \cdot n_H$ spider edges joining a neighbor of $(g, h)$ in its own layer to a neighbor in an adjacent layer, all of which are counted in partition (3). The factor graph triangles that include $g_G$ and $h_H$ are counted in partitions (4) and (2), respectively. If an edge $e$ has been missed, then $e$ must be in a $G$ layer but not in a triangle in that layer, all of which are counted by equation (3). Therefore, $k_3(g, h) = \deg(g_G) \left[ m_H + \deg(h_H) \cdot n_H \right] + k_3(h_H) + k_3(g_G) \cdot n_H^2$. □
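The four-partition count can be compared against a direct triangle count on small examples. The following brute-force check is a sketch added for illustration, not part of the original proof; all helper names are hypothetical, the factor graphs are arbitrary choices made here, and one of them is disconnected in the spirit of Figure 2. The formula tested is the sum of the four partition counts derived above.

```python
from itertools import combinations, product

def adjacency(n, edges):
    """Adjacency dict for a simple graph on vertices 0..n-1 (hypothetical helper)."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def lex_adj(adjG, adjH):
    """Adjacency dict of the lexicographic product of G and H."""
    return {
        (g, h): {
            (g2, h2)
            for g2, h2 in product(adjG, adjH)
            if g2 in adjG[g] or (g2 == g and h2 in adjH[h])
        }
        for g, h in product(adjG, adjH)
    }

def k3(adj, x):
    """Number of triangles containing x, counted directly."""
    return sum(1 for u, v in combinations(adj[x], 2) if v in adj[u])

def k3_formula(adjG, adjH, g, h):
    """Sum of the four partition counts (1)-(4) from the proof."""
    n_H = len(adjH)
    m_H = sum(len(nbrs) for nbrs in adjH.values()) // 2
    dg, dh = len(adjG[g]), len(adjH[h])
    return dg * m_H + dg * dh * n_H + k3(adjH, h) + k3(adjG, g) * n_H ** 2

factors = [
    adjacency(3, [(0, 1), (1, 2)]),                  # P_3
    adjacency(4, [(0, 1), (1, 2), (2, 3), (3, 0)]),  # 4-cycle
    adjacency(4, combinations(range(4), 2)),         # K_4
    adjacency(4, [(0, 1), (0, 2), (1, 2), (0, 3)]),  # triangle plus pendant
    adjacency(5, [(0, 1), (1, 2), (0, 2), (3, 4)]),  # disconnected example
]

mismatches = []
for G, H in product(factors, repeat=2):
    GH = lex_adj(G, H)
    for g, h in GH:
        if k3(GH, (g, h)) != k3_formula(G, H, g, h):
            mismatches.append((g, h))

print(mismatches)  # an empty list means the formula agreed on every tested vertex
```

A check of this kind only covers the listed examples; it is a sanity test of the counting argument, not a substitute for it.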

4. Clustering Coefficient for Lexicographic Product

We now address the clustering measures for $G \circ H$. Equation (4) gives the clustering coefficient for $(g, h)$, while equation (5) presents $acc(G \circ H)$ utilizing vertex partitions.
Consider $K_1$, which has an average clustering coefficient of zero. As $K_1$ is the unit for the lexicographic product, for any $K_n$ the product $K_1 \circ K_n$ (or $K_n \circ K_1$, due to commutativity of $K_{n_1} \circ K_{n_2}$) results in $K_n$. This produces the vertex clustering coefficient and average clustering coefficient of $K_n$. When $n = 2$, $K_2 = P_2$, which has $acc(P_2) = 0$. Thus $acc(K_1 \circ K_2) = acc(K_2) = 0$.
For $n_1, n_2 \geq 3$, allowing both equality and inequality of $n_1$ and $n_2$, every $K_{n_1} \circ K_{n_2}$ results in $K_{n_1 n_2}$. Thus $acc(K_{n_1} \circ K_{n_2}) = acc(K_{n_1 n_2}) = 1$ when $n_1, n_2 \geq 3$. However, note that $K_2 \circ K_2$ produces $K_4$. Although $acc(K_2) = 0$, this product results in $K_4$ with $acc(K_4) = 1$. In fact, for any $K_2 \circ K_n \cong K_n \circ K_2$ where $n \geq 2$, even though $acc(K_2) = 0$, $acc(K_2 \circ K_n) = 1$.
Now consider $D_{n_1}$, the graph of $n_1$ isolated vertices, and its lexicographic product with $D_{n_2}$, where equality of the orders is permitted. As no edges exist in either factor, the definition of the lexicographic product generates no edges in $D_{n_1} \circ D_{n_2}$. Hence, $acc(D_{n_1} \circ D_{n_2}) = 0$. Also note that $K_1 = D_1$.
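These special cases are easy to reproduce numerically. The sketch below is an illustration with hypothetical helper names (`complete`, `empty`, `lex_adj`, `acc`); as before, it assumes the convention that a vertex of degree less than 2 contributes a clustering coefficient of 0.

```python
from itertools import combinations, product

def complete(n):
    """Adjacency dict of K_n (hypothetical helper)."""
    return {v: set(range(n)) - {v} for v in range(n)}

def empty(n):
    """Adjacency dict of D_n, the graph of n isolated vertices."""
    return {v: set() for v in range(n)}

def lex_adj(adjG, adjH):
    """Adjacency dict of the lexicographic product of G and H."""
    return {
        (g, h): {
            (g2, h2)
            for g2, h2 in product(adjG, adjH)
            if g2 in adjG[g] or (g2 == g and h2 in adjH[h])
        }
        for g, h in product(adjG, adjH)
    }

def acc(adj):
    """Average clustering coefficient; degree < 2 contributes 0 (assumed convention)."""
    total = 0.0
    for x, nbrs in adj.items():
        d = len(nbrs)
        if d >= 2:
            tri = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
            total += tri / (d * (d - 1) // 2)
    return total / len(adj)

K2K2 = lex_adj(complete(2), complete(2))
is_K4 = all(len(nbrs) == len(K2K2) - 1 for nbrs in K2K2.values())

print(is_K4)                                  # True
print(acc(complete(2)), acc(K2K2))            # 0.0 1.0
print(acc(lex_adj(complete(2), complete(3)))) # 1.0
print(acc(lex_adj(empty(2), empty(3))))       # 0.0
```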
The proof of the following theorem follows from the definition of the clustering coefficient given by equation (1) and the proof of equation (3), plus the fact that the maximum $K_3$ inclusion number is $\binom{\deg(g,h)}{2}$.
 Theorem 2.
Suppose $G$ and $H$ are both simple graphs. Then the vertex clustering coefficient $cc(g, h)$ of any vertex $(g, h)$ in $G \circ H$ is:
$$cc(g, h) = \frac{\deg(g_G) \left[ m_H + \deg(h_H) \cdot n_H \right] + k_3(h_H) + k_3(g_G) \cdot n_H^2}{\binom{\deg(g,h)}{2}} \tag{4}$$
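As an illustration that is not part of the original note, consider the product of $G = P_2$ with $H = P_3$ from Figure 1, so that $n_H = 3$, $m_H = 2$, $\deg(g_G) = 1$, and neither factor contains a triangle. Summing the four partition counts from the proof of Theorem 1 and dividing by $\binom{\deg(g,h)}{2}$ gives, for $h_H$ an end vertex of $P_3$ (degree 1):

$$\deg(g, h) = 1 + 1 \cdot 3 = 4, \qquad k_3(g, h) = 1 \cdot 2 + 1 \cdot 1 \cdot 3 + 0 + 0 = 5, \qquad cc(g, h) = \frac{5}{\binom{4}{2}} = \frac{5}{6}$$

and for $h_H$ the middle vertex of $P_3$ (degree 2):

$$\deg(g, h) = 2 + 1 \cdot 3 = 5, \qquad k_3(g, h) = 1 \cdot 2 + 1 \cdot 2 \cdot 3 + 0 + 0 = 8, \qquad cc(g, h) = \frac{8}{\binom{5}{2}} = \frac{4}{5}$$

The product has four vertices of the first kind and two of the second, so every vertex lies in several triangles even though both factors are triangle free.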
Based on the definition of the average clustering coefficient, we give Theorem 3 without proof.
 Theorem 3.
For finite and simple graphs $G$ and $H$, the average clustering coefficient of $G \circ H$ with order $n_G n_H$ over all vertices $(g, h)$ is:
$$acc(G \circ H) = \frac{\sum_{i=1}^{n_G n_H} cc((g, h)_i)}{n_G n_H} \tag{5}$$
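One practical consequence is that the average clustering coefficient of the product can be computed from factor data alone, without constructing the product graph. The sketch below illustrates this with exact rational arithmetic; the function names are hypothetical, the count used is the sum of the four partition counts from the proof of Theorem 1, and a product vertex of degree less than 2 is assumed to have coefficient 0.

```python
from fractions import Fraction
from itertools import combinations, product
from math import comb

def adjacency(n, edges):
    """Adjacency dict for a simple graph on vertices 0..n-1 (hypothetical helper)."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def k3(adj, x):
    """Number of triangles containing x in a factor graph."""
    return sum(1 for u, v in combinations(adj[x], 2) if v in adj[u])

def cc_from_factors(adjG, adjH, g, h):
    """Clustering coefficient of (g, h) using only data from G and H."""
    n_H = len(adjH)
    m_H = sum(len(nbrs) for nbrs in adjH.values()) // 2
    dg, dh = len(adjG[g]), len(adjH[h])
    degree = dh + dg * n_H
    if degree < 2:
        return Fraction(0)  # assumed convention for degree < 2
    triangles = dg * m_H + dg * dh * n_H + k3(adjH, h) + k3(adjG, g) * n_H ** 2
    return Fraction(triangles, comb(degree, 2))

def acc_from_factors(adjG, adjH):
    """Average of the vertex coefficients over all n_G * n_H product vertices."""
    total = sum(cc_from_factors(adjG, adjH, g, h) for g, h in product(adjG, adjH))
    return total / (len(adjG) * len(adjH))

P2 = adjacency(2, [(0, 1)])
P3 = adjacency(3, [(0, 1), (1, 2)])

print(acc_from_factors(P2, P3))  # 37/45
print(acc_from_factors(P3, P2))  # 13/15
print(acc_from_factors(P2, P2))  # 1
```

The first two values differ, which reflects the noncommutativity shown in Figure 1: the two products have the same order but different clustering.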

References

  1. F. Berger (2004) Minimum Cycle Bases of Graphs. Dissertation, Technische Universität München.
  2. G. Chartrand, L. Lesniak, P. Zhang (2016) Graphs & Digraphs, 6th Ed. CRC Press, Boca Raton, FL.
  3. R. J. M. Demalerio, R. G. Eballe (2022) Global clustering coefficient of the products of complete graphs. Asian Research Jrnl. of Math. v.18(6). 62-69.
  4. R. J. M. Demalerio, R. G. Eballe (2022) Clustering coefficient of the tensor product of graphs. Asian Research Jrnl. of Math. v.18(6). 36-42.
  5. R. Hammack, W. Imrich, S. Klavžar (2011) Handbook of Product Graphs, 2nd Ed. CRC Press, Boca Raton, FL.
  6. F. Harary (1959) On the group of the composition of two graphs. Duke Math. Jrnl. v.26(1) March. 29-34.
  7. P.W. Holland, S. Leinhardt (1971) Transitivity in structural models of small groups. Comparative Group Studies v.2(2). 107-124.
  8. M. Hellmuth, P.J. Ostermeier, P.F. Stadler (2012) Minimum cycle bases of lexicographic products. Ars Math. Contemporanea. https://amc-journal.eu. 223-234.
  9. M. M. M. Jaradat (2008) Minimal cycle bases product of graphs. Discuss. Math., Graph Theory v.28. 229-247.
  10. A. Kaveh, R. Mirzaie (2008) Minimal cycle bases of graph products for the force method of frame analysis. Commun. Numer. Methods Eng. v.24. 653-669.
  11. F. Li (2020) On forwarding indices of lexicographic product networks. Concurrency and Computation v.32(23).
  12. M. Newman (2018) Networks, 2nd Ed. Oxford Univ. Press, Oxford, United Kingdom.
  13. G. Sabidussi (1959) The composition of graphs. Duke Math. Jrnl. v.26(4) December. 693-696.
  14. D.J. Watts, S. Strogatz (1998) Collective dynamics of ‘small-world’ networks. Nature v.393(6684). 440-442.
Figure 1. $P_2 \circ P_3$ and $P_3 \circ P_2$ showing noncommutativity of the lexicographic product.
Figure 2. $G \circ H$ with four $K_3$ subgraph partition examples for only vertex 11.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.