Fractal Dimension of the Generalized Z-Entropy of The Rényian Formalism of Stable Queue with Some Potential Applications of Fractal Dimension to Big Data Analytics


This version is not peer-reviewed

Submitted:

26 January 2024

Posted:

29 January 2024

Abstract
In the current work, the fractal dimension of the Generalised Z-Entropy (GZE) of the Rényian formalism of the stable queueing system is examined. Notably, the behaviour of the fractal dimension in response to the GZE parameters is examined through numerical tests. This research generalizes existing results in the literature by fusing fractal geometry and information theory to shed light on how entropy and complexity interact. More fundamentally, the significant role of fractal dimension in advancing Big Data Analytics (BDAs) is highlighted. Closing remarks combined with open problems and the next phase of research are provided.
Subject: Computer Science and Mathematics  -   Geometry and Topology

1. Introduction

The Shannonian entropy [1], namely $H(X)$, reads as:
$$H(X) = -\sum_i p(x_i) \ln p(x_i)$$
where $p(x_i)$ serves as the probability of the $i$th event. This entropy establishes the definition of information in information theory. There are many methods for measuring information, and in this work we examine how entropy and the fractal dimension $D$ interact with one another. $D$ [2,3,4,5,6,7] is a measure that assesses how a fractal pattern expands beyond the area it occupies, indicating the complexity of the pattern in spatial dimensions. Its calculation considers the number of sticks ($N$) required to cover a coastline and the scaling factor ($\varepsilon$). By analyzing these factors, the fractal dimension provides insights into the intricate nature of patterns and their representation of complexity in spatial dimensions.
$$N \propto \varepsilon^{-D}$$
$$\ln N = -D \ln \varepsilon \;\Rightarrow\; D = -\frac{\ln N}{\ln \varepsilon}$$
In [8], a map and rigid sticks were created for an experiment like Richardson's by using Google Earth (GE) pictures combined with GIMP (c.f., Figure 1). The practical application of this technique for measuring the fractal dimension was demonstrated on a part of the Grand Canyon.
The road map of this paper is as follows. Section 2 provides an overview of previous research on deriving fractal dimension using entropy. Section 3 introduces the definition of the Rényi formalism of the $M/G/1$ queue in the stability phase. Section 4 presents new results combined with a numerical analysis that demonstrates the substantial influence of the $M/G/1$ parameters on the behavior of the fractal dimension. Section 5 explores some potential applications of $D$ to BDAs. Finally, closing remarks combined with the next phase of research are provided in Section 6.

2. D of Entropies

In [8], the fractal dimension ($D$) associated with different entropy measures [9,10,11,12] was determined. These derivations were conducted under the assumption that all outcomes have equal probabilities.
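The Shannonian fractal dimension [8] reads:
$$D_s = \lim_{\varepsilon \to 0} \frac{\ln N}{\ln \frac{1}{\varepsilon}}$$
The Rényian dimension of order $q \in (0.5, 1)$ reads [10]:
$$D_R = \lim_{\varepsilon \to 0} \frac{\ln N}{\ln \frac{1}{\varepsilon}}$$
As for the Tsallisian case [8], of order $q \in (0.5, 1)$,
$$D_T = \lim_{\varepsilon \to 0} \frac{\frac{1}{1-q}\left(N^{1-q} - 1\right)}{\ln \frac{1}{\varepsilon}}$$
With $q \in (0.5, 1)$, $D_T > 0$.
The Kaniadakisian fractal dimension [8] for the entropic index $\kappa$ reads as:
$$D_K = \lim_{\varepsilon \to 0} \frac{\frac{1}{2\kappa}\left(N^{\kappa} - N^{-\kappa}\right)}{\ln \frac{1}{\varepsilon}}$$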
Notably, $D_R$, $D_T$ and $D_K$ are affected by the entropic index. The Koch snowflake (KSF) fractal dimension follows by setting $N = 4$ and $\varepsilon = 1/3$.
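As a rough numerical illustration of the expressions above, the following sketch evaluates the Shannonian, Tsallisian and Kaniadakisian ratios at a single finite scale rather than in the limit $\varepsilon \to 0$. The function names and the sample values $q = 0.75$ and $\kappa = 0.5$ are assumptions chosen only for illustration.

```python
import math


def shannon_dimension(n, eps):
    """ln N / ln(1/eps), evaluated at one finite scale eps."""
    return math.log(n) / math.log(1.0 / eps)


def tsallis_dimension(n, eps, q):
    """(N^(1-q) - 1) / ((1 - q) ln(1/eps)), for q != 1."""
    return (n ** (1.0 - q) - 1.0) / ((1.0 - q) * math.log(1.0 / eps))


def kaniadakis_dimension(n, eps, kappa):
    """(N^kappa - N^(-kappa)) / (2 kappa ln(1/eps)), for kappa != 0."""
    return (n ** kappa - n ** (-kappa)) / (2.0 * kappa * math.log(1.0 / eps))


# Koch snowflake scaling: N = 4 copies at scale eps = 1/3.
n_copies, scale = 4, 1.0 / 3.0
print("Shannonian:   ", shannon_dimension(n_copies, scale))
print("Tsallisian:   ", tsallis_dimension(n_copies, scale, q=0.75))      # q is illustrative
print("Kaniadakisian:", kaniadakis_dimension(n_copies, scale, kappa=0.5))  # kappa is illustrative
```

In this sketch the Tsallisian and Kaniadakisian values approach the Shannonian one as $q \to 1$ and $\kappa \to 0$ respectively, which is a convenient sanity check on the formulas.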

3. Rényian Formalism of Stable M / G / 1 Queueing System

Non-extensivity characterizes interactions of long range [13].
The Rényi [9] non-extensive maximum entropy functional reads:
$$H_{q,R}(p) = \frac{c}{1-q} \ln \sum_{i=1}^{N} p_i^{q}$$
for a constant $c > 0$.
Theorem 1 (c.f., [14])
The Rényi non-extensive maximum entropy solution, $p_{q,R}(n)$, for the stable $M/G/1$ queue, subject to the normalization, server utilization and mean queue length constraints, reads:
$$p_{q,R}(n) = \begin{cases} p_{q,R}(0), & n = 0 \\ p_{q,R}(0)\, \tau_s^{1/q}\, x^{n}, & n > 0 \end{cases}$$
such that
$$p_{q,R}(0) = 1 - \rho, \quad \rho \text{ is the server utilization,}$$
where $\tau_s$ and $x$ are given by:
$$\tau_s = \frac{2}{1 + C_s^2}$$
with
$$x = \frac{\rho}{\rho + (1 - \rho)\left(\frac{2}{1 + C_s^2}\right)^{1/q}}$$
and
$$\frac{\rho (1 - x)}{(1 - \rho)\, x} = \tau_s^{1/q}$$
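The state probabilities of Theorem 1 can be tabulated with a short sketch. Two points are assumptions here: the exponent on $\tau_s$ is read as $1/q$, and $C_s^2$ is taken to be the squared coefficient of variation of the service time, as is usual for the $M/G/1$ queue. The function name and the truncation level `n_max` are hypothetical.

```python
def renyi_me_probabilities(rho, cs2, q, n_max):
    """State probabilities p(0..n_max) of the Renyi maximum entropy solution.

    Assumption: the exponent on tau_s is read as 1/q.
    """
    tau_s = 2.0 / (1.0 + cs2)
    g = tau_s ** (1.0 / q)             # assumed exponent 1/q
    x = rho / (rho + (1.0 - rho) * g)  # geometric decay rate of the tail
    p0 = 1.0 - rho                     # server-utilization constraint
    return [p0] + [p0 * g * x ** n for n in range(1, n_max + 1)]


# Case 1 parameters from the text (C_s^2 = 3, rho = 0.5); q = 0.75 is illustrative.
probs = renyi_me_probabilities(rho=0.5, cs2=3.0, q=0.75, n_max=400)
print("p(0) =", probs[0])
print("sum  =", sum(probs))
```

Whatever exponent is used, the normalization constraint holds because $x$ is defined through the same power of $\tau_s$. With the exponent read as $1/q$, setting $q = 1$ gives a mean queue length that matches the Pollaczek-Khinchine value.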
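The Generalized Z-Entropy (GZE) [14] reads as:
$$H_{q,a,b,Z}(p) = Z_{a,b} = \frac{1}{(1-q)(a-b)} \left[ \left( \sum_n p_{q,Z}(n)^{q} \right)^{a} - \left( \sum_n p_{q,Z}(n)^{q} \right)^{b} \right]$$
such that $1 > q > 0.5$, and either $a > 0$, $b \in \mathbb{R}$ or $b > 0$, $a \in \mathbb{R}$, with $a \neq b$.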
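To make the role of $a$ and $b$ concrete, the following sketch evaluates the GZE for a discrete distribution and compares it with two familiar special cases. The helper names and the sample distribution are assumptions. The comparison rests on the algebraic identities that $a = 1$, $b = 0$ gives the Tsallis form and that $b = 0$, $a \to 0$ gives the Rényi form with $c = 1$.

```python
import math


def power_sum(p, q):
    """Sum of p_i^q over the strictly positive probabilities."""
    return sum(pi ** q for pi in p if pi > 0)


def z_entropy(p, q, a, b):
    """Generalized Z-Entropy Z_{a,b} for q != 1 and a != b."""
    s = power_sum(p, q)
    return (s ** a - s ** b) / ((1.0 - q) * (a - b))


def tsallis_entropy(p, q):
    return (power_sum(p, q) - 1.0) / (1.0 - q)


def renyi_entropy(p, q):
    return math.log(power_sum(p, q)) / (1.0 - q)


dist = [0.5, 0.25, 0.125, 0.125]  # illustrative distribution
print("Z(1, 0)     =", z_entropy(dist, 0.75, 1.0, 0.0), "vs Tsallis", tsallis_entropy(dist, 0.75))
print("Z(a->0, 0)  =", z_entropy(dist, 0.75, 1e-8, 0.0), "vs Renyi", renyi_entropy(dist, 0.75))
```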

4. New Results

Theorem 2
Using [14] and (9)-(14), the GZE fractal dimension, $D_{Z_{a,b}}$, is given by:
Preprints 97433 i001
Proof
By the definition,
Preprints 97433 i002
Hence, it follows that:
Preprints 97433 i003
Hence, (15) follows.
Accordingly, let’s discuss the following cases:
Case 1: $C_s^2 = 3$, $\rho = 0.5$. The information-theoretic impact on $D_Z(q, 1, 2, 4, \tfrac{1}{3})$ is visualized by Figure 3.
Approaching the instability zone, $C_s^2 \to 1$:
Preprints 97433 i004
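More fundamentally,
$$\lim_{a \to 0} D_Z\left(q, a, 0, 4, \tfrac{1}{3}\right) = \frac{1}{(1-q) \ln 3} \ln \left[ \rho^{q} \left( \frac{(1-\rho)\, \tau_s^{1/q}}{\rho + (1-\rho)\, \tau_s^{1/q}} \right)^{q} \left( 1 - \left( \frac{\rho}{\rho + (1-\rho)\, \tau_s^{1/q}} \right)^{4q} \right) \right]$$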
For $\tau_s = 0.5 = \rho$, clearly, we have $D_Z(1, 0, 0, 4, \tfrac{1}{3})$. Figure 4, Figure 5 and Figure 6 portray the significant impact of $q$ on $D_Z(q, 1, 2, 4, \tfrac{1}{3})$ for the prescribed values $\tau_s = 0.5 = \rho$.
The impact of the information-theoretic parameter $q$ on $D_Z(q, 0, 0, 4, \tfrac{1}{3})$ is clear: $D_Z(q, 0, 0, 4, \tfrac{1}{3})$ decreases while $q$ is in the extensivity phase and starts to increase drastically when $q$ is non-extensive.
There is no clearly defined scaling dimension, since the Apollonian gasket [15,16,17] is only roughly self-similar. However, any "triangular" region enclosed within three circles resembles a curved Sierpinski gasket. Recall that scaling the Sierpinski gasket by a factor of 2 requires 3 copies of it, as shown by Figure 7 and Figure 8 (c.f., [17]). Consequently, we would expect the Apollonian gasket's fractal dimension to be near:
$$D = \frac{\ln 3}{\ln 2} \approx 1.585$$
Case 2: $C_s^2 = 1$, $\rho = 2$ (instability)
Sierpinski Gasket (SG)
D Z q , 1 , 0 , 3 , 1 2 = 1 q 1 l n 2 2 q 1 q 1 2 2 q + 1
Mathematically speaking, the SG fractal dimension is undefined at many points because it attains complex values at these points, for example:
1 0.55 = c o s π 2 + i s i n π 2 0.55 = c o s 0.55 π 2 + i s i n 0.55 π 2           ( De   Moivre   Theorem )
So, we are in a situation of a complex valued SG fractal dimension. After some mathematical manipulation, one gets:
D Z 0.55 , 1 , 0 , 3 , 1 2 C s 2 = 1 ,   ρ = 2 = 0.4602098599 1.464085696 c o s 0.55 π 2 + i s i n 0.55 π 2   = 0.2136291295 + i 0.00693904008
Notably, the derived values of the SG fractal dimension fluctuate between decreasing and drastically decreasing as $q$ takes sufficiently large values while approaching the extensivity zone, $q > 1$. For $q = 1$, we arrive at an infinite value for the corresponding SG fractal dimension. Clearly, this shows the significant information-theoretic impact in both the non-extensive and extensive phases. This paper provides another revolutionary approach to the traditional definition of both Apollonian and SG dimensions, one that includes several respective parameters, including queueing and information-theoretic parameters.

5. D Applications to BDAs

A data structure called the box locality index (BLI) is used to adapt the box-counting method for fractal dimension calculation [18] for huge data. By encoding the information required for fractal dimension computation, the BLI streamlines the hierarchical structure. Scalable fractal dimension calculation for large data is made possible by the BLI by utilising distributed computing techniques like MapReduce and Spark. This is valuable for a variety of machine learning techniques and data analytics jobs like feature selection and dimensionality reduction.
One methodology that is frequently used to determine the fractal dimension of a dataset is the box-counting method. It involves dividing the dataset's embedding space into a grid of boxes and counting the number of points in each box. By analyzing the relationship between the size of the boxes and the number of points, the fractal dimension can be estimated. This approach is illustrated with three example datasets in Figure 9 [18], where the first and third datasets represent one- and two-dimensional objects, respectively, while the second dataset is a well-known fractal called the Sierpinski triangle, with a fractal dimension of approximately 1.58. The one-dimensional dataset is generated by a mathematical function: it consists of points $y$ calculated using the equation $y = 0.1 + 0.8 \sin(\pi x) + \sigma$, where the $x$ values are uniformly distributed between 0 and 1, and $\sigma$ represents random noise following a normal distribution with mean 0 and standard deviation 0.01. This dataset is an example of a 1D dataset with a sinusoidal pattern and random noise.
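The box-counting idea described above can be sketched in plain Python. This is a minimal illustration rather than the BLI method of [18]. It assumes the common variant that counts occupied boxes at each scale and fits a least-squares line to $\ln(\text{count})$ against $\ln(1/\varepsilon)$. The function names, the fixed random seed, the sample size and the chosen scales are all assumptions.

```python
import math
import random


def fit_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


def box_counting_dimension(points, scales):
    """Estimate the box-counting dimension of a list of equal-length tuples."""
    dims = len(points[0])
    lows = [min(p[d] for p in points) for d in range(dims)]
    # Rescale every axis to [0, 1]; a zero span falls back to 1 to avoid dividing by 0.
    spans = [(max(p[d] for p in points) - lows[d]) or 1.0 for d in range(dims)]
    log_inv_eps, log_counts = [], []
    for eps in scales:
        last = math.ceil(1.0 / eps) - 1  # index of the last box on each axis
        occupied = set()
        for p in points:
            cell = tuple(
                min(int((p[d] - lows[d]) / spans[d] / eps), last)
                for d in range(dims)
            )
            occupied.add(cell)
        log_inv_eps.append(math.log(1.0 / eps))
        log_counts.append(math.log(len(occupied)))
    return fit_slope(log_inv_eps, log_counts)


# Noisy sine dataset from the text: y = 0.1 + 0.8 sin(pi x) + noise, noise ~ N(0, 0.01).
rng = random.Random(0)  # fixed seed, chosen only for reproducibility
xs = [rng.random() for _ in range(5000)]
sine_points = [(x, 0.1 + 0.8 * math.sin(math.pi * x) + rng.gauss(0.0, 0.01)) for x in xs]
scales = [2.0 ** -k for k in range(1, 6)]
print("estimated dimension:", box_counting_dimension(sine_points, scales))
```

The estimate depends on the range of scales: boxes much smaller than the noise level start to resolve the noise rather than the curve, so the scales here are kept fairly coarse.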
In [19], the challenge of online clustering in high-dimensional data and the limitations of existing algorithms in handling this task are thoroughly discussed. Therefore, [19] proposed a novel approach called FractStream to discover core fractal clusters, progressive fractal clusters, and outlier fractal clusters using fractal dimension, basic window technology, and a damped window model. The proposed technique [19] aimed to reduce search complexity, execution time, and memory usage, and its effectiveness and efficiency are demonstrated through experimental studies on various datasets.
The construction of a multi-layered nested grid structure for determining the fractal dimension of a dataset is presented in [19]. The fractal dimension is calculated by counting the number of data points within each grid cell of the lowest layer of the grid structure. Additionally, a sliding window model is used for computing cluster partitions on evolving data streams, as described by Figure 10 [19].
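The difference between basic (non-overlapping) windows and a sliding window over a stream can be sketched generically as follows. This is not the FractStream implementation of [19]. The generator names are hypothetical, and dropping a trailing partial basic window is a design choice made only for this illustration.

```python
from collections import deque


def basic_windows(stream, size):
    """Yield consecutive non-overlapping windows; a trailing partial window is dropped."""
    window = []
    for item in stream:
        window.append(item)
        if len(window) == size:
            yield window
            window = []


def sliding_windows(stream, size):
    """Yield the most recent `size` items each time a new item arrives."""
    window = deque(maxlen=size)  # the oldest item is evicted automatically
    for item in stream:
        window.append(item)
        if len(window) == size:
            yield list(window)


for w in basic_windows(range(7), 3):
    print("basic  ", w)
for w in sliding_windows(range(5), 3):
    print("sliding", w)
```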
In the given context, a clustering algorithm based on correlation fractal dimension for an evolving data stream was developed [19]. The algorithm involves inserting points into existing clusters based on their relative change in fractal dimension, creating new progressive fractal clusters if certain conditions are met, and creating outlier fractal clusters for points that do not fit into existing clusters. The weight of the clusters is periodically checked, and if a progressive fractal cluster's weight falls below a threshold, it is deleted to make space for new clusters.
The behavior of clusters changes over time when performing online clustering with a window of 1000 data points. This evolution of data can be segmented into intervals, as shown in Figure 11 [19], to analyze the changes in cluster composition.
In the area of big data applications, disturbances like COVID-19, pollution, or policy changes have a huge effect on economic and financial systems [20]. For expanding the use of big data in financial and economic systems, it is imperative to investigate how these disruptions affect associated time series. The complexity of these time series is analysed using the Generalised Weierstrass-Mandelbrot Function (GWMF) [20], which demonstrates how disturbances in the form of exponential functions can produce multifractal characteristics. Additionally, the model replicates long memory and irregularity, which are evaluated by multifractal analysis and the R/S approach.
Research on how disturbances affect time series produced by the real part of the GWMF, namely C(t, μ), and on how to replicate multifractal features in time series, is scarce [19]. Furthermore, there is little theoretical evidence to support the claim that time series produced by the WMF naturally possess the fractal dimension D. Consequently, more statistical examination is required to determine the connection between disturbances and the nonlinear properties of time series produced by C(t, μ), including multifractal analysis and the Hurst exponent.
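For orientation, the classical Weierstrass-Mandelbrot cosine series can be evaluated with a truncated sum, as sketched below. This is the standard undisturbed form with zero phases, not the generalised model with disturbance studied in [20]. The function name, the truncation level and the values $D = 1.5$ and $\gamma = 1.5$ are assumptions for illustration.

```python
import math


def wm_cosine(t, dimension=1.5, gamma=1.5, n_terms=60):
    """Truncated classical Weierstrass-Mandelbrot cosine series.

    C(t) = sum over n of (1 - cos(gamma^n t)) / gamma^((2 - D) n),
    with 1 < D < 2 and gamma > 1; n runs from -n_terms to n_terms.
    """
    h = 2.0 - dimension
    total = 0.0
    for n in range(-n_terms, n_terms + 1):
        total += (1.0 - math.cos(gamma ** n * t)) / gamma ** (h * n)
    return total


# Sample the series on a coarse time grid to obtain a short time series.
series = [wm_cosine(0.01 * k) for k in range(1, 11)]
print(series)
```

The untruncated series satisfies the scaling relation $C(\gamma t) = \gamma^{2-D} C(t)$, which the truncated sum reproduces approximately and which offers a quick check of an implementation.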
The effect of disturbances on the statistical characteristics of time series produced by the GWMF is examined in [20]. This study demonstrates that disturbances, including feedback terms, can play a complex role in changing the structure of nonlinear time series, shifting from a single fractal dimension to long memory and multifractal features, defying the widely held belief that disturbances have no effect on statistical results. These results emphasize how crucial it is for those working in the financial and economic sectors to comprehend the basic theory of time series, especially when it comes to big data applications [20].

6. Closing Remarks Combined with Open Problems and the Next Phase of Research

This paper explores the relationship between $D_Z(q, a, b, N, \varepsilon)$ and the information-theoretic and queueing parameters. Numerical experiments analyze the behavior of the derived fractal dimension, providing evidence that this work advances the unification of information theory and fractal geometry.
An explanation is given to confirm the influential role of fractal dimension in developing and revolutionizing BDAs. The current paper has several emerging open problems.

Open Problem One

Based on the findings of this paper, is it feasible to extend this approach further to find the fractal dimension theory of Ismail's Entropy, namely IE (c.f., [21,22]), which is by default the ultimate generalization of numerous entropies in the literature?

Open Problem Two

Provided that open problem one can be solved, can we find any mathematical approach to decide the threshold of the involved universal parameters of IE? If so, what will be the expected form of the mathematical relations involved?

Open Problem Three

Can we extend the case to investigate the possible applicability of other fractal dimensions in the literature, such as those of the Sierpinski Gasket and the Koch Snowflake?
Future research aims to determine the fractal dimensions of other entropies in the literature and compare them, to further advance the field of Information Theoretic Fractal Geometry (ITFG). Notably, the search is ongoing for solutions to the open problems stated above.

Funding

This research received no external funding.

References

  1. I. A. Mageed, Q. Zhang, and B. Modu, “The Linearity Theorem of Rényian and Tsallisian Maximum Entropy Solutions of The Heavy-Tailed Stable M/G/1 Queueing System entailed with Potential Queueing-Theoretic Applications to Cloud Computing and IoT,” electronic Journal of Computer Science and Information Technology, vol. 9, no. 1, 2023, p. 15-23.
  2. I. A. Mageed et al, “Generalization of Renyi’s Entropy and its Application in Source Coding,” Appl. Math. Inf. Sci., vol. 17, no. 5, 2023, p. 941-948.
  3. L. Zhou, C. R. Johnson, and D. Weiskopf, “Data-driven space-filling curves,” IEEE Transactions on Visualization and Computer Graphics, vol. 27, no. 2, 2020, p. 1591-1600.
  4. W. J. Wang et al, “Fractal growth of giant amphiphiles in Langmuir-Blodgett films,” Chinese Journal of Polymer Science, vol. 40, no. 6, 2022, p. 556-66. [CrossRef]
  5. I. A. Mageed and Q. Zhang, “Formalism of the Rényian Maximum Entropy (RMF) of the Stable M/G/1 queue with Geometric Mean (GeoM) and Shifted Geometric Mean (SGeoM) Constraints with Potential GeoM Applications to Wireless Sensor Networks (WSNs),” electronic Journal of Computer Science and Information Technology, vol. 9, no. 1, 2023, p. 31-40.
  6. G. L. Gao et al, “Do the global grain spot markets exhibit multifractal nature?” Chaos, Solitons & Fractals, vol. 164, 2022, p. 112663.
  7. S. R. Nayak and J. Mishra, “Analysis of medical images using fractal geometry,” In Research Anthology on Improving Medical Imaging Techniques for Analysis and Intervention, IGI Global, 2023, p. 1547-1562.
  8. T. Zhao, Z. Li, and Y. Deng, “Information fractal dimension of Random Permutation Set,” Chaos, Solitons & Fractals, vol. 174, 2023, p. 113883. [CrossRef]
  9. I. A. Mageed and Q. Zhang, “Threshold Theorems for the Tsallisian and Rényian (TR) Cumulative Distribution Functions (CDFs) of the Heavy-Tailed Stable M/G/1 Queue with Tsallisian and Rényian Entropic Applications to Satellite Images (SIs),” electronic Journal of Computer Science and Information Technology, vol. 9, no. 1, 2023, p. 41-27.
  10. F. Pons, G. Messori, and D. Faranda, “Statistical performance of local attractor dimension estimators in non-Axiom A dynamical systems,” Chaos: An Interdisciplinary Journal of Nonlinear Science, vol. 33, no. 7, 2023. [CrossRef]
  11. I. A. Mageed and Q. Zhang, “An Introductory Survey of Entropy Applications to Information Theory, Queuing Theory, Engineering, Computer Science, and Statistical Mechanics,” In 2022 27th IEEE International Conference on Automation and Computing (ICAC), 2022, p. 1-6.
  12. I. A. Mageed and Q. Zhang, “Inductive Inferences of Z-Entropy Formalism (ZEF) Stable M/G/1 Queue with Heavy Tails,” In 2022 IEEE 27th International Conference on Automation and Computing (ICAC), 2022, p. 1-6.
  13. I. A. Mageed, Q. Zhang, and B. Modu, “The Linearity Theorem of Rényian and Tsallisian Maximum Entropy Solutions of The Heavy-Tailed Stable M/G/1 Queueing System entailed with Potential Queueing-Theoretic Applications to Cloud Computing and IoT,” electronic Journal of Computer Science and Information Technology, vol. 9, no. 1, 2023, p. 15-23.
  14. I. A. Mageed and A. H. Bhat, “Generalized Z-Entropy (Gze) and Fractal Dimensions,” Appl. Math, vol. 16, no. 5, 2022, p. 829-834. [CrossRef]
  15. H. Deng, W. Wen, and W. Zhang, “Analysis of Road Networks Features of Urban Municipal District Based on Fractal Dimension,” ISPRS International Journal of Geo-Information, vol. 12, no. 5, 2023, p. 188. [CrossRef]
  16. I. A. Mageed, “The Entropian Threshold Theorems for the Steady State Probabilities of the Stable M/G/1 Queue with Heavy Tails with Applications of Probability Density Functions to 6G Networks,” electronic Journal of Computer Science and Information Technology, vol. 9, no. 1, 2023, p. 24-30.
  17. D. Lippman, “Mathematics for the Liberal Arts,” Lumen. Available online at: https://courses.lumenlearning.com/wmopen-mathforliberalarts/chapter/introduction-fractals-generated-by-complex-numbers/. [Last accessed on 20 August].
  18. R. Liu, R. Rallo, and Y. Cohen, “Fractal dimension calculation for big data using box locality index,” Annals of Data Science, 2018, p. 549-63. [CrossRef]
  19. Yarlagadda, M. V. Jonnalagedda, and K. Munaga, “Clustering based on correlation fractal dimension over an evolving data stream,” Int. Arab J. Inf. Technol., vol. 1, no. 15, 2018, p. 1-9.
  20. L. Zhang, “Generalized Weierstrass-Mandelbrot with Disturbance for Big Data Applications in Economic and Financial Systems,” In 2023 IEEE 8th International Conference on Big Data Analytics (ICBDA), 2023, p. 53-56.
  21. I. A. Mageed and Q. Zhang, “An Information Theoretic Unified Global Theory For a Stable M/G/1 Queue With Potential Maximum Entropy Applications to Energy Works,” In 2022 IEEE Global Energy Conference (GEC), 2022, p. 300-305.
  22. D. D. Kouvatsos and I. A. Mageed, “Non-Extensive Maximum Entropy Formalisms and Inductive Inferences of Stable M/G/1 Queue with Heavy Tails,” in “Advanced Trends in Queueing Theory,” Vladimir Anisimov and Nikolaos Limnios (eds.), Books in ‘Mathematics and Statistics’, Sciences by ISTE & J. Wiley, London, UK, vol. 2, 2021.
Figure 1. An example of satellite images from Google Earth, specifically showing a portion of the Grand Canyon in Arizona. The mention of "rigid sticks" refers to the creation of visual elements using GIMP, a software for image editing and manipulation [8]. The visualization of how N, D and ε are correlated is illustrated by Figure 2.
Figure 2. The correlation between N, D and ε.
Figure 3. The influence of $q$ on $D_Z(q, 1, 2, 4, \tfrac{1}{3})$.
Figure 4.
Figure 5.
Figure 6.
Figure 7. Bubbles are arranged in a fractal way to form foam [17].
Figure 8. A fractal that can be used to simulate soap bubble foam is the Apollonian gasket [17].
Figure 9. Box-counting plots for three example datasets. The first dataset (1D) is generated by a mathematical function involving sine and random noise. The second dataset is the well-known Sierpinski triangle, and the third dataset (2D) is a uniformly distributed two-dimensional dataset. The box-counting plots help estimate the fractal dimensions of these datasets by analyzing the slopes of the fitted lines.
Figure 10. In the context of data analytics and machine learning, sliding window and basic windows are techniques used for analyzing data streams or time series data. A sliding window refers to a fixed-size window that moves along the data stream, allowing for continuous analysis of a subset of data. On the other hand, basic windows are non-overlapping windows of fixed size that partition the data stream into distinct segments for analysis. These techniques are commonly employed to extract meaningful patterns and insights from streaming or time-dependent data.
Figure 11. The evolution of clusters over time in the context of online clustering. Initially, there are 5 clusters in a steady state, and as the stream progresses, data points are added to different clusters, resulting in the formation of new clusters and changes in the existing ones. The evaluation of clustering quality is done using the average purity of clusters, which measures the agreement between the cluster labels and the true labels of the data.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.