Unit Exponential Probability Distribution: Characterization and Applications in Environmental and Engineering Data Modelling

Distributions with bounded support are considerably scarcer in the literature than those with unbounded support, even though many real-world observations take values in a bounded range (proportions, percentages, and fractions are typical examples). For proportion modelling, a flexible two-parameter family of distributions associated with the exponential distribution is proposed here. Mathematical and statistical properties of the new distribution are examined, including its quantiles, mode, moments, hazard rate function, and characterizations. The parameters are estimated by the maximum likelihood method, and applications to environmental and engineering data are considered. Several goodness-of-fit tests and information criteria are used to assess how well the model fits the data. The proposed model provides the best fit in most cases for the datasets considered.

Keywords:

Subject: Computer Science and Mathematics - Probability and Statistics

MSC: 60E05; 62E15; 62F10

1. Introduction

Proportional variables are often encountered in data science, where they describe, for instance, the number of successes divided by the number of attempts, party vote shares, the proportion of money spent on a cause, or the attendance rate of public events. Proportion analysis is therefore necessary in various fields such as healthcare, economics, and engineering, among many others. To model the behaviour of such random variables (RVs), distributions defined on the unit interval are usually used, as they are well suited to proportions and percentages. Such variables can be modelled and forecast, but one must look beyond the traditional models because the data are limited to the range

(0, 1)

. For further study, readers are referred to [1,2,3].

In this context, the beta model proposed by Bayes [4] is a convenient and widely used model for percentages and proportions in many fields of statistics. However, there are a number of scenarios where it does not seem to be suitable. Therefore, several alternative distributions have been developed for modelling bounded variables such as proportions, indices and rates, for instance the unit distribution studied in [5], the unit Johnson distribution proposed in [6], the four-parameter distribution introduced in [7], the distribution proposed in [8], the Topp-Leone distribution studied in [9], and the unit gamma distribution introduced in [10]. More recently, many other unit interval distributions have been introduced, for instance the cumulative distribution function (CDF) quantile distribution [11], a new unit interval distribution [12], the unit-inverse Gaussian distribution [13], the log-xgamma distribution [14], the unit Gompertz, unit Lindley and unit Weibull distributions [15,16,17], the log-weighted exponential distribution [18], the unit Johnson SU distribution [19], the unit log–log distribution [20], and the new unit distribution [21]. All of these distributions are potential candidates for describing proportions. It is worth noting that the approaches mentioned above are mainly based on conventional strategies, namely:

i) log transformation approaches,

ii) CDF and quantile methodology,

iii) reciprocal transformation, and

iv) T-X family approach.

However, all of the earlier models seem to be rather ad hoc ways of generating unit interval distributions. In the current study, our motivation begins with recalling the epsilon function examined in [22], which is defined as

ε_{λ, a} (x) = \{\begin{matrix} {(\frac{a + x}{a - x})}^{\frac{λ a}{2}}, & - a < x < a \\ 0, & otherwise, \end{matrix}

(1)

where

λ \in R ∖ {0}

and

a > 0

. The function

y = ε_{λ, a} (x)

is a solution of the first-order epsilon differential equation:

y^{'} = \frac{λ a^{2} y}{a^{2} - x^{2}},

and it satisfies the following property of the exponential limit:

lim_{a \to + \infty} ε_{λ, a} (x) = e^{λ x}, \forall x \in (- a, + a) .

Further, it is also related to the CDF class proposed in [7], which is based on the exponential function. However, the unit interval variants proposed there differ from the design of our CDF. As will be seen, the distribution proposed here is much more flexible, and exhibits both positive and negative skewness. Moreover, as will be seen below, the hazard rate function (HRF) of the proposed model exhibits a purely increasing failure rate (IFR) behaviour for all values of

λ > 0

, and thus the model belongs to the decreasing mean residual life (DMRL) class.

The rest of the manuscript is organized as follows. In the next section, the basic stochastic properties of the proposed distribution are presented. The mode, quantiles, HRF, and characterization of the new distribution, among other properties, are examined. Section 3 shows the procedure for estimating the parameters of the proposed distribution using the maximum likelihood (ML) method. Applications to a number of real-world datasets are given in Section 4, while the last section provides some concluding remarks.

2. The Proposed Unit Exponential Distribution (UED)

Let X be a bounded RV and, without loss of generality, assume that X takes values in the unit interval

[0, 1]

. Also, suppose that the CDF of RV X is defined by the following equality:

F (x) = \{\begin{matrix} 1 - exp [α (1 - {(\frac{1 + x}{1 - x})}^{β})], & 0 \leq x < 1; \\ 1, & x = 1; \end{matrix}

(2)

where

α, β > 0

. The CDF given by Equation (2) is called the unit-exponential distribution (with the parameters

α

and

β

), and referred to as

U E D (α, β)

. Note that the UED is related to the epsilon function defined in Equation (1). Indeed, when taking

a = 1

and

β = λ / 2

, Equation (2) becomes:

F (x) = 1 - exp [α (1 - ε_{2 β, 1} (x))],

when

0 \leq x < 1

. By differentiating the CDF given by Equation (2), the probability density function (PDF) of the UED, when

0 \leq x < 1

, can be easily obtained as:

f (x) = \frac{2 α β}{1 - x^{2}} {(\frac{1 + x}{1 - x})}^{β} \bar{F} (x) .

(3)

Here,

\bar{F} (x) = 1 - F (x)

is the tail of the CDF

F (x)

. Notice that the UED has two parameters

α, β > 0

, one of which acts like a dispersion parameter and the other like a shape parameter. Also, this PDF structure is similar to one of the simpler forms of the so-called proper dispersion models introduced in [7], but it does not belong to that class.
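As a minimal sketch of Equations (2) and (3), the CDF and PDF can be evaluated as follows. The function names `ued_cdf` and `ued_pdf`, and the overflow threshold of 700 used to keep the exponential finite, are illustrative assumptions rather than part of the original presentation.

```python
import math

def _log_power_term(x, beta):
    """Return beta * ln((1 + x) / (1 - x)), the logarithm of the power term."""
    return beta * math.log((1.0 + x) / (1.0 - x))

def ued_cdf(x, alpha, beta):
    """CDF of UED(alpha, beta) as in Equation (2)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    t = _log_power_term(x, beta)
    if t > 700.0:  # assumed guard: exp(t) would overflow, the tail is numerically zero
        return 1.0
    # 1 - exp(a) computed as -expm1(a) for accuracy when a is close to zero
    return -math.expm1(alpha * (1.0 - math.exp(t)))

def ued_pdf(x, alpha, beta):
    """PDF of UED(alpha, beta) as in Equation (3), for 0 <= x < 1."""
    if x < 0.0 or x >= 1.0:
        return 0.0
    t = _log_power_term(x, beta)
    if t > 700.0:  # same assumed guard as above
        return 0.0
    u = math.exp(t)
    return 2.0 * alpha * beta / (1.0 - x * x) * u * math.exp(alpha * (1.0 - u))
```

A simple sanity check is that a central finite difference of `ued_cdf` agrees with `ued_pdf` at interior points.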

2.1. Properties of the Model

In practice, it is required that the proposed UED, whose PDF is defined by Equation (3), be flexible enough to describe the data adequately. In this regard, it can exhibit both negative and positive skewness over the parameter range

α > 0

and

β > 0

. The flexibility of the UED is illustrated in Figure 1, which shows various shapes of the PDF for different values of the parameters

α

and

β > 0

. These plots show the different skewness possibilities and the existence of modes of the UED that can be used to fit some real-world datasets.

2.1.1. Quantile

As a first property, the quantile function of the UED is quite manageable. By inverting the CDF

F (x)

, given by Equation (2), the quantile function is determined as:

Q (y) = F^{- 1} (y) = \frac{{(1 - ln (1 - y) / α)}^{1 / β} - 1}{{(1 - ln (1 - y) / α)}^{1 / β} + 1}, y \in (0, 1) .

Thanks to this function, the median of the UED is given by

M_{e} = Q (1 / 2) = \frac{{(1 + ln 2 / α)}^{1 / β} - 1}{{(1 + ln 2 / α)}^{1 / β} + 1} .

Using

Q (y)

, we are able to define various measures of skewness and kurtosis, as well as important actuarial measures (see, e.g., [23] and [2]).
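Because the quantile function is available in closed form, inverse-transform sampling is straightforward. The sketch below is illustrative only: the names `ued_quantile` and `ued_sample` and the `seed` argument are assumptions, and very small values of β may overflow the power term.

```python
import math
import random

def ued_quantile(y, alpha, beta):
    """Quantile function Q(y) of UED(alpha, beta) for 0 <= y < 1."""
    # log1p(-y) equals ln(1 - y) and is more accurate for small y
    w = (1.0 - math.log1p(-y) / alpha) ** (1.0 / beta)
    return (w - 1.0) / (w + 1.0)

def ued_sample(n, alpha, beta, seed=None):
    """Draw n pseudo-random values by applying Q to uniform variates."""
    rng = random.Random(seed)
    return [ued_quantile(rng.random(), alpha, beta) for _ in range(n)]
```

The median is then simply `ued_quantile(0.5, alpha, beta)`.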

2.1.2. Mode

Figure 1 suggests that the PDF of the proposed model has at most one mode. This is made precise by the following result.

Proposition 1.

The PDF

f (x)

, given by Equation (3), has a unique mode if and only if

0 < α < 1

. Otherwise, the UED does not have any modes.

Proof.

A mode of the PDF

f (x)

is a solution of the equation

f^{'} (x) = 0

, which after certain calculations and simplification becomes:

x + β - α β {(\frac{1 + x}{1 - x})}^{β} = 0 .

(4)

If we denote by

ψ (x)

the left-hand side of Equation (4), it is easily obtained:

lim_{x \to 1^{-}} ψ (x) < 0 and lim_{x \to 0^{+}} ψ (x) = β (1 - α) .

Obviously, inequalities

0 < α < 1

and

β > 0

give

β (1 - α) > 0

. Then, since ψ (x) changes sign on (0, 1), Equation (4) has at least one real solution there, which guarantees that

f (x)

has at least one mode. Next, the function

ψ (x)

defined above has derivative:

ψ^{'} (x) = 1 - \frac{2 α β^{2}}{1 - x^{2}} {(\frac{1 + x}{1 - x})}^{β} .

Note that

ψ^{'} (x)

is strictly decreasing because:

ψ^{''} (x) = - \frac{4 α β^{2} (x + β)}{{(1 - x^{2})}^{2}} {(\frac{1 + x}{1 - x})}^{β} < 0 .

This fact then implies that the previously detected mode is unique. □
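The proof above suggests a simple numerical recipe: for 0 < α < 1 the function ψ is positive at 0 and negative near 1, so its root can be located by bisection. The sketch below follows Proposition 1 as stated; the names `psi` and `ued_mode`, and the overflow guard, are illustrative assumptions.

```python
import math

def psi(x, alpha, beta):
    """Left-hand side of Equation (4); its root in (0, 1) is the mode."""
    t = beta * math.log((1.0 + x) / (1.0 - x))
    if t > 700.0:  # assumed guard: the power term dominates, psi is hugely negative
        return -math.inf
    return x + beta - alpha * beta * math.exp(t)

def ued_mode(alpha, beta, max_iter=200):
    """Bisection for the root of psi on (0, 1); returns None unless 0 < alpha < 1."""
    if not 0.0 < alpha < 1.0:
        return None
    lo, hi = 0.0, 1.0  # psi(0) = beta * (1 - alpha) > 0 and psi(1-) < 0
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if mid <= lo or mid >= hi:  # the interval cannot be split any further
            break
        if psi(mid, alpha, beta) > 0.0:
            lo = mid
        else:
            hi = mid
    return lo
```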

2.1.3. Behaviour of the PDF at $x \to 0^{+}$ and $x \to 1^{-}$

The behaviour of the PDF

f (x)

at the ends of the unit interval, that is when

x \to 0^{+}

and

x \to 1^{-}

, indicates whether

f (x)

converges at these limits. In terms of data modelling, this behaviour reflects how the model treats observations near the extremes of the range. At the limit

x \to 0^{+}

, according to Equations (2) and (3), it is easily obtained:

lim_{x \to 0^{+}} f (x) = 2 α β .

On the other hand, to analyze the limit of

f (x)

as

x \to 1^{-}

, we consider the function

ln f (x)

, which can be written as

\begin{matrix} ln f (x) & = & ln (2 α β) + (β - 1) ln (1 + x) - (β + 1) ln (1 - x) + α (1 - {(\frac{1 + x}{1 - x})}^{β}) \\ = & \frac{1}{{(1 - x)}^{β}} ({(1 - x)}^{β} (ln (2 α β) + (β - 1) ln (1 + x) - (β + 1) ln (1 - x) + α) \\ - α {(1 + x)}^{β}) . \end{matrix}

Hence, we get:

lim_{x \to 1^{-}} {(1 - x)}^{β} ln f (x) = - α 2^{β},

which implies that the density, and hence the frequency of observed data, decays at an exponential rate as

x \to 1^{-}

.

2.1.4. Moments

Let X be a RV with the CDF given by Equation (2). Then, the

r^{t h}

moment of X, using partial integration, can be expressed as follows:

\begin{matrix} E (X^{r}) & = \int_{0}^{1} x^{r} d F (x) = \int_{1}^{0} x^{r} d (1 - F (x)) = r \int_{0}^{1} x^{r - 1} (1 - F (x)) d x \\ = r exp (α) \int_{0}^{1} x^{r - 1} exp [- α {(\frac{1 + x}{1 - x})}^{β}] d x . \end{matrix}

This integral can be determined numerically with the use of any software. The following result proposes a series expansion of

E (X^{r})

that can be used for numerical approximation.

Proposition 2.

The

r^{t h}

moment of X can be expanded as:

\begin{matrix} E (X^{r}) & = \frac{2 r exp (α)}{β} \sum_{k = 0}^{r - 1} \sum_{ℓ = 0}^{+ \infty} (\binom{r - 1}{k}) (\binom{- (r + 1)}{ℓ}) {(- 1)}^{k} α^{(k + ℓ + 1) / β} Γ (- \frac{k + ℓ + 1}{β}, α), \end{matrix}

where

Γ (a, x)

denotes the upper incomplete gamma function, i.e.,

Γ (a, x) = \int_{x}^{+ \infty} t^{a - 1} exp (- t) d t

Proof.

By applying the change of variable

y = (1 + x) / (1 - x)

, we have:

\begin{matrix} E (X^{r}) = 2 r exp (α) \int_{1}^{+ \infty} \frac{{(y - 1)}^{r - 1}}{{(y + 1)}^{r + 1}} exp (- α y^{β}) d y . \end{matrix}

(5)

Then, using the generalized version of the binomial formula two times in a row, since

y > 1

, we get:

\begin{matrix} \frac{{(y - 1)}^{r - 1}}{{(y + 1)}^{r + 1}} & = y^{- 2} \frac{{(1 - 1 / y)}^{r - 1}}{{(1 + 1 / y)}^{r + 1}} \\ = y^{- 2} [\sum_{k = 0}^{r - 1} (\binom{r - 1}{k}) {(- 1)}^{k} y^{- k}] [\sum_{ℓ = 0}^{+ \infty} (\binom{- (r + 1)}{ℓ}) y^{- ℓ}] \\ = \sum_{k = 0}^{r - 1} \sum_{ℓ = 0}^{+ \infty} (\binom{r - 1}{k}) (\binom{- (r + 1)}{ℓ}) {(- 1)}^{k} y^{- (k + ℓ + 2)} . \end{matrix}

(6)

Also, by the change of variable

z = α y^{β}

, it is obtained:

\begin{matrix} \int_{1}^{+ \infty} y^{- (k + ℓ + 2)} exp (- α y^{β}) d y & = \frac{α^{(k + ℓ + 1) / β}}{β} \int_{α}^{+ \infty} z^{- (k + ℓ + 1) / β - 1} exp (- z) d z \\ = \frac{α^{(k + ℓ + 1) / β}}{β} Γ (- \frac{k + ℓ + 1}{β}, α) . \end{matrix}

(7)

Therefore, by substituting Equations (6) and (7) in Equation (5), as well as by interchanging the order of integration and summation, the desired result is obtained. □
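As noted above, the moments can also be obtained by direct numerical integration of the survival form of the moment integral. The following sketch uses a composite Simpson rule; the helper names `ued_survival`, `simpson` and `ued_moment`, the number of subintervals, and the overflow guard are all illustrative assumptions, and any quadrature routine could be used instead.

```python
import math

def ued_survival(x, alpha, beta):
    """Tail 1 - F(x) of UED(alpha, beta) on [0, 1]."""
    if x >= 1.0:
        return 0.0
    t = beta * math.log((1.0 + x) / (1.0 - x))
    if t > 700.0:  # assumed guard against overflow in exp(t)
        return 0.0
    return math.exp(alpha * (1.0 - math.exp(t)))

def simpson(g, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(a + i * h)
    return total * h / 3.0

def ued_moment(r, alpha, beta, n=2000):
    """E(X^r) = r * integral over [0, 1] of x^(r-1) (1 - F(x)), integer r >= 1."""
    return r * simpson(lambda x: x ** (r - 1) * ued_survival(x, alpha, beta),
                       0.0, 1.0, n)
```

The mean and variance then follow as `ued_moment(1, ...)` and `ued_moment(2, ...) - ued_moment(1, ...) ** 2`.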

2.1.5. Failure (Hazard) Rate Function

The HRF of the UED is given by:

h (x) = \frac{f (x)}{\bar{F} (x)} = \frac{2 α β}{1 - x^{2}} {(\frac{1 + x}{1 - x})}^{β} .

(8)

When

x \to 0^{+}

, the limit of

h (x)

is

2 α β > 0

, and when

x \to 1^{-}

, the limit is

+ \infty

. Moreover, this function is strictly increasing, as can be seen in Figure 2, meaning that the rate at which an engineered system or component fails increases with x.
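The monotonicity of the HRF can be checked directly from Equation (8). The short derivation below is a sketch added for completeness and uses only the symbols already defined.

```latex
\ln h(x) = \ln(2\alpha\beta) + (\beta - 1)\ln(1 + x) - (\beta + 1)\ln(1 - x),
\qquad
\frac{h'(x)}{h(x)} = \frac{\beta - 1}{1 + x} + \frac{\beta + 1}{1 - x}
                   = \frac{2\,(x + \beta)}{1 - x^{2}} > 0,
\quad 0 \le x < 1 .
```

Since h(x) > 0, it follows that h'(x) > 0 on [0, 1) for all α, β > 0.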

2.2. Characterizations

To interpret the HRF realistically, we characterize the PDF in Equation (3) by its hazard and mean residual life functions. In general terms, a characterization states that, under certain conditions, a family of distributions is the only one possessing a designated property. Characterizations help researchers identify the actual probability distribution; for a detailed study, readers are referred to Ahsanullah et al. [24,25] and Hamedani [26]. In this regard, we characterize the proposed model by the HRF and truncated moments, and the characterizing conditions are as follows.

Proposition 3.

The RV

X : Ω ⟶ (0, + \infty)

has continuous PDF

f (x)

if and only if the HRF

h (x)

satisfies the following equation:

\frac{f^{'} (x)}{f (x)} = \frac{h^{'} (x)}{h (x)} - h (x) .

(9)

Proof.

According to definition of the HRF, given by the first equality in Equation (8), it follows:

\frac{h^{'} (x)}{h (x)} = \frac{f^{'} (x) \bar{F} (x) + f^{2} (x)}{{\bar{F}}^{2} (x)} \cdot \frac{\bar{F} (x)}{f (x)} = \frac{f^{'} (x)}{f (x)} + h (x) .

Thus, the statement of proposition immediately follows. □

Proposition 4.

The RV

X : Ω ⟶ (0, + \infty)

has UED

(α, β)

if and only if the HRF

h (x)

, defined by Equation (8), satisfies the following equation:

\frac{h^{'} (x)}{{(h (x))}^{2}} = \frac{x + β}{α β} {(\frac{1 - x}{1 + x})}^{β} .

(10)

Proof.

Necessity: Assume that

X \sim UED (α, β)

, with the PDF

f (x)

, defined by Equation (3). Then, logarithm of this PDF, at the same way as in SubSection 2.1.3, can be expressed as:

ln (f (x)) = ln (2 α β) + (β - 1) ln (1 + x) - (β + 1) ln (1 - x) + α (1 - {(\frac{1 + x}{1 - x})}^{β}) .

Differentiating both sides of this equality with respect to

x,

we get:

\begin{matrix} \frac{f^{'} (x)}{f (x)} & = \frac{β - 1}{1 + x} + \frac{β + 1}{1 - x} - \frac{2 α β}{{(1 - x)}^{2}} {(\frac{1 + x}{1 - x})}^{β - 1} = \frac{2}{1 - x^{2}} (x + β - α β {(\frac{1 + x}{1 - x})}^{β}) . \end{matrix}

(11)

Thus, according to Equations (8) and (9), it follows:

\frac{h^{'} (x)}{h (x)} = \frac{f^{'} (x)}{f (x)} + h (x) = \frac{2 (x + β)}{1 - x^{2}},

which after certain simplification yields Equation (10).

Sufficiency: Suppose that Equation (10) holds. After integration, it can be rewritten as follows:

\int \frac{h^{'} (x)}{{(h (x))}^{2}} d x = \int \frac{x + β}{α β} {(\frac{1 - x}{1 + x})}^{β} d x,

that is

- \frac{1}{h (x)} = \frac{x^{2} - 1}{2 α β} {(\frac{1 - x}{1 + x})}^{β} .

From the above equation, we obtain the HRF

h (x)

as in Equation (8). Further, replacing this function in Equation (9) and after integration, we obtain:

\begin{matrix} \int \frac{f^{'} (x)}{f (x)} d x & = 2 \int [\frac{x + β}{1 - x^{2}} - \frac{α β}{1 - x^{2}} {(\frac{1 + x}{1 - x})}^{β}] d x + C_{1} \\ = (β - 1) ln (1 + x) - (β + 1) ln (1 - x) - α {(\frac{1 + x}{1 - x})}^{β} + C_{1}, \end{matrix}

that is

f (x) = \frac{exp [C_{1} - α {(\frac{x + 1}{1 - x})}^{β}]}{1 - x^{2}} {(\frac{1 + x}{1 - x})}^{β} .

Another integration implies that:

F (x) = \int f (x) d x + C_{2} = - \frac{exp [C_{1} - α {(\frac{x + 1}{1 - x})}^{β}]}{2 α β} + C_{2},

whereby from the conditions

F (0) = 0

and

F (1) = 1

, the constants

C_{1} = α + ln (2 α β)

and

C_{2} = 1

are obtained. Thus, the function

F (x)

is indeed the CDF from

U E D (α, β)

, which completes the proof. □

The following theorem was used in [27], as well as [24,25], in order to characterize different univariate continuous distributions.

Theorem 1.

Let

(Ω; F; P)

be a given probability space and let

H = [a, b]

be an interval for some

a < b

, where

a = - \infty

and

b = + \infty

might as well be allowed. Also, let

X : Ω \to H

be a continuous RV with CDF

F (x)

, and

g (x)

and

t (x)

be two real functions defined on

H

and such that:

E [g (X) | X \geq x] = ξ (x) E [t (X) | X \geq x], x \in H

is defined with some real function

ξ (x)

. Assume that

g (x), t (x) \in C^{1} (H), ξ (x) \in C^{2} (H)

and

F (x)

is a twice continuously differentiable and strictly monotone function on the set

H

. Finally, assume that equation

t (x) ξ (x) = g (x)

has no real solution in the interior of

H

. Then,

F (x)

is uniquely determined by the functions

g (x), t (x)

and

ξ (x)

, as follows:

F (x) = C \int_{a}^{x} |\frac{ξ^{'} (u)}{ξ (u) t (u) - g (u)}| e^{- s (u)} d u,

(12)

where the function

s (x)

is a solution of the differential equation:

s^{'} (x) = \frac{ξ^{'} (x) t (x)}{ξ (x) t (x) - g (x)},

and C is a constant such that

\int_{H} d F (x) = 1 .

Now, we discuss the characterization of the UED based on Theorem 1 and a simple relationship between two functions of the RV

X \sim U E D (α, β)

.

Proposition 5.

Let

X : Ω \to [0, 1)

be a continuous RV and

\begin{matrix} t (x) & = 3 exp [2 α (1 - {(\frac{1 + x}{1 - x})}^{β})], x \in [0; 1) \\ g (x) & = 2 exp [α (1 - {(\frac{1 + x}{1 - x})}^{β})], x \in [0; 1) . \end{matrix}

The RV X has a PDF defined by Equation (3) if and only if there exists the function

ξ (x)

, defined as in Theorem 1, that satisfies the differential equation:

\frac{ξ^{'} (x)}{ξ (x) t (x) - g (x)} = \frac{2 α β}{1 - x^{2}} {(\frac{1 + x}{1 - x})}^{β} exp [- 2 α (1 - {(\frac{1 + x}{1 - x})}^{β})], 0 \leq x < 1 .

(13)

Proof.

Necessity: For the RV

X \sim U E D (α, β)

, with the CDF and PDF given by Equations (2) and (3), respectively, after certain computation, we obtain:

\begin{matrix} (1 - F (x)) E [t (X) | X \geq x] & = 3 \int_{x}^{1} \frac{2 α β}{1 - u^{2}} {(\frac{1 + u}{1 - u})}^{β} e^{3 α r (u; β)} d u \\ = exp [3 α (1 - {(\frac{1 + x}{1 - x})}^{β})], \\ (1 - F (x)) E [g (X) | X \geq x] & = 2 \int_{x}^{1} \frac{2 α β}{1 - u^{2}} {(\frac{1 + u}{1 - u})}^{β} e^{2 α r (u; β)} d u \\ = exp [2 α (1 - {(\frac{1 + x}{1 - x})}^{β})], \end{matrix}

where

0 < x < 1

and

r (x; β) : = 1 - {(\frac{1 + x}{1 - x})}^{β}

. This implies:

ξ (x) : = \frac{E (g (X) | X \geq x)}{E (t (X) | X \geq x)} = exp [- α (1 - {(\frac{1 + x}{1 - x})}^{β})], 0 < x < 1,

(14)

that is:

ξ (x) t (x) - g (x) = 3 e^{α r (x; β)} - 2 e^{α r (x; β)} = exp [α (1 - {(\frac{1 + x}{1 - x})}^{β})] > 0, 0 < x < 1 .

Hence, the differential Equation (13) clearly holds.

Sufficiency: If the function

ξ (x)

satisfies the differential Equation (13), then it follows:

s^{'} (x) = \frac{ξ^{'} (x) t (x)}{ξ (x) t (x) - g (x)} = \frac{6 α β}{1 - x^{2}} {(\frac{1 + x}{1 - x})}^{β}, 0 < x < 1,

so one can take:

s (x) = - 3 α (1 - {(\frac{1 + x}{1 - x})}^{β}) .

Using Equation (12), it is easy to obtain that RV X has a PDF given by Equation (3). □

According to the previous proposition, one immediately obtains:

Corollary 1.

Let

X : Ω \to [0, + \infty)

be a continuous RV and let the functions

t (x)

and

g (x)

be given as in Proposition 5. Then,

X \sim U E D (α, β)

, with the PDF as in Equation (3), if and only if the function

ξ (x)

has the form as in Equation (14).

3. Estimation Procedure

Let us assume that

x_{1}

, …,

x_{n}

are observed values of the sample of size n, taken from the

U E D (α, β)

. We propose the maximum likelihood method for estimating the couple of parameters

(α, β)

. This means that the estimates of those parameters are the ones that maximize the likelihood function:

L (α, β | x_{1}, \dots, x_{n}) = \prod_{i = 1}^{n} f (x_{i}) .

As is known, this solution also corresponds to the one that maximizes the log-likelihood function, i.e.,

l = l (α, β | x_{1}, \dots, x_{n}) = \sum_{i = 1}^{n} ln f (x_{i}) .

By differentiating function l with respect to each parameter, the estimators of

α

and

β

can be obtained by solving the coupled equations:

\begin{matrix} \frac{\partial l}{\partial α} & = & \frac{n}{α} + \sum_{i = 1}^{n} (1 - {(\frac{1 + x_{i}}{1 - x_{i}})}^{β}) = 0 \\ \frac{\partial l}{\partial β} & = & \frac{n}{β} + \sum_{i = 1}^{n} ln (\frac{1 + x_{i}}{1 - x_{i}}) - α \sum_{i = 1}^{n} {(\frac{1 + x_{i}}{1 - x_{i}})}^{β} ln (\frac{1 + x_{i}}{1 - x_{i}}) = 0 . \end{matrix}

From the first equation, we obtain:

α = {[\frac{1}{n} \sum_{i = 1}^{n} {(\frac{1 + x_{i}}{1 - x_{i}})}^{β} - 1]}^{- 1},

and by replacing this output in the second coupled equation, we get:

\frac{n}{β} + \sum_{i = 1}^{n} ln (\frac{1 + x_{i}}{1 - x_{i}}) + \frac{\sum_{i = 1}^{n} {(\frac{1 + x_{i}}{1 - x_{i}})}^{β} ln (\frac{1 + x_{i}}{1 - x_{i}})}{1 - \frac{1}{n} \sum_{i = 1}^{n} {(\frac{1 + x_{i}}{1 - x_{i}})}^{β}} = 0 .

Obviously, the last equation has only

β

as an unknown parameter. Now, by denoting

z_{i} = (1 + x_{i}) / (1 - x_{i}) > 1

, i = 1, …, n, and

L (β) = \frac{n}{β} + \sum_{i = 1}^{n} ln z_{i} + \frac{\sum_{i = 1}^{n} z_{i}^{β} ln z_{i}}{1 - \frac{1}{n} \sum_{i = 1}^{n} z_{i}^{β}},

by applying L’Hôpital’s rule twice, one obtains:

\begin{matrix} lim_{β \to 0^{+}} L (β) & = \sum_{i = 1}^{n} ln z_{i} + n lim_{β \to 0^{+}} \frac{\sum_{i = 1}^{n} (1 - z_{i}^{β} + β z_{i}^{β} ln z_{i})}{β \sum_{i = 1}^{n} (1 - z_{i}^{β})} \\ = \sum_{i = 1}^{n} ln z_{i} + n lim_{β \to 0^{+}} \frac{\sum_{i = 1}^{n} β z_{i}^{β} {ln}^{2} z_{i}}{\sum_{i = 1}^{n} (1 - z_{i}^{β} - β z_{i}^{β} ln z_{i})} \\ = \sum_{i = 1}^{n} ln z_{i} + n lim_{β \to 0^{+}} \frac{\sum_{i = 1}^{n} (z_{i}^{β} {ln}^{2} z_{i} + β z_{i}^{β} {ln}^{3} z_{i})}{\sum_{i = 1}^{n} (- 2 z_{i}^{β} ln z_{i} - β z_{i}^{β} {ln}^{2} z_{i})} \\ = \sum_{i = 1}^{n} ln z_{i} - \frac{n}{2} \cdot \frac{\sum_{i = 1}^{n} {ln}^{2} z_{i}}{\sum_{i = 1}^{n} ln z_{i}} . \end{matrix}

This limit is positive provided that

2 {(\sum_{i = 1}^{n} ln z_{i})}^{2} > n \sum_{i = 1}^{n} {ln}^{2} z_{i}

, which we assume in what follows.

On the other hand, assuming that

z_{1} > max {z_{2}, \dots, z_{n}}

, it follows:

\begin{matrix} lim_{β \to + \infty} L (β) & = & \sum_{i = 1}^{n} ln z_{i} + lim_{β \to + \infty} \frac{ln z_{1} + \sum_{i = 2}^{n} {(\frac{z_{i}}{z_{1}})}^{β} ln z_{i}}{z_{1}^{- β} - \frac{1}{n} - \frac{1}{n} \sum_{i = 2}^{n} {(\frac{z_{i}}{z_{1}})}^{β}} \\ = & \sum_{i = 1}^{n} ln z_{i} - n ln z_{1} < 0 . \end{matrix}

Hence, equation

L (β) = 0

has at least one solution, and it can be solved numerically, for instance by the Newton-Raphson algorithm or by a bracketing root finder such as the function 'uniroot' available in the statistical software R. Once

β

is estimated, this output can be used for estimating

α

.
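The estimation steps described above can be sketched as follows. This is only an illustration under stated assumptions: it replaces Newton-Raphson with plain bisection on the profile equation L(β) = 0, the function name `ued_mle` and the starting bracket are hypothetical choices, and the routine raises an error when no sign change of L(β) can be bracketed.

```python
import math

def ued_mle(data, max_iter=200):
    """ML estimates (alpha, beta) obtained from the profile equation L(beta) = 0."""
    logs = [math.log((1.0 + x) / (1.0 - x)) for x in data]  # ln z_i
    n = len(logs)
    s1 = sum(logs)

    def excess(beta):
        # sum over i of (z_i^beta - 1), computed stably with expm1
        return sum(math.expm1(beta * l) for l in logs)

    def profile(beta):
        # L(beta) from the text, written with the positive sum in the denominator
        num = sum(math.exp(beta * l) * l for l in logs)
        return n / beta + s1 - n * num / excess(beta)

    lo, hi = 1e-3, 1.0  # assumed starting bracket
    if profile(lo) <= 0.0:
        raise ValueError("L(beta) is not positive near 0; no root was bracketed")
    while profile(hi) > 0.0:  # expand until L(hi) < 0
        hi *= 2.0
        if hi * max(logs) > 700.0:
            raise ValueError("could not bracket a root of L(beta)")
    for _ in range(max_iter):  # plain bisection
        mid = 0.5 * (lo + hi)
        if profile(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    alpha = n / excess(beta)  # closed form for alpha given beta
    return alpha, beta
```

Bisection is slower than Newton-Raphson but needs no derivative of L(β) and cannot leave the bracket, which makes it a convenient cross-check.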

For computing interval estimators for

θ = {(α, β)}^{'}

and testing hypotheses on these parameters, we use the observed information matrix:

I (θ) = - (\begin{matrix} \frac{\partial^{2} l (θ)}{\partial α^{2}} & \frac{\partial^{2} l (θ)}{\partial α \partial β} \\ \frac{\partial^{2} l (θ)}{\partial β \partial α} & \frac{\partial^{2} l (θ)}{\partial β^{2}} \end{matrix}),

where

\begin{matrix} \frac{\partial^{2} l (θ)}{\partial α^{2}} & = & - \frac{n}{α^{2}} \\ \frac{\partial^{2} l (θ)}{\partial α \partial β} & = & \frac{\partial^{2} l (θ)}{\partial β \partial α} = - \sum_{i = 1}^{n} {(\frac{1 + x_{i}}{1 - x_{i}})}^{β} ln (\frac{1 + x_{i}}{1 - x_{i}}) \\ \frac{\partial^{2} l (θ)}{\partial β^{2}} & = & - \frac{n}{β^{2}} - α \sum_{i = 1}^{n} {(\frac{1 + x_{i}}{1 - x_{i}})}^{β} {ln}^{2} (\frac{1 + x_{i}}{1 - x_{i}}) . \end{matrix}

Note that

I (\hat{θ})

is a consistent estimator of the expected Fisher information matrix

E [I (θ)]

(see, e.g., [28]). Under some suitable conditions, the approximation to a normal distribution

\hat{θ} \approx N (θ, I {(\hat{θ})}^{- 1})

holds, and more generally

a^{'} \hat{θ} \approx N (a^{'} θ, a^{'} I {(\hat{θ})}^{- 1} a),

for any vector

a = {(a_{1}, a_{2})}^{'}

. Choosing

a = {(1, 0)}^{'}

or

a = {(0, 1)}^{'}

, we get the

100 \times (1 - δ) %

confidence interval for each parameter:

\hat{θ}_{i} \pm z_{δ / 2} \sqrt{{(I {(\hat{θ})}^{- 1})}_{i i}}, i = 1, 2,

where

0 < δ < 1

and

z_{δ / 2}

is the

1 - δ / 2

quantile of the standard normal distribution.
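The interval construction above can be sketched in a few lines. The function name `ued_wald_intervals` and the default normal quantile 1.959964 (a 95% level) are illustrative assumptions; the 2×2 observed information matrix is inverted in closed form using the second derivatives listed above.

```python
import math

def ued_wald_intervals(data, alpha, beta, z=1.959964):
    """Wald intervals for (alpha, beta) from the observed information matrix."""
    logs = [math.log((1.0 + x) / (1.0 - x)) for x in data]  # ln z_i
    n = len(logs)
    powers = [math.exp(beta * l) for l in logs]             # z_i^beta
    # entries of I(theta) = minus the Hessian of the log-likelihood
    i_aa = n / alpha ** 2
    i_ab = sum(p * l for p, l in zip(powers, logs))
    i_bb = n / beta ** 2 + alpha * sum(p * l * l for p, l in zip(powers, logs))
    det = i_aa * i_bb - i_ab ** 2
    if det <= 0.0:
        raise ValueError("observed information matrix is not positive definite")
    var_a, var_b = i_bb / det, i_aa / det                   # diagonal of the inverse
    half_a = z * math.sqrt(var_a)
    half_b = z * math.sqrt(var_b)
    return (alpha - half_a, alpha + half_a), (beta - half_b, beta + half_b)
```

The intervals are meaningful only when `alpha` and `beta` are the ML estimates; at other parameter values the matrix need not be positive definite.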

4. Model Compatibility and Its Application to the Real-World Data

Here, the applicability of the UED to modelling the empirical distributions of some real-world processes is discussed in more detail. To that end, the quality of the UED fit is assessed with several standard statistical indicators. The results are also compared with those obtained from some previously known unit interval probability distributions.

4.1. Measures of Goodness-of-Fit

In order to test the null hypothesis

H_{0} : F_{n} (x) = F_{0} (x)

, where

F_{n} (x)

is the empirical CDF and

F_{0} (x)

is the CDF of some specified (theoretical) distribution, some well-known statistical tests are commonly used. To test the hypothesis that some real-world data are drawn from the UED, or from one of the competing distributions, the following statistical tests are used here:

- Kolmogorov–Smirnov (KS) test, whose test statistic is defined as:

$KS = max_{1 \leq i \leq k} \{\frac{i}{k} - z_{i}, z_{i} - \frac{i - 1}{k}\},$

where k denotes the number of classes and $z_{i}$ are the values of the theoretical CDF.

- Anderson–Darling (AD $_{0}^{*}$) test, which attaches more weight to the tails of the distribution, and whose test statistic is:

$A_{0}^{*} = (\frac{2.25}{k^{2}} + \frac{0.75}{k} + 1) \{- k - \frac{1}{k} \sum_{i = 1}^{k} (2 i - 1) ln (z_{i} (1 - z_{k - i + 1}))\} .$

- Cramér–von Mises (CVM $_{0}^{*}$) test, a derived version of the KS test, with test statistic defined as:

$W_{0}^{*} = \sum_{i = 1}^{k} {(z_{i} - \frac{2 i - 1}{2 k})}^{2} + \frac{1}{12 k} .$
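The three statistics above can be computed directly from the fitted CDF values. In the sketch below the function name `gof_statistics` is hypothetical, and k is taken to be the number of supplied CDF values, sorted in increasing order, which is an assumption about how the formulas are applied.

```python
import math

def gof_statistics(cdf_values):
    """Return (KS, AD*, CVM*) for fitted CDF values lying strictly in (0, 1)."""
    z = sorted(cdf_values)
    k = len(z)
    ks = max(max(i / k - z[i - 1], z[i - 1] - (i - 1) / k)
             for i in range(1, k + 1))
    # z[k - i] is z_{k-i+1} in the 1-based notation of the text
    a2 = -k - sum((2 * i - 1) * math.log(z[i - 1] * (1.0 - z[k - i]))
                  for i in range(1, k + 1)) / k
    ad = (2.25 / k ** 2 + 0.75 / k + 1.0) * a2
    cvm = sum((z[i - 1] - (2 * i - 1) / (2 * k)) ** 2
              for i in range(1, k + 1)) + 1.0 / (12 * k)
    return ks, ad, cvm
```

In practice `cdf_values` would be the fitted CDF evaluated at each observation.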

Additionally, in order to check the quality of fitting certain real-world data using the UED, or some other distribution, the following indicators were used:

- Akaike information criterion (AIC), defined as

$AIC = 2 m - 2 ℓ (\hat{Θ}),$

where m denotes the number of parameters and $ℓ (\hat{Θ})$ is the maximized log-likelihood.

- Corrected Akaike information criterion (AICc), expressed as

$AICc = AIC + \frac{2 m (m + 1)}{n - m - 1},$

where n is the sample size.

- Bayesian information criterion (BIC), which is defined as

$BIC = m ln (n) - 2 ℓ (\hat{Θ}) .$

- Hannan-Quinn information criterion (HQIC), expressed as

$HQIC = - 2 ℓ (\hat{Θ}) + 2 m ln (ln (n)) .$

- Consistent Akaike information criterion (CAIC), given as

$CAIC = - 2 ℓ (\hat{Θ}) + m (ln (n) + 1) .$

- The Vuong test, which is also used for model selection purposes.

For comprehensive details about these measures readers are referred to Akaike [29], Hussain et al. [30], Murthy et al. [31], and Vuong [32], respectively.
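The information criteria are simple functions of the maximized log-likelihood, the number of parameters m and the sample size n. The sketch below uses a hypothetical helper `information_criteria`; the HQIC is computed with ln(ln(n)), which is the form that reproduces the values reported in Table 4.

```python
import math

def information_criteria(loglik, m, n):
    """Information criteria from the maximized log-likelihood, m parameters, n data."""
    aic = 2 * m - 2 * loglik
    return {
        "AIC": aic,
        "AICc": aic + 2 * m * (m + 1) / (n - m - 1),
        "BIC": m * math.log(n) - 2 * loglik,
        "HQIC": -2 * loglik + 2 * m * math.log(math.log(n)),
        "CAIC": -2 * loglik + m * (math.log(n) + 1),
    }
```

For example, `information_criteria(33.8617, 2, 15)` corresponds to the UED row of Table 4(a).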

4.2. Comparative Models

We also compare the proposed UED model with well-known unit interval models, defined by the following PDFs:

- the beta distribution (BD) [4]:

$\begin{matrix} f_{α, β}^{BD} (x) & = \frac{1}{B (α, β)} x^{β - 1} {(1 - x)}^{α - 1}, α, β > 0, 0 < x < 1, \end{matrix}$

- the Johnson $S_{B}$ distribution (JSBD) [6]:

$\begin{matrix} f_{α, β}^{JSBD} (x) & = \frac{β exp [- \frac{1}{2} {(α + β ln (\frac{x}{1 - x}))}^{2} - β x]}{\sqrt{2 π} x (1 - x)}, α, β > 0, 0 < x < 1, \end{matrix}$

- the Kumaraswamy distribution (KwD) [8]:

$\begin{matrix} f_{α, β}^{KwD} (x) & = α β x^{α - 1} {(1 - x^{α})}^{β - 1}, α, β > 0, 0 < x < 1, \end{matrix}$

- the unit Gompertz distribution (UGoMD) [15]:

$\begin{matrix} f_{α, β}^{UGoMD} (x) & = α β x^{- α - 1} e^{- β (x^{- α} - 1)}, α, β > 0, 0 < x < 1 . \end{matrix}$

In order to compare the fitting results, we consider four different real-world datasets, classified into two sections: i) environmental and ii) engineering. The results obtained from the statistical analysis of these datasets are discussed below.

4.3. Environmental Datasets

Datasets I and II. The first two datasets are reported by Maiti [33], and they represent the following measured values:

- Soil moisture (Dataset I): 0.0179, 0.0798, 0.0959, 0.0444, 0.0938, 0.0443, 0.0917, 0.0882, 0.0439, 0.049, 0.0774, 0.0171, 0.0305, 0.0757, 0.0468,
- Permanent wilting points-PWP (Dataset II): 0.0821, 0.0561, 0.0202, 0.051, 0.0041, 0.0226, 0.0556, 0.0829, 0.0062, 0.0695, 0.0557, 0.0243, 0.0083, 0.0532, 0.0118.

In this regard, we have compiled both descriptive and theoretical (UED) statistics, listed in Tables 1 and 2, respectively. Note that the descriptive statistics of all datasets include the sample size (SS), mean, median, standard deviation (SD), skewness (SK) and kurtosis (KU).

In addition, the total time on test (TTT) plot, introduced in [34], is shown in Figure 3 for both datasets. In particular, the TTT plot indicates that the empirical HRF is increasing (IFR). Tables 1 and 2 also reveal that the theoretical UED statistics and the observed descriptive statistics are remarkably close to each other, so it appears that both datasets can be modelled by the proposed distribution. Furthermore, it is evident from Figure 4 that neither dataset contains any outliers.

Table 1. Descriptive statistics for Datasets I and II.

Dataset	SS	Mean	Median	SD	SK	KU
I	15	0.0598	0.0490	0.0277	-0.1083	1.6247
II	15	0.0402	0.0510	0.0277	0.1083	1.6247

Table 2. Theoretical statistics from the UED.

Dataset	SS	Mean	Median	SD	SK	KU
I	15	0.0606	0.0621	0.0254	-0.2107	2.3825
II	15	0.0406	0.0384	0.0247	0.2942	2.3050

Table 3(a) shows that the UED is the best model for Dataset I among all the unit interval distributions considered. Namely, although the p-value of the KS statistic is highest for the KwD, the other non-parametric tests, CVM $_{0}^{*}$ and AD $_{0}^{*}$, attain their minimum values for the UED. Also, based on the Vuong statistics given in Table 5, the comparison between the KwD and the UED is indecisive, while the UED is preferred over the remaining models. Thus, the UED is the best choice overall, which is also confirmed by Figure 5. Similarly, Table 3(b) shows that the proposed UED is the best model for Dataset II in all respects. Namely, the test statistics KS, CVM $_{0}^{*}$ and AD $_{0}^{*}$ have the lowest values compared to all the selected, previously known unit interval models. In addition, the Vuong statistic, which compares models based on the likelihood ratio, clearly supports the UED. Finally, Figure 5 also confirms that the UED is the best model. Moreover, Tables 4(a) and 4(b) yield the lowest values of the information criteria for the UED compared with the competing models.

Table 3. (a): ML estimates and goodness-of-fit statistics for Dataset I, (b): MLEs and goodness-of-fit statistics for Dataset II.

(a)
Distribution	$\hat{β}$	$\hat{α}$	CVM $_{0}^{*}$	AD $_{0}^{*}$	KS	p-value
UED	18.4218	0.0773	0.6239	0.1026	0.2079	0.5361
BD	3.8233	60.2492	0.6858	0.1041	0.2099	0.5232
KwD	719.3842	2.4408	0.6887	0.1109	0.2003	0.5844
JSBD	4.9859	1.7279	0.7751	0.1117	0.2128	0.5056
UGoMD	1.6525	0.0048	1.0587	0.1613	0.2353	0.3769
(b)
Distribution	$\hat{β}$	$\hat{α}$	CVM $_{0}^{*}$	AD $_{0}^{*}$	KS	p-value
UED	11.8676	0.4607	0.6239	0.1096	0.1960	0.6118
BD	1.5370	36.8071	0.6869	0.1199	0.2481	0.3142
KwD	78.9162	1.4011	0.7074	0.1224	0.2409	0.3487
JSBD	3.5837	1.0177	0.8112	0.1364	0.2619	0.2549
UGoMD	0.9497	0.0219	0.9011	0.1499	0.2386	0.3603

Table 4. (a): Estimates of the maximum log-likelihood and information criteria for Dataset I, (b): Estimates of the maximum log-likelihood and information criteria for Dataset II.

(a)
Distribution	$l$	AIC	AICc	BIC	HQIC	CAIC
UED	33.8617	-63.7233	-62.7233	-62.3072	-63.7384	-60.3072
BD	32.8026	-61.6052	-60.6052	-60.1891	-61.6203	-58.1891
KwD	33.3796	-62.7592	-61.7592	-61.3431	-62.7743	-59.3431
JSBD	32.0631	-60.1262	-59.1262	-58.7101	-60.1413	-56.7101
UGoMD	29.6463	-55.2925	-54.2925	-53.8764	-55.3076	-51.8764
(b)
Distribution	$- l$	AIC	AICC	BIC	HQIC	CAIC
UED	35.2604	-66.5208	-65.5208	-65.1047	-66.5359	-63.1047
BD	34.1097	-64.2194	-63.2194	-62.8033	-64.2345	-60.8033
KwD	34.3392	-64.6784	-63.6784	-63.2623	-64.6935	-61.2623
JSBD	33.0448	-62.0896	-61.0896	-60.6735	-62.1047	-58.6735
UGoMD	31.1648	-58.3296	-57.3296	-56.9135	-58.3447	-54.9135

Table 5. Vuong test statistics for Datasets I and II.

Models	Dataset I	Suitability	Dataset II	Suitability
UED-BD	1.4601	UED	2.5935	UED
UED-KwD	0.9738	Indecisive	3.4585	UED
UED-JSBD	1.5427	UED	1.6793	UED
UED-UGoMD	2.2142	UED	1.5955	UED
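
To make the comparison in Table 5 more concrete, the following is a minimal sketch of the standard Vuong statistic for two non-nested models, computed from pointwise log-densities. The function names `vuong_statistic` and `vuong_decision`, the sample log-densities, and the critical value `c` are all hypothetical; in particular, the threshold behind the "Suitability" column is not restated here, so `c` is left as an explicit argument.

```python
import math


def vuong_statistic(logf, logg):
    """Vuong statistic for model f versus model g.

    logf, logg: pointwise log-densities of the two fitted models,
    evaluated at the same observations.
    """
    if len(logf) != len(logg):
        raise ValueError("both models need the same number of observations")
    n = len(logf)
    # Pointwise log-likelihood ratios
    m = [a - b for a, b in zip(logf, logg)]
    mean_m = sum(m) / n
    # Variance of the ratios (divisor n)
    var_m = sum((x - mean_m) ** 2 for x in m) / n
    if var_m == 0:
        raise ValueError("log-likelihood ratios have zero variance")
    return math.sqrt(n) * mean_m / math.sqrt(var_m)


def vuong_decision(z, c, first="model 1", second="model 2"):
    """Label the outcome for a user-chosen critical value c (an assumption)."""
    if z > c:
        return first
    if z < -c:
        return second
    return "Indecisive"


# Hypothetical log-densities, for illustration only (not the paper's data)
logf = [1.2, 0.8, 1.5, 0.9]
logg = [1.0, 0.9, 1.1, 0.7]
z = vuong_statistic(logf, logg)
# c = 1.96 is only an illustrative normal critical value
print(round(z, 4), vuong_decision(z, c=1.96, first="UED", second="BD"))
```

A positive statistic points towards the first model and a negative one towards the second, which is how the signs in Table 5 are read.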

4.4. Engineering Datasets

Datasets III and IV. The third and fourth datasets were first introduced and studied in [35] and consist of burr measurements on iron sheets. For the third dataset of 50 observations on burr (in millimetres), the hole diameter is 12 mm and the sheet thickness is 3.15 mm. For the fourth dataset of 50 observations, the hole diameter and sheet thickness are 9 mm and 2 mm, respectively. Hole diameter readings are taken on jobs with respect to one hole, selected and fixed as per a predetermined orientation. These two datasets refer to two different machines being compared; see [35] for the technical details of how the data were measured. Note that both datasets were also analyzed in [36,37,38] and [19]. The descriptive statistics of these datasets, as well as the corresponding theoretical statistics for the UED, are presented in Tables 6 and 7, respectively. The TTT plots and box-plots of the observed data are given in Figure 6 and Figure 7, respectively. It can be observed that Datasets III and IV are positively skewed and platykurtic, which is confirmed by Tables 6 and 7. In addition, Figure 7 shows that the empirical and theoretical aspects of these datasets, in terms of the absence of outliers, are in close agreement, which indicates that the proposed model can be used effectively. These findings are reinforced by Tables 8(a) and 8(b), which show that the UED attains the minimum values for almost all of the goodness-of-fit statistics, making the UED one of the best choices.

Table 6. Descriptive statistics for Datasets III and IV.

Dataset	SS	Mean	Median	SD	SK	KU
III	50	0.1632	0.1600	0.0810	0.0723	2.2166
IV	50	0.1520	0.1600	0.0785	0.0061	2.3012

Table 7. Theoretical statistics from the UED.

Dataset	SS	Mean	Median	SD	SK	KU
III	50	0.1633	0.1641	0.0809	0.0259	2.2511
IV	50	0.1519	0.1521	0.0777	0.0262	2.2521

Table 8. (a): MLEs and goodness-of-fit statistics for Dataset III, (b): MLEs and goodness-of-fit statistics for Dataset IV.

(a)
Distribution	$\hat{β}$	$\hat{α}$	CVM $_{0}^{*}$	AD $_{0}^{*}$	KS	p-value
UED	4.7879	0.1756	0.3274	0.0419	0.1242	0.9881
BD	2.6824	13.8640	0.1538	0.9120	0.1414	0.5555
KwD	1.0746	0.0925	12.2879	2.3943	0.7222	0.0000
JSBD	2.3767	1.3175	0.2495	1.4647	0.1740	0.0968
UGoMD	0.0924	1.0747	0.5213	3.0810	0.2046	0.0304
(b)
Distribution	$\hat{β}$	$\hat{α}$	CVM $_{0}^{*}$	AD $_{0}^{*}$	KS	p-value
UED	4.8518	0.1996	0.3224	0.0339	0.1239	0.9928
BD	2.4003	13.5218	0.2871	1.5649	0.1981	0.7340
KwD	1.9606	31.3769	0.2093	1.2683	0.1691	0.8825
JSBD	2.3682	1.2374	0.4145	2.2458	0.2285	0.5579
UGoMD	0.0916	1.0250	0.6091	3.4278	0.2312	0.5426

Likelihood and information criterion values also favour the proposed UED model, as can be seen in Tables 9(a) and 9(b), respectively. Furthermore, the shape of the proposed model, as shown in Figure 8, matches the data better than the competing models. Finally, the Vuong statistics in Table 10 support the proposed model in most comparisons: for Dataset IV the UED is preferred over every competitor, whereas for Dataset III the comparison with the BD is indecisive and the comparison with the KwD favours the KwD.

Table 9. (a): Estimates of the maximum log-likelihood and information criteria for Dataset III, (b): Estimates of the maximum log-likelihood and information criteria for Dataset IV.

(a)
Distribution	$- l$	AIC	AICC	BIC	HQIC	CAIC
UED	-57.0712	-110.142	-109.887	-106.318	-108.686	-104.318
BD	-54.6066	-105.213	-104.958	-101.389	-103.757	-99.3892
KwD	-56.0686	-108.137	-107.882	-104.313	-106.681	-102.313
JSBD	-51.3231	-98.6462	-98.3909	-94.8222	-97.19	-92.8222
UGoMD	-40.672	-77.344	-77.0887	-73.52	-75.8878	-71.52
(b)
Distribution	$- l$	AIC	AICC	BIC	HQIC	CAIC
UED	-59.3536	-114.707	-114.452	-110.883	-113.251	-108.883
BD	-55.9312	-107.862	-107.607	-104.038	-106.406	-102.038
KwD	-57.5214	-111.043	-110.788	-107.219	-109.587	-105.219
JSBD	-52.305	-100.61	-100.355	-96.786	-99.1538	-94.786
UGoMD	-42.6099	-81.2198	-80.9645	-77.3957	-79.7636	-75.3957
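
As a rough check on Table 9, the sketch below recomputes the information criteria from a maximized log-likelihood using the standard textbook definitions, which are assumed here to coincide with those used in Section 4.1. The function name `information_criteria` is illustrative. The values k = 2 (parameters) and n = 50 (observations) are taken from the text, and the log-likelihood 57.0712 is the UED entry of Table 9(a) read as a positive value, which is an assumption about the sign convention of the first column.

```python
import math


def information_criteria(loglik, k, n):
    """Return standard information criteria for a fitted model.

    loglik: maximized log-likelihood
    k: number of estimated parameters
    n: sample size
    """
    aic = 2 * k - 2 * loglik
    aicc = aic + 2 * k * (k + 1) / (n - k - 1)
    bic = k * math.log(n) - 2 * loglik
    hqic = 2 * k * math.log(math.log(n)) - 2 * loglik
    caic = k * (math.log(n) + 1) - 2 * loglik
    return {"AIC": aic, "AICC": aicc, "BIC": bic, "HQIC": hqic, "CAIC": caic}


# UED on Dataset III: log-likelihood sign is an assumption (see lead-in)
ued_iii = information_criteria(loglik=57.0712, k=2, n=50)
for name, value in ued_iii.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions, the printed values agree with the UED row of Table 9(a) to the reported precision; smaller values indicate a better trade-off between fit and model complexity.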

Table 10. Vuong test statistics for Datasets III and IV.

Models	Dataset III	Suitability	Dataset IV	Suitability
UED-BD	0.4137	Indecisive	3.5339	UED
UED-KwD	-2.3203	KwD	3.9633	UED
UED-JSBD	2.1336	UED	3.4202	UED
UED-UGoMD	4.9679	UED	4.0306	UED

5. Concluding Remarks

We introduced a two-parameter bounded model, called the unit exponential distribution (UED), which is appropriate for modelling skewed and IFR data. Some of its mathematical properties were studied, including moments, quantiles, and other distributional behaviour. A characterization of the UED via the HRF was given, which provides the identification requirements of the distribution and thus a more reliable prediction compared to the well-known unit-domain models. The model parameters were estimated by the maximum likelihood method. We also provided a guideline for choosing the best model by using various goodness-of-fit statistics. Applications of the newly defined distribution show that the proposed model has better modelling abilities than the competing models. For this purpose, we used four datasets from two different disciplines, namely environmental and engineering, and found that the proposed model is the best one on the unit interval in most cases.

Author Contributions

Conceptualization, H.B. and T.H.; methodology, H.B., M.T. and N.Q.; software, H.B. and T.H; validation, H.B., M.T. and V.S.; formal analysis, H.B., T.H. and V.S.; data curation, H.B. and T.H.; writing—original draft preparation, H.B, T.H. and V.S.; writing—review and editing, M.T., V.S. and N.Q.; visualization, T.H. and M.T.; supervision, H.B and V.S.; project administration, M.T. and N.Q. All authors have read and agreed to the published version of the manuscript.

Acknowledgments

The authors gratefully acknowledge Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2023R376), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia for the financial support for this project.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Fleiss, J.L.; Levin, B.; Paik, M.C. Statistical Methods for Rates and Proportions, 3rd ed.; John Wiley & Sons Inc., 1993.
2. Gilchrist, W. Statistical Modelling with Quantile Functions. CRC Press, Abingdon, 2000.
3. Seber, G. A. F. Statistical Models for Proportions and Probabilities. Springer, 2013.
4. Bayes, T. An Essay Towards Solving a Problem in the Doctrine of Chances. By the late Rev. Mr. Bayes, F. R. S. communicated by Mr. Price, in a letter to John Canton, A. M. F. R. S. Phil. Trans. R. Soc. 1763, 53, 370–418.
5. Leipnik, R. B. Distribution of the Serial Correlation Coefficient in a Circularly Correlated Universe. Ann. Math. Stat. 1947, 18(1), 80–87.
6. Johnson, N. Systems of Frequency Curves Derived From the First Law of Laplace. Trabajos de Estadistica 1955, 5, 283–291.
7. Jørgensen, B. Proper Dispersion Models. Braz. J. Probab. & Stat. 1997, 11, 89–128.
8. Kumaraswamy, P. A Generalized Probability Density Function for Double-Bounded Random Processes. J. Hydrol. 1980, 46, 79–88.
9. Topp, C. W.; Leone, F. C. A Family of J-Shaped Frequency Functions. J. Amer. Statistical Assoc. 1955, 50(269), 209–219.
10. Consul, P. C.; Jain, G. C. On the Log-Gamma Distribution and Its Properties. Statistische Hefte 1971, 12, 100–106.
11. Smithson, M.; Shou, Y. CDF-Quantile Distributions for Modelling RVs on the Unit Interval. British J. Math. Stat. Psych. 2017, 70(3), 412–438.
12. Nakamura, L. R.; Cerqueira, P. H. R.; Ramires, T. G.; Pescim, R. R.; Rigby, R. A.; Stasinopoulos, D. M. A New Continuous Distribution on the Unit Interval Applied to Modelling the Points Ratio of Football Teams. J. Appl. Stat. 2019, 46, 416–431.
13. Ghitany, M. E.; Mazucheli, J.; Menezes, A. F. B.; Alqallaf, F. The Unit-Inverse Gaussian Distribution: A New Alternative to Two-Parameter Distributions on the Unit Interval. Commun. Stat. - Theory Methods 2019, 48, 3423–3438.
14. Altun, E.; Hamedani, G. The Log-Xgamma Distribution With Inference and Application. Journal de la Société Française de Statistique 2018, 159, 40–55.
15. Mazucheli, J.; Menezes, A. F.; Dey, S. Unit-Gompertz Distribution with Applications. Statistica 2019, 79, 25–43.
16. Mazucheli, J.; Menezes, A. F. B.; Chakraborty, S. On the One Parameter Unit-Lindley Distribution and Its Associated Regression Model for Proportion Data. J. Appl. Stat. 2019, 46(4), 700–714.
17. Mazucheli, J.; Menezes, A. F. B.; Fernandes, L. B.; de Oliveira, R. P.; Ghitany, M. E. The Unit-Weibull Distribution as an Alternative to the Kumaraswamy Distribution for the Modeling of Quantiles Conditional on Covariates. J. Appl. Stat. 2020, 47(6), 954–974.
18. Altun, E. The Log-Weighted Exponential Regression Model: Alternative to the Beta Regression Model. Commun. Stat. - Theory and Methods 2020, 50, 2306–2321.
19. Gündüz, S.; Mustafa, Ç.; Korkmaz, M.C. A New Unit Distribution Based on the Unbounded Johnson Distribution Rule: The Unit Johnson SU Distribution. Pak. J. Stat. Oper. Res. 2020, 16(3), 471–490.
20. Korkmaz, M. Ç.; Korkmaz, Z. S. The Unit Log–log Distribution: A New Unit Distribution With Alternative Quantile Regression Modeling and Educational Measurements Applications. J. Appl. Stat. 2023, 50(4), 889–908.
21. Afify, A.Z.; Nassar, M.; Kumar, D.; Cordeiro, G.M. A New Unit Distribution: Properties and Applications. Electron. J. Appl. Stat. 2022, 15, 460–484.
22. Dombi, J.; Jónás, T.; Tóth Z., E. The Epsilon Probability Distribution and its Application in Reliability Theory. Acta Polytech. Hung. 2018, 15, 197–216.
23. Artzner, P.; Delbaen, F.; Eber, J.-M.; Heath, D. Coherent Measures of Risk. Math. Finan. 1999, 9, 203–228.
24. Ahsanullah, M.; Shakil, M.; Kibria, B.M.G. Characterizations of Continuous Distributions by Truncated Moment. J. Modern Appl. Statist. Methods 2016, 15, 316–331.
25. Ahsanullah, M.; Ghitany, M. E.; Al-Mutairi, D. K. Characterization of Lindley Distribution by Truncated Moments. Commun. Stat. - Theory and Methods 2017, 46, 6222–6227.
26. Hamedani, G.G. Characterizations of Univariate Continuous Distributions Based on Truncated Moments of Functions of Order Statistics. Studia Scientiarum Mathematicarum Hungarica 2010, 47, 462–468.
27. Glänzel, W. A Characterization Theorem Based on Truncated Moments and Its Application to Some Distribution Families. In: P. Bauer, F. Konecny, W. Wertz (Eds.), Mathematical Statistics and Probability Vol. B, D. Reidel Publishing Company, Dordrecht-Holland, 75–84, 1987.
28. Lindsay, B.G.; Li, B. On Second-Order Optimality of the Observed Fisher Information. Ann. Stat. 1997, 25, 2172–2199.
29. Akaike, H. A New Look at the Statistical Model Identification. IEEE Transactions on Automatic Control 1974, 19, 716–723.
30. Hussain, T.; Bakouch, H.S.; Chesneau, C. A New Probability Model With Application to Heavy-Tailed Hydrological Data. Environ. Ecol. Stat. 2019, 26, 127–151.
31. Mustafa, Ç.; Korkmaz, Z.; Korkmaz, S. The Unit Log–log Distribution: A New Unit Distribution With Alternative Quantile Regression Modeling and Educational Measurements Applications. J. Appl. Stat. 2023, 50, 889–908.
32. Vuong, Q. H. Likelihood Ratio Tests for Model Selection and Non-Nested Hypotheses. Econometrica 1989, 57, 307–333.
33. Maity, R. Statistical Methods in Hydrology and Hydroclimatology. Springer Nature Singapore Pte Ltd., Singapore, 2018.
34. Aarset M., V. How to Identify a Bathtub Hazard Rate. IEEE Transactions on Reliability 1987, 36, 106–108.
35. Dasgupta, R. On the Distribution of Burr With Applications. Sankhya B 2011, 73, 1–19.
36. Dey, S.; Mazucheli, J.; Anis, M. Estimation of Reliability of Multicomponent Stress–strength for a Kumaraswamy Distribution. Commun. Stat. - Theory and Methods 2017, 46, 1560–1572.
37. Dey, S.; Mazucheli, J.; Nadarajah, S. Kumaraswamy Distribution: Different Methods of Estimation. Comput. Appl. Math. 2018, 37(2), 2094–2111.
38. ZeinEldin, R. A.; Chesneau, C.; Jamal, F.; Elgarhy, M. Different Estimation Methods for Type I Half-Logistic Topp–Leone Distribution. Mathematics 2019, 7, 985.

Figure 1. Plots of the PDFs of the UED by varying parameters.

Figure 2. Plots of the HRFs of the UED by varying parameters.

Figure 3. TTT plots of Datasets I and II.

Figure 4. Box-plots for Datasets I and II.

Figure 5. Datasets I and II (given by histograms) fitted via unit interval distributions (given by lines).

Figure 6. TTT plots of Datasets III and IV.

Figure 7. Box-plots for Datasets III and IV.

Figure 8. Datasets III and IV (given by histograms) fitted via unit interval distributions (given by lines).

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
