2.2. System of heterogeneous parallel servers
Next, we consider a multiclass or heterogeneous server system with Poisson arrivals at rate λ. The system has 3 parallel servers with service rates μ1, μ2, μ3, also Poisson (exponential service times). Parts or customers are served by ordered entry. The state diagram of the system, Model A, is included in Figure 1. Each server can hold only 1 unit, and the 3-tuple indicates which servers are in service (1) or empty (0), read from right to left, like a binary number. Thus, (110) denotes that server 1 is empty (0) and servers 2 and 3 are in service. The state (111) is the state of the full system.
The stationary regime of the Markov chain can be obtained from its state equations, written in the form of the transition matrix (3), by solving the resulting homogeneous linear system of equations. A common alternative is to replace the last equation with the normalization condition that the probabilities sum to one, (3).
Model results and system simulations are included in
Table 1, where
R*=R/Σμ. In the case
μ1=μ2=μ3, it can be directly compared to the Erlang first model for
k=3 by (2). The throughput is calculated for Model A by applying the PASTA property, so
λeff =λ(1-p111)= λ(1-pf), where
pf is the probability of the maximum size or full system.
There are significant differences between the simulated throughput and the throughput calculated by Model A, which mostly overestimates the throughput when arrival rates are high (saturation). Note that in this conventional transition diagram, which enumerates the states of the servers, the probability of the full system with all servers busy estimates the probability of loss. The transition to the full system (111) happens at rate λ (birth) from size 2 of the system, that is, from states (110), (011) or (101), and the system leaves (death) the full state from size 3 to 2 at rate Σμ.
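As a minimal sketch of how a server-state model of this kind can be solved numerically, the following Python code builds the generator matrix for ordered entry and solves the balance equations, replacing the last equation with the normalization condition. The function names, the tuple ordering (index 0 is server 1, the reverse of the right-to-left notation used above) and the example rates are illustrative assumptions. The transition structure follows the description in the text, not Figure 1 itself, so this is not necessarily the authors' implementation.

```python
from itertools import product


def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def model_a(lam, mu):
    """Stationary probabilities of the server-state chain (ordered entry, no queue).

    States are tuples s with s[i] == 1 when server i+1 is busy (assumed ordering).
    """
    k = len(mu)
    states = list(product((0, 1), repeat=k))
    index = {s: j for j, s in enumerate(states)}
    n = len(states)
    q = [[0.0] * n for _ in range(n)]  # q[i][j] = transition rate from state i to j
    for s in states:
        i = index[s]
        if 0 in s:  # an arrival takes the lowest-numbered idle server
            t = list(s)
            t[s.index(0)] = 1
            q[i][index[tuple(t)]] += lam
        for srv, busy in enumerate(s):  # each busy server finishes at its own rate
            if busy:
                t = list(s)
                t[srv] = 0
                q[i][index[tuple(t)]] += mu[srv]
        q[i][i] = -sum(q[i])  # diagonal is still 0 here, so this is minus the row sum
    # Balance equations pi Q = 0 written as Q^T pi = 0; the last one is replaced
    # by the normalization condition sum(pi) = 1.
    a = [[q[r][c] for r in range(n)] for c in range(n)]
    a[-1] = [1.0] * n
    b = [0.0] * (n - 1) + [1.0]
    return dict(zip(states, solve_linear(a, b)))


lam, mu = 3.0, (1.0, 1.0, 1.0)
pi = model_a(lam, mu)
p_full = pi[(1, 1, 1)]
throughput = lam * (1.0 - p_full)  # lambda_eff = lambda (1 - p111)
print(p_full, throughput / sum(mu))
```

With unequal rates in `mu`, the same call gives the heterogeneous Model A estimate that the text compares against simulation.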
We also consider the homogeneous system M/M/m/m with m=3, for which the Erlang-B formula (2) applies to the calculation of the effective throughput λeff=λ(1-p111)= λ(1-B(3,r)), see Table 1. For the Erlang model, the results agree completely with those from the simulation. The set of identical servers of M/M/m/m therefore behaves the same under ordered entry: since the servers are indistinguishable, the order becomes irrelevant, and there is no difference between serving by ordered entry, random entry, or another service discipline [31].
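For reference, the Erlang-B blocking probability can be evaluated with the standard numerically stable recursion. The sketch below assumes that formula (2) is the usual Erlang-B expression B(m, r) with offered load r = λ/μ; the function name is illustrative.

```python
def erlang_b(servers, offered_load):
    """Erlang-B blocking probability B(servers, offered_load), offered_load = lambda/mu.

    Uses the recursion B(0) = 1, B(n) = r B(n-1) / (n + r B(n-1)).
    """
    b = 1.0
    for n in range(1, servers + 1):
        b = offered_load * b / (n + offered_load * b)
    return b


lam, mu = 3.0, 1.0
blocking = erlang_b(3, lam / mu)
throughput = lam * (1.0 - blocking)  # lambda_eff = lambda (1 - B(3, r))
print(blocking, throughput)
```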
Next, we consider a new formulation of the transition diagram, Model B. Unlike Model A, when the system is full the lost offered load is redirected to a virtual server with a queue of infinite capacity, herein called the loss server, which arrivals enter when the real servers of the system are occupied, Figure 2. When arrivals are lost, the busy system enters the state >(111)=b.
The transition matrix is given by (4), and the mean number of customers L in the system or the utilization U is calculated by (5). Note that when the system falls into the loss server, state >(111)=b, the 3-server subsystem is occupied, and this is taken into account in the calculation of L in the servers by (5). In the transition diagram, the transition (birth) from the busy system (111) into the loss state happens at rate λ, so an arrival goes into the loss server when the servers are busy, and the system leaves this loss state at rate Σμ, from size 3 to 2, as soon as any server becomes available to serve, as represented.
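The transitions described above can be sketched in Python by adding one extra state, labelled `"b"`, to the server-state chain. This is a hedged illustration and not the authors' matrix (4). In particular, it assumes that the loss state returns to each size-2 state at the rate of the server that becomes free (total rate Σμ), and that tuple index 0 is server 1. All names are hypothetical.

```python
from itertools import product


def stationary(q):
    """Solve pi Q = 0 with sum(pi) = 1 (last balance equation replaced)."""
    n = len(q)
    m = [[q[r][c] for r in range(n)] + [0.0] for c in range(n)]  # rows of Q^T | 0
    m[-1] = [1.0] * n + [1.0]                                    # normalization row
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def model_b(lam, mu):
    """Server states plus a loss state 'b' entered from the full state at rate lam."""
    k = len(mu)
    servers = list(product((0, 1), repeat=k))
    states = servers + ["b"]
    idx = {s: j for j, s in enumerate(states)}
    full = (1,) * k
    n = len(states)
    q = [[0.0] * n for _ in range(n)]
    for s in servers:
        i = idx[s]
        if s == full:
            q[i][idx["b"]] += lam            # overflow goes to the loss server
        else:
            t = list(s)
            t[s.index(0)] = 1                # ordered entry: lowest-numbered idle server
            q[i][idx[tuple(t)]] += lam
        for srv in range(k):
            if s[srv]:
                t = list(s)
                t[srv] = 0
                q[i][idx[tuple(t)]] += mu[srv]
    for srv in range(k):                     # ASSUMPTION: b -> size k-1, split by server
        t = list(full)
        t[srv] = 0
        q[idx["b"]][idx[tuple(t)]] += mu[srv]
    for i in range(n):
        q[i][i] = -sum(q[i])                 # diagonal is 0 before this assignment
    return dict(zip(states, stationary(q)))


lam, mu = 3.0, (1.0, 1.0, 1.0)
pi = model_b(lam, mu)
p_f, p_b = pi[(1, 1, 1)], pi["b"]
print(p_f, p_b, lam * (1.0 - p_f - p_b) / sum(mu))  # dimensionless throughput
```

For identical servers, the sum `p_f + p_b` from this sketch coincides with the Erlang-B blocking probability, which gives a quick sanity check of the construction.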
The simulation results in Table 1 have been obtained with Arena software by Rockwell Inc., with λ=μi=1 and a simulation run of 30 replications of 1000 hours each, after a 1000-hour warm-up to reach the stationary regime, where throughput is expressed in dimensionless form by dividing by Σμi. The effective throughput rate is calculated by λeff=λ(1-pbusy), where pbusy is the percentage of time the 3-server system is busy. In the Erlang model of identical servers, pbusy =B(3,r) by (2); for Model A with 3 servers, pbusy =p111 = pf from (3); and in the case of Model B, with 3 servers and the loss server, pbusy =p111+p>111= pf+pb from (4). Model A is a representation of a loss system that evaluates throughput from the offered load λ. The direct calculation based on the PASTA property with the effective offered load λeff=λ(1- pf) is only a rough approximation for heterogeneous parallel servers, as the experimental results in Table 1 show for Model A.
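The simulated throughput can be approximated outside Arena with a short event-driven sketch. The code below is a hypothetical stand-in, not the Arena model used for Table 1: it assumes Poisson arrivals, exponential service times, ordered entry and no waiting room, and the function name, the seed and the run length are illustrative choices.

```python
import random


def simulate_ordered_entry(lam, mu, horizon, warmup=0.0, seed=None):
    """Estimate throughput and loss fraction of an ordered-entry loss system.

    Returns (accepted customers per unit time, fraction of arrivals lost),
    both measured over the interval [warmup, warmup + horizon].
    """
    rng = random.Random(seed)
    free_at = [0.0] * len(mu)   # time at which each server becomes idle again
    t = 0.0
    arrivals = served = 0
    while True:
        t += rng.expovariate(lam)            # next Poisson arrival
        if t > warmup + horizon:
            break
        counted = t >= warmup                # discard the warm-up period
        if counted:
            arrivals += 1
        for srv, rate in enumerate(mu):      # try servers in index order
            if free_at[srv] <= t:
                free_at[srv] = t + rng.expovariate(rate)
                if counted:
                    served += 1
                break                        # accepted; otherwise the arrival is lost
    loss = 1.0 - served / arrivals if arrivals else 0.0
    return served / horizon, loss


lam, mu = 3.0, (1.0, 1.0, 1.0)
throughput, loss = simulate_ordered_entry(lam, mu, horizon=20000.0, warmup=1000.0, seed=1)
print(throughput / sum(mu), loss)            # dimensionless throughput R* and loss fraction
```

Because no queue exists, tracking the time at which each server becomes free is enough; independent replications can be obtained by calling the function with different seeds.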
The results indicate that the description of the system states and their probabilities should take into account not only the states of the servers but also the allocation of all the arrivals and the different independent states, so that the state probabilities form a complete set that sums to one. In addition to the servers' busy state (111), the lost customers enter the loss server. In the transition diagram of Model B, >(111)=b is a state of full size for the servers. The system transits into the state >(111) with rate λ when the arrivals overflow the servers; the net input into the servers is λeff, while the net input through the loss server is λ-λeff. Based on the PASTA property, the split of the Poisson rate for all the servers, including the loss server, remains Poisson [1]. Based on [36], the net output rate through the loss server, which includes a queue of infinite capacity, is also λ-λeff, provided that the service rate of the loss server is greater than λ-λeff. This loss server has infinite capacity, so we can assign it an arbitrarily high service rate, greater than (λ-λeff), that satisfies this condition. Finally, the total output of the whole system is λ, the sum of the outputs of the servers and the loss server together.
The system transits into the loss server, with probability pb, at rate λ after the system is full (111), and it leaves that state at a rate equal to the sum of the service rates of the 3 servers; as soon as any server releases a customer after service, the number of customers in the servers becomes m-1, as represented in the transition state diagram. Note that by including the loss server in Model B, the system is no longer a loss system, so the calculation of the states associated with the offered rate is accurate. The probability of state >(111)=b is the percentage of time during which arrivals fall into the loss server, which always happens after the servers are occupied. In Model B the PASTA property is applied to the 3 real servers defined in the transition diagram: the offered rate to the servers is λ(1-p>111 )= λ(1-pb ) before the application of the PASTA property. After applying the PASTA property to the input of the three servers, the effective rate through the servers becomes λeff=λ(1–p>111–p111)= λ(1–pb –pf), by discounting the percentage of time the servers are occupied, p111. The net arrival rate into the loss server is finally λ( pb+pf), which is an accurate account of the percentage of time that the total arrival rate λ (offered load) does not enter the servers, as verified experimentally by the accurate results of Table 1. The need to consider every single arrival in the probabilities arises from the multi-dimensional queue nature of ordered entry in heterogeneous systems, whereas Erlang systems of indistinguishable homogeneous servers do not require it.
In the timeline of the arrival rates, there are intervals of time when the servers are occupied and no arrivals try to enter the system, with probability pf, and others when arrivals try to enter and are diverted to the loss server, with probability pb. In both kinds of interval, no customers can enter the server system, so both are discounted to get the effective arrival rate into the servers. In Model B, every arrival is included in the set of states for accurate accounting, so the system evaluates the destination of every customer. In general, an ordinary transition matrix that only enumerates the states of the servers, Model A, overestimates throughput (underestimates the probability of occupied servers) for high offered load when the PASTA property is applied, see Table 1. In summary, including the time percentage of the loss state (loss server) provides an accurate throughput calculation for systems of finite capacity, Model B.
The evolution of the probabilities or percentages of time for Model B of full servers pf, loss from blocking when servers are busy pb, and occupied system pocc= pb+pf is represented in Figure 3. In addition, the probability of an empty system p0 is represented in the results. While λ/Σμ < 1, the probability of blocking pb is lower than the probability of full servers pf. When λ/Σμ≈1 both are equally probable. It is remarkable that the probability of full servers pf continues to grow with increasing arrival rate until it reaches a maximum at about λ/Σμ≈1.75. Beyond that maximum, any extra offered load goes directly to the loss server, and pf decreases while pb grows continuously. The percentage of time the servers are occupied with no customers trying to enter the servers, pf, decreases monotonically after that maximum, so for an infinite arrival rate it is inferred that there is no chance of a full system without loss, so pf → 0 while the probability of loss from blocking pb continues to grow. Thus, when an extremely high load is offered, λ→∞, loss from blocking with probability pb would become the only probable situation.
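The relation between pb and pf described above can be made explicit with a short balance argument. It is a sketch that assumes, as stated in the text, that the loss state b is entered only from (111) at rate λ and is left at total rate Σμ; the symbol ρ = λ/Σμ is introduced here only for convenience.

```latex
\begin{align}
\lambda\, p_f &= \Big(\sum_i \mu_i\Big)\, p_b
  \quad\Longrightarrow\quad
  p_b = \rho\, p_f, \qquad \rho = \frac{\lambda}{\sum_i \mu_i}, \\
p_f &= \frac{p_{occ}}{1+\rho}, \qquad
p_b = \frac{\rho\, p_{occ}}{1+\rho}, \qquad
p_{occ} = p_f + p_b \le 1 .
\end{align}
```

Under this assumption, pb < pf for ρ < 1 and pb = pf at ρ = 1, and since pocc ≤ 1 it follows that pf ≤ 1/(1+ρ) → 0 as λ → ∞, in line with the behaviour reported for Figure 3.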
Also from the results of Figure 3, at low λ, the probability of an empty system p0 is higher than the probability of an occupied system pocc. At about λ/Σμ≈0.6 both are equally probable, and when λ/Σμ >0.7 approx., the probability of an empty system p0 becomes lower than the probability of occupied servers pf. In an attempt to maximize the utilization of servers, increasing the arrival rate λ beyond that point will increase the utilization. Nevertheless, if the increase in cost from losing a customer is higher than the reduction in cost from increasing utilization, operating the system beyond that point might not be convenient. Increasing the installed capacity to avoid customer loss, instead of increasing server utilization, might be a better option. Note that at λ/Σμ≈0.6 the perceived saturation of servers is low, so only about 18% of the time the servers would be observed full, pf, or empty, p0. Nevertheless, judging the service performance of a loss system by the observed utilization of the servers is clearly misleading: for instance, at a high offered load of λ/Σμ =4, the servers will be fully occupied, pf, less than 20% of the time, giving the appearance of extra idle capacity, while in fact the system is saturated and the loss of offered load pb is more than 60%.
Model B considers heterogeneous servers, of which the Erlang model M/M/k/k with k identical servers is a particular case. In this case, the full system state can be described properly by λeff=λ(1- pbusy), from (1) and (2), with results included in Table 1. The results of Model B for identical servers (k=3, μ1=μ2=μ3=μ=1) are also those of the Erlang-B formula. This gives pbusy = B(3,λ/μ) in Figure 3, and it has also been verified numerically, with values identical to 15 decimal places.
The calculations based on simulation confirm the accuracy of Model B, M/Mi/k/k, in describing heterogeneous parallel server systems. Given the former results, it is valuable to compare the overflow results of M/Mi/k calculated from the analytical results by Matsui and Fukuta (1977) [21] with those obtained for the full system pbusy= pb+pf from Model B, Table 2. The probabilities are identical. Based on the independence of events of the exponential interarrival times, arranging the arrivals in an infinite waiting queue and providing new arrivals after the loss are equivalent in terms of overflow or loss. Notably, the M/Mi/k system has an infinite-capacity queue in front of the servers. It is known that this system model is unstable when λ≥Σμ, as the waiting line grows indefinitely. Meanwhile, the M/Mi/k/k by Model B redirects the overflow to the loss server without that limitation.