1. Introduction
All everyday human activities give rise to signals that carry information about the systems that generated them. These signals are bounded functions that are collected to be studied, transmitted, and manipulated in order to extract the information they carry. Discrete-Time Signal Processing (DTSP), also called Digital Signal Processing, is a set of mathematical and engineering techniques that allow signals to be processed (collected, studied, analyzed, synthesized, transformed, stored, etc.), mainly on digital devices.
Combining ideas, theories, algorithms, and technologies from different quarters, DTSP has never stopped evolving and broadening its already vast field of applications. This evolution was driven by the enormous progress of digital technologies, which allow the construction of processors that are, in general, more reliable and robust than analog ones and, above all, more flexible. The on-chip implementation of specialized processors (e.g., FFT) has facilitated the application of mathematical techniques that would be difficult (or impossible) to perform analogically. DTSP plays an important role in communication systems, where its mission is to handle signals, both at transmission and reception, in order to achieve an efficient and reliable flow of information between source and receiver. However, it is not only in communication systems that we find DTSP applications. In fact, its field of action has widened to include areas such as speech, radar and sonar, seismology, biomedicine, economics, astronomy, etc. In mathematics, it has been very useful in the study of functions and in solving differential equations; the well-known Newton series is a notable example.
Mathematically, DTSP relies on several important tools, such as real and complex analysis, difference equations, the discrete-time Fourier and Z transforms, algebra, etc. It benefited from the enormous development of signal theory in the second half of the 20th century, when signal processing techniques reached a sufficiently high degree of development. However, its origins are much earlier.
In general, we can "date" the beginning of the study of signals to the discovery of periodic phenomena, which led to the introduction of the notions of year, week, day, hour, etc. With an equal degree of importance, we can consider the theory and representation of music developed by the Pythagoreans as the first spectral analysis. It is important to note that they actually produced a discrete time-frequency formulation. More recently, we point to the discovery and study of the spectrum of sunlight by Newton (1666) and the works of mathematicians such as Euler, Gauss (who devised the first fast Fourier transform algorithm in 1805), Fourier (who created the basis for spectral analysis), Sturm, and Liouville. These works had direct implications for the way signals are studied in the frequency domain, which did not cease to evolve and gain importance from the 1940s onward, thanks to work in the field of stochastic processes (Wiener and Kolmogorov): correlation, the matched filter, the Wiener filter, etc. [1,2], notions that would become the basis of modern developments in spectral analysis (Tukey, Parzen, Akaike, Papoulis, and Burg). It was also Tukey who, with Cooley, rediscovered the algorithm that allowed the implementation of the FFT in 1965, a milestone in signal analysis.
Difference equations, in the form of the ARMA (autoregressive-moving average) model, rapidly grew in importance due to the works of Box, Jenkins, Oppenheim, Kailath, Kalman, Robinson, Rabiner, and many others in the 1980s [3,4,5,6,7,8]. We can place here the real beginning and affirmation of DTSP. Nevertheless, the advent of computers was perhaps the biggest impulse given to DTSP, through the possibility of implementing in discrete form processing devices previously realized exclusively with analog technology, and of performing simulations that predict, with great accuracy, the behavior of a given system. This led to the autonomy of the theory of "Discrete Signals and Systems," which became an independent branch, leading to alternative technological solutions based on digital design and realization devices [3,4,8,9,10,11,12,13,14,15]. Although the main developments were based on difference equations, the true origins were not forgotten and motivated some attempts to model and identify systems based on the delta difference [16,17,18,19,20].
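The ARMA difference equation mentioned above can be illustrated with a minimal sketch; the coefficients and orders below are arbitrary illustrative values, not taken from the references:

```python
# Minimal ARMA(p, q) recursion: y[n] = sum_k a[k]*y[n-1-k] + sum_k b[k]*x[n-k].
# Coefficients are arbitrary illustrative values.

def arma_filter(x, a, b):
    """Run the ARMA difference equation over the input sequence x.
    a: AR coefficients (a[0] multiplies y[n-1], a[1] multiplies y[n-2], ...).
    b: MA coefficients (b[0] multiplies x[n], b[1] multiplies x[n-1], ...)."""
    y = []
    for n in range(len(x)):
        acc = sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
        acc += sum(a[k] * y[n - 1 - k] for k in range(len(a)) if n - 1 - k >= 0)
        y.append(acc)
    return y

# Example: AR(1) with a single MA term, driven by an impulse.
# y[n] = x[n] + 0.5*y[n-1]  ->  impulse response 1, 0.5, 0.25, 0.125, ...
y = arma_filter([1.0, 0.0, 0.0, 0.0], a=[0.5], b=[1.0])
```

The recursion makes the causal structure explicit: each output sample depends only on present and past values.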
The emergence of fractional tools has opened new doors to the modeling and estimation of everyday systems that were known to be best described by fractional systems. However, this does not mean that there was a coherent theory of fractional systems in discrete time. Probably the first attempt was made in [21], but the systems described there are not really fractional, although they use fractional delays. In the last 20 years, many texts have been published on fractional differences and derivatives in discrete time, leading to different views of what fractional systems in discrete time are and how they are characterized [22,23,24,25,26,27,28]. The purpose of this paper is precisely to describe the mathematical basis underlying the main formulations. We introduce differences through a system approach to highlight the fact that the required definition must be valid for any function, regardless of its support. This allows for a broader scope. On the other hand, it is important to make a clear distinction between time flowing from left to right (causal case) and the other way around (anti-causal); under normal conditions, they should not be mixed. To this end, we define "time sequence" as an alternative to "time scale," avoiding the confusion that the latter might introduce. We proceed with the definitions of the nabla (causal) and delta (anti-causal) differences and enumerate their main properties. We then introduce other formulations, such as discrete-time, bilateral, tempered, and the completely new bilinear differences. These differences are "invariant under translations." We also propose new "scale-invariant differences" that are connected to Hadamard derivatives. For all the presented differences, ARMA-type difference equations are proposed.
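The causal/anti-causal distinction between the nabla and delta differences can be illustrated with a minimal sketch (standard first-order definitions; the step h and the test function are arbitrary choices):

```python
# First-order differences of a function f sampled with step h.
# Standard definitions:
#   nabla f(t) = f(t) - f(t - h)   (uses past values: causal)
#   delta f(t) = f(t + h) - f(t)   (uses future values: anti-causal)

def nabla(f, t, h=1.0):
    """Causal (backward) difference."""
    return f(t) - f(t - h)

def delta(f, t, h=1.0):
    """Anti-causal (forward) difference."""
    return f(t + h) - f(t)

f = lambda t: t * t
n_val = nabla(f, 2.0)   # 4 - 1 = 3
d_val = delta(f, 2.0)   # 9 - 4 = 5
```

The two operators must not be mixed: the first looks only backward in time, the second only forward, which is exactly the causal/anti-causal distinction made in the text.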
Given the importance of discrete signals in this work, we review the classical sampling theorem, valid for the case of shift-invariant systems [14,29,30,31,32], and another one suitable for scale-invariant systems, but different from similar ones existing in the literature [33,34].
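The classical shift-invariant sampling theorem can be sketched numerically: a bandlimited signal is recovered from its uniform samples by sinc interpolation. This is a minimal illustration; the signal, sampling rate, and truncation length are arbitrary choices, and the finite sum introduces a small truncation error:

```python
import math

def sinc_reconstruct(samples, T, t):
    """Shannon reconstruction x(t) = sum_n x[n]*sinc((t - n*T)/T)
    from uniform samples taken with period T (finite, truncated sum)."""
    total = 0.0
    for n, xn in enumerate(samples):
        u = (t - n * T) / T
        total += xn * (1.0 if u == 0 else math.sin(math.pi * u) / (math.pi * u))
    return total

# Sample a 1 Hz cosine at 8 Hz (well above the Nyquist rate of 2 Hz)
# and evaluate the reconstruction halfway between two sample instants.
T = 1.0 / 8.0
samples = [math.cos(2 * math.pi * n * T) for n in range(64)]
t_eval = 4.0 + T / 2
x_mid = sinc_reconstruct(samples, T, t_eval)  # close to cos(2*pi*t_eval)
```

The reconstructed value agrees with the true signal up to the truncation error of the finite sum.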
The paper is organized as follows. In Section 2 we present several mathematical tools useful in the paper and clarify some notions; the sampling theorems are introduced there. Section 3 gives a historical overview of the evolution of differences, both in continuous and in discrete time, and describes the different approaches. The problems created by some definitions are criticized in Section 4. The main contributions of this paper are presented in Section 5, where several shift-invariant differencers and accumulators are introduced, namely the nabla, delta, two-sided, tempered, and bilinear differences; for all the definitions, continuous-time and discrete-time versions are presented. The scale-invariant differences are introduced and studied in Section 6. All the described differences are suitable for defining ARMA-type linear systems; this is done and exemplified in Section 7. Finally, we present a brief discussion.
6. Scale-Invariant Differences
In previous sections, we dealt mainly with shift-invariant differences. Here, we consider others that have deep relations with the scale-invariant derivatives [36].
Consider two bounded piecewise continuous functions with . For simplicity, assume they are of polynomial order, so that we ensure they have Mellin transforms, , analytic over suitable regions of convergence (ROC).
Definition 11.
Let . We define the stretching differencer as a linear system whose output, at a given scale, is the difference between the input at different scales:
where q is the scale constant.
The output will be called the stretching difference and represented by . Letting , we obtain the shrinking difference, . Therefore, their properties are similar. We will study only the first, because the other is easy to obtain. From this definition we can draw some conclusions:
- 1.
Its impulse response is given by:
so that
implying that
- 2.
The transfer function is
having
as ROC.
- 3.
As in the shift-invariant case, if we associate in series n systems, the resulting system defines the n-th order stretching difference, which has a transfer function given by
from which we obtain the n-th order stretching difference
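The series association can be sketched numerically. Since the displayed formulas are omitted in this excerpt, the sketch assumes the common Hadamard-type form f(t) − f(qt) for the first-order stretching difference (an assumption, not the paper's exact formula); composing n copies then expands binomially:

```python
import math
from math import comb

def stretching_difference(f, t, q, n=1):
    """n-th order stretching difference, obtained by composing n copies of
    the assumed first-order operator f(t) -> f(t) - f(q*t).
    Binomial expansion: sum_{k=0}^{n} (-1)**k * C(n, k) * f(q**k * t)."""
    return sum((-1) ** k * comb(n, k) * f(q ** k * t) for k in range(n + 1))

# For n >= 1 the operator annihilates constants, and for f = log the
# first-order stretching difference equals -log(q) at every t.
d1 = stretching_difference(math.log, 2.0, 0.5, n=1)       # = log 2
d2 = stretching_difference(lambda t: 1.0, 3.0, 0.5, n=2)  # = 0
```

The binomial form mirrors the shift-invariant case, with the translation f(t − h) replaced by the dilation f(q t).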
Now, let us return to (40) and invert the roles of the functions: assume that the input is
and the output is
which can be rewritten to give:
It is not hard to see that
with MT
Relation (99) shows why this operator is again an accumulator. The series association of n equal accumulators gives the n-th order stretching sum:
Definition 12.
This result, together with (98), suggests that the α-order stretching differencer/accumulator must be given by
where α is any real number.
As we observe, this difference uses an exponential domain, in agreement with our considerations above (Section 2.3). The corresponding MT is
Therefore, the shrinking difference is defined by
having MT
Remark 6. The relations (100) and (105) show that there is a scale-invariant system that produces the difference as output.
To obtain the discrete-scale versions, we only need to sample in agreement with Theorem 2. Let
. For the stretching difference we obtain
while for the shrinking one we obtain
As we can see, they are formally similar to the nabla and delta differences; they differ only in the sampling sequence used: linear or exponential. Thus, from a purely discrete point of view, we have no way of distinguishing between linearly and exponentially sampled functions. This means that if we want to define stretching and shrinking differences in discrete time, we have to break the continuous connection: we have to work exclusively with sequences defined in
. So, having a sequence
, we wonder how to define stretched and shrunk sequences so that we can introduce differences. In [111], ways to produce stretched and shrunk sequences were presented. In principle, we could use them to define differences, but this procedure has difficulties, since the operations of stretching and shrinking require knowledge of the entire underlying sequence and the scale transformation system is two-sided.
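The formal similarity with the nabla difference noted above can be checked numerically, again under the assumed first-order form f(t) − f(qt): on the exponential grid t_n = q^(−n), the stretching difference of f coincides with the nabla difference of the sampled sequence:

```python
import math

# Assumed first-order stretching difference: f(t) - f(q*t), with 0 < q < 1.
q = 0.5
f = math.log

# Exponentially sampled sequence v[n] = f(t_n) with t_n = q**(-n).
v = [f(q ** (-n)) for n in range(6)]

# At t_n the stretching difference f(t_n) - f(q * t_n) ...
stretch = [f(q ** (-n)) - f(q * q ** (-n)) for n in range(1, 6)]
# ... coincides with the nabla (backward) difference of the sequence v.
nabla_v = [v[n] - v[n - 1] for n in range(1, 6)]
```

Once the exponential grid is fixed, the scale operation becomes a plain index shift, which is why the purely discrete sequences carry no trace of how they were sampled.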
However, we can use the traditional "decimation" operation of signal processing to define a stretching difference [5,8,14].
Definition 13.
Let N be a positive integer (decimation parameter). We define a stretching difference through
We can show immediately that, if M is a positive integer, then
If , then
Proceeding as above, we obtain
that allows us to write
Definition 14.
These relations suggest that we write
This last relation seems to point to a definition of a "discrete Mellin transform," as
which is different from others proposed in the past [33,34]. We do not go further in this direction.
The properties of the stretching discrete difference just proposed are readily deduced from the results in Section 5.3. From such a definition, it is evident why a shrinking difference is not defined.
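Since decimation is the key ingredient of Definition 13, a minimal sketch may help. The exact formula of the definition is omitted in this excerpt, so the difference form used below, d[n] = s[n] − s[N·n], is only an illustrative assumption:

```python
# Decimation, the standard signal-processing operation invoked above:
# keep every N-th sample, s_d[n] = s[N*n].
def decimate(s, N):
    return s[::N]

# A stretching-type difference built from decimation. The paper's exact
# formula is omitted in this excerpt; d[n] = s[n] - s[N*n] is only an
# illustrative assumption.
def stretch_diff(s, N):
    return [s[n] - s[N * n] for n in range(len(s)) if N * n < len(s)]

s = list(range(10))      # s[n] = n
d3 = decimate(s, 3)      # every third sample: [0, 3, 6, 9]
sd = stretch_diff(s, 3)  # n - 3n = -2n, for n with 3n < 10
```

Note that the index mapping n → N·n only moves forward, which is consistent with the remark that no shrinking counterpart is defined.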
8. Discussion
Differences are basic building blocks for defining derivatives, but they can also be used directly in many applications, to solve differential equations and to model systems. In most situations, shift-invariant differences are used, although scale-invariant versions are also useful; here they have been studied separately. General continuous-time cases have been introduced, although the main interest has been placed on the discrete versions, which are fundamental in computational applications. We adopted a system point of view to emphasize that differences are outputs of linear systems, which implies that they are defined independently of the inputs, notably of their duration. Moreover, the outputs are generally of infinite duration, even if the input support is finite. In addition to the classic differences, we introduced new ones, such as the bilateral and tempered differences.
The choice of a system approach for introducing differences allowed us to define ARMA-type linear systems, enlarging the classical procedure used in time series analysis and processing, which has supported many important applications in engineering, economics, statistics, and so on. It is important to remark that many interesting functions found in applications are acquired in discrete-time form, without any analytical formula. This implies that we have to deal with functions (signals) defined on the set of integers. In any case, implicit in any application there is a time sequence imposed by an underlying clock, which imposes a working domain that we cannot change. This aspect was frequently dismissed in the past, a fact that led to some "abnormalities," such as the loss of (anti-)causality. This happened, for example, with the assumed "delta difference," which is not really a delta difference, since it should be anti-causal but is bilateral. This fact is expected to motivate a review of some associated concepts and tools.