1. Introduction
The periodogram of a finite data sequence consists of the square absolute value of its Discrete-Time Fourier Transform (DTFT) and is one of the most versatile tools in Signal Processing. Just to recall a few of its uses, it is a well-known power spectral-density estimator and the basis of more elaborate estimators of the same type [
1], (Ch.6). It allows one to implement the Maximum Likelihood (ML) estimator of a single frequency and a good quality estimator for multiple well-separated frequencies [
2], (Sec.13.3.2), [
3,
4]. Besides, it is also applicable to delays and angles of arrivals, given that these parameter types can be turned into frequencies either by means of the Fourier transform or by exploiting a uniform and linear array geometry respectively [
5,
6]. Finally, it also provides channel impulse-response estimates [
7,
8].
The periodogram can be efficiently sampled using the Fast Fourier Transform (FFT), a fact that seems to be the main factor behind its popularity. Actually, this sampling just involves
operations, where
N is the data sequence length. There exists an extensive literature on methods for estimating multiple frequencies that complement this FFT sampling with other operations such as the computation of additional DTFT samples or the evaluation of interpolators, [
9,
10,
11,
12,
13,
14,
15,
16,
17]. These additional operations have complexity
where
K is the number of frequency estimates, except for the method in [
12,
15] in which their complexity is
; (see also [
18] for the interpolation method used in these last references).
In most cases, frequency estimation using the periodogram is done assuming well-separated pure components, i.e, the data is assumed to consist of as sum of
components in white noise. However, in channel estimation, this assumption is often inadequate, given that physical phenomena such as reflections on rough surfaces or multiple closely-spaced specular reflections produce diffuse or spread components that appear as a single peak in the periodogram [
8,
19,
20,
21,
22]. Intuitively, we may expect the shape of a given periodogram peak to reveal, in some way, whether it is produced by a diffuse or a pure component. In this paper, we specify this intuition by presenting a method to detect whether a given periodogram peak is diffuse and to estimate its “spreadness”, i.e, the degree to which it is diffuse. The method is based on analyzing the subspace spanned by the signature’s derivatives (up to a given order) at the peak’s frequency.
The paper has been organized as follows. In the next section, we present the signal models for a pure and a diffuse component, the periodogram function, and sketch the proposed method for detection and estimation of diffuse components. Then, in
Section 3, we derive the Gram-Schmidt basis associated with the frequency signature and its derivatives which is the main analytical tool in subsequent sections. After that, we obtain the statistical distribution of the projection of the data vector onto the span of the signature’s derivatives in
Section 4, and we present the proposed estimator and detector in
Section 5.
Section 6 is dedicated to an efficient computation method for the correlation samples required by the method which is based on barycentric interpolation. Finally, we assess the proposed detector and estimator in
Section 7 numerically.
1.1. Notation
The notation is the paper is the following:
We denote column vectors in bold and lower case (, ).
denotes the identity matrix.
denotes the nth component of vector .
and are the transpose and Hermitian of vector respectively.
New symbols or functions are introduced using the symbol “≡”.
is the discrete Dirac delta, i.e, and for integer .
denotes a column vector of zeroes whose length can be determined from the context.
For a function , denotes the total derivative of at the value , i.e, is the best linear approximation to at .
2. Signal Models for a Pure and a Diffuse Component. Sketch of the Proposed Method
Consider a
data vector
consisting of a single undamped exponential of frequency
in noise,
where
is a complex amplitude, and
an
complex zero-mean circularly-symmetric white noise vector of variance
. The Deterministic Maximum Likelihood (DML) estimate of
in (
1) is the abscissa
at which the global maximum of the periodogram function
is attained, that is,
(See [
3], (Sec3.B)). This frequency estimator is popular due to its statistical efficiency and because it can be efficiently computed through an FFT algorithm followed by a numerical method; [
9,
15].
(
1) is often a simplified model for a situation in which there is actually some frequency spread along a short band
. For instance, in channel sounding, the periodogram’s peak may be produced by an imperfect specular reflection, i.e, a reflection on an irregular surface, and a more realistic model would be
for a function
. Here,
is completely unknown and hardly estimable in practice. It can be, for example, a sum of
components, i.e,
for complex coefficients
and frequencies
,
; a continuous function in
; or any distribution we can think of in
.
Assume now that the periodogram has a significant peak at frequency
for which both (
1) and (
5) seem plausible. The question is whether there is a detector that allows us to select either (
1) or (
5). In this detector, (
1) would be hypothesis H0 and (
5) hypothesis H1. To sketch a possible detector, consider (
1) and (
5) without noise (
). In this case, the periodogram estimate is exact assuming (
1), that is
, and
lies in the span of
given that
However, assuming (
5),
turns out to be a frequency in
and
fails to lie in the span of
. But since
is short, a truncated Taylor expansion of small order
P such as
is accurate in
and, therefore,
approximately lies in the span of
,
, given that
Thus, if we decompose
in two components
where
lies in the span of
and
lies in the orthogonal complement of
relative to the span of
, then we have that a diffuse component can be detected by a non-zero vector
.
In order to carry over this idea to the noisy case (
), we would need to find out the statistical distribution of
under model (
1), and then select a threshold
such that
implies that model (
1) must be discarded and (
5) accepted with a given false-alarm probability
. We can simplify this task significantly if we start by describing the span of
in terms of a Gram-Schmidt basis. More precisely, let
,
,
denote a set of
signatures such that
; , .
has the same span as for , P.
In terms of this basis,
can be written as
and the test’s inequality would be
We carry out this plan in the next three sections:
We derive the ortho-normal basis
,
,
in
Section 3. The main tool for this is the set of discrete Chebyshev polynomials [
23], (p.33).
We obtain the statistical distribution of the correlations
,
, in
Section 4 by means of a perturbation analysis of the periodogram estimate
.
We present the proposed detector and estimator in
Section 5.
Finally, regarding the case of a periodogram with
significant peaks, it is straight-forward that (
1) and (
5) can be extended to the models
and
where
are separate frequencies and
disjoint intervals containing the diffuse components,
. The argument just presented for (
5) is applicable to any of the
K periodogram peaks, say the
kth, if its corresponding interval
is sufficiently separated from the other peaks’ intervals. However, if the separation with, say, component
is small then there can be a significant contribution of component
to the orthogonal component
of the
kth peak. We do not discuss this more complex case in this paper though we can envisage two possible strategies for dealing with it: we can increase the threshold
in (
12) in terms of the separation between
and any adjacent peak interval, or we can apply some kind of windowing in order to select just one periodogram peak. This second possibility appears naturally in the proposed method as shown in
Section 3.3.
4. Statistical Distribution of Correlations with Ortho-Normal Signatures
Consider the single-frequency model in (
1) and the corresponding periodogram estimator in (
4). Since
is the frequency at which
attains a maximum, it follows that
Computing this differential from (
3), noting that
, we have that
fulfills the implicit equation
In (
1), if we view
and
as fixed values and
as a vector of free parameters, we have that
is a function of
,
and the same occurs to functions of
such as the periodogram estimate
and the correlations
Besides, these functions have known values at
, namely,
,
,
and
,
. In
Appendix B, we derive the first-order Taylor expansions of
and of the correlations
for
but in the phase-corrected form given by
where
If we define
, the resulting expansions are the following:
For the model in (
1), these expansions imply the following approximate distributions:
-
is a real Gaussian variable of mean
and variance
Note that this variance is equal to the Cramer-Rao (CR) bound for single-frequency estimation; (see Eq. (17) in [
3]).
The correlations , , are statistically independent.
is a real Gaussian variable of zero mean and variance .
, , is a complex circularly-symmetric Gaussian variable of zero mean and variance .
5. Detection of a Diffuse Component. Spread Factor
The data vector
can be written in terms of the correlations
as
where, for short, we write
and
as
and
respectively. We consider that the first summation may be affected by a diffuse signal component while the second is mostly produced by noise. The choice of
P depends on the assumptions on the frequency distribution
in (
5) and also on the filtering effect discussed in
Section 3.3. The detection task consists of deciding between the single component model in (
1) (H0 hypothesis) and the diffuse component model in (
5) (H1 hypothesis). Under H0, the distribution of the correlations
,
is that derived in the previous section and from it we can deduce that the sum
follows a Chi-Square distribution of order
. Besides, we can determine a threshold
for which the condition
implies the presence of a signal component for a given false-alarm probability
. Finally, if H1 is accepted then a measure of the “diffuseness” of the periodogram peak is the ratio between the estimated signal components energies in the spans of
and
, i.e,
We call
the spread factor. Additionally, we define
if H1 is rejected.
6. Computation of Correlations from DFT Samples
The periodogram estimate
and the proposed detector and estimator require in their computation the value of one or more correlations of either the form
or
at arbitrary frequencies
f. These correlations involve summations with a large number of summands
N, namely,
for
,
P. Thus, we may expect a large computational burden if the summations in either (
48) or (49) are directly evaluated for any
p,
. However, the fact is that all the correlations in (
48)-(49) can be obtained with small complexity from just of a few of the DFT samples in (
13). Actually, as we show in the sequel, it is possible to obtain them from
samples of the DFT using a so-called barycentric interpolator [
18], where
is an integer that controls the interpolation accuracy. Specifically, the interpolation error decreases exponentially with
Q and, in practice, a small
Q is sufficient; (typically,
Q ranges from 3 to 6.) The method for doing this is based on the following ideas:
1) Note that the correlations
can be computed from the correlations
due to (
27), i.e,
So, it is only necessary to compute the correlations of the first type
,
,
P.
2) The initial correlation
can be written as
where
Additionally, any derivative
can be obtained form the derivatives of
of orders 0 to
P through Leibnitz’s formula for the
pth derivative of a product; i.e, the
pth order derivative of (
51) is
Therefore, the computation of the correlations
can be reduced to the computation of the derivatives
for arbitrary
u and
.
3) is a band-limited signal in
u of two-sided bandwidth
and, as a consequence, it can be computed from its samples with spacing 1 using the sinc series. More precisely, let
n and
v be defined by the modulo-1 decomposition of
u given by
The sinc series in
v for the signal
is
Note that
is equal to
which is one of the DFT samples in (
13). Therefore, all the samples
appearing in (
54) are already available.
4) Since there is oversampling in (
54), this series is also valid for a product
where
is a fixed signal whose two-sided bandwidth is below the limit
. So we have
Now, if
where
is a fixed pulse whose tails are negligible outside the
v range
for a given truncation index
Q, then we may truncate (
55) at
and extract the function
implicit in
, i.e,
Additionally, (
55) and (
56) also hold if
is replaced by 1 (which has zero bandwidth). So, we also have the approximation
Dividing (
56) by (
57), we obtain an approximation for
,
This is a so-called barycentric interpolation formula and we can readily see that it only involves arithmetic operations, given that the values
are constant in
u and can be pre-computed. Besides, (
58) can be extremely accurate, as shown in [
18], for a proper choice of
. Actually, for Knab’s pulse [
25] given by
the error in (
58) is bounded by
where
A is the maximum of
for
u in
. Note that this last bound implies that the error decreases as
and, therefore, a small
Q ensures high accuracy. Besides, the derivative of (
58) up to any order can be computed using the Horner-like scheme described in [
18], (Sec.IV). The complexity of obtaining
for
is
, i.e, independent of the correlation length
N.
From these ideas, we can readily deduce a method to interpolate (
48)-(49) consisting of the following steps:
Compute
,
by evaluating the barycentric interpolator in (
58).
Compute
from
,
, using (
53).
Compute
from
,
, using (
50).