1. Introduction
Single Particle Tracking (SPT) is an established technique for observing the behavior of single entities at high spatial and temporal resolution (nanometers and milliseconds) in various scientific fields such as Biology, Chemistry, and Physics. SPT is based on the reconstruction of the trajectories of single particles visualized in real time in the system of interest. In life science, for example, it has been applied to resolve the working mechanisms of molecules, organelles, and viruses [
1,
2,
3,
4].
The technique requires synergistic efforts in several aspects: instrumentation, particle labelling, data analysis [
5,
6,
7,
8,
9]. Experimental time-lapse images are processed with two main analysis steps: trajectory reconstruction and trajectory analysis [
1,
10]. The latter allows the extraction of various parameters characterizing the behavior of the tracked particles, such as type of motion, diffusion coefficient, velocity, and/or interaction events, thus finding the crucial links between the observed motions and the processes driving them. Over the years, different types of possible trajectory analyses have been developed, with the aim of extracting more and more detailed information about the mechanisms underlying the observed trajectories. With the proliferation of analysis methods, it is useful to have some overview of the different possibilities to identify approaches that may be useful for the application of interest. Available reviews on SPT typically aim to cover all the aspects of the technique, often including its applications [
1,
11,
12]. Here, we present a comprehensive review focusing on the last step of the technique, i.e., track analysis, since a work about this is still lacking in the literature and the topic has been given limited space in previous articles.
Firstly, we briefly review the most common and traditional methods of track analysis, which are based on the mean square displacement (MSD) function. In particular, we highlight some factors that are not always considered but that can affect the precision and accuracy of the results, especially the choice of the number of points to consider in the MSD curve for extracting parameters like the diffusion coefficient, and the effects of localization uncertainty and of some experimental parameters such as temporal resolution. MSD analysis is challenged by measurement uncertainties, and by the presence of too short trajectories and of heterogeneities, particularly in the case of anomalous motion [
13,
14]. It remains a valid tool, but various studies indicated that alternative and sometimes complementary approaches might allow obtaining more informative results. Distributions of parameters other than displacements have been considered, e.g., the angular distribution within tracks, which proved to be more sensitive for quantifying caging and distinguishing different and even rare transport mechanisms [
15].
Due to environmental heterogeneities, the presence of interactions or other processes, changes in motion type and parameters can also occur within a single trajectory [
16]. This demands methods to partition the trajectory into multiple segments, and to characterize the motion in each state and the switching behavior between them. In some cases, the presence of multiple states within single trajectories was masked in an MSD analysis, whereas it was revealed by more advanced approaches that allowed the characterization of multiple states with their switching kinetics and the consequent construction of models of the underlying mechanisms [
17,
18]. Hidden Markov Model approaches have been applied in some cases to classify states characterized by different diffusivities or types of motion and to extract populations and switching probabilities. We discuss the limitations and challenges of this kind of analysis as well, which are related to the definition of states in the case of non-pure Brownian motion and to the choice of the number of states.
Nowadays, classification is one of the main tasks solved by machine learning, which has therefore been applied also to trajectory analysis. In the last part of the review, we focus on this kind of methods, which greatly expanded in recent years. We describe approaches based on several machine-learning methods, from random forest to deep neural networks. Machine learning methods are demonstrating greater accuracy and sensitivity and are proving to be very advantageous for detecting hidden phenomena and extracting valuable results even from short and noisy trajectories. However, also in this case, several challenges arose, such as transfer learning to experimental data after training on simulated ones, time and computational requirements, interpretability and explainability. We focus on some examples that show how the integration of classical statistics can be very helpful in assessing the validity of the results and in defining the best set of features used by the machine-learning algorithm to classify trajectories.
2. Mean Square Displacement (MSD) Analysis
Mean Square Displacement (MSD) is the most common analysis in an SPT study. It is the second moment of the displacements distribution as a function of time lag, i.e., it quantifies the average squared distance traveled by a particle in a certain time (
Figure 1). It can be calculated as:
where N is the number of points in the trajectory, the latter being sampled at times Δt, 2Δt,… NΔt. To be more specific, the reported calculation is used to estimate the time-averaged MSD (TAMSD) for each single particle trajectory; this is the most used formulation. Another possible calculation is the so-called ensemble-averaged MSD in which the displacements are averaged over multiple particles at time t. The first one is preferred because of possible heterogeneities in a population of different particles, but tracks of sufficient length are not always available. The two averages can also be combined into the time- and ensemble-averaged mean-squared displacement (TEAMSD) [
19].
The MSD function can reflect the kind of particle motion. It is used to distinguish between immobile, pure Brownian, confined and directed motions (the latter usually having a component of Brownian diffusion and a component of active transport with a constant drift velocity), because these show different trends in their associated MSD [
20,
21,
22]. For Brownian motion, the MSD increases linearly with time, whereas linearity is lost for the other types of motion. The expressions used to fit the MSD function for the different motions are well established and can be found elsewhere [
23].
For each kind of motion, fitting the MSD function leads to the extraction of parameters describing the specific situation, e.g., the diffusion coefficient D for the Brownian motion, velocity and diffusion coefficient for the directed motion, confinement radius and characteristic time for the confined motion [
9].
In some cases, such as molecular labeling with organic dyes subject to photobleaching, the trajectories are relatively short and only the initial part of the MSD curve can be reconstructed, making it difficult to capture the nature of the movement from the MSD fit (16). In these situations, and when one wishes to measure a diffusivity regardless of the kind of motion, the slope of a straight line passing through the first MSD points is used to calculate a short-term diffusion coefficient [
2,
21,
22,
23,
24,
25].
The MSD function of a trajectory can also be fitted with a general law [
26,
27]:
where
is the generalized diffusion coefficient (or anomalous diffusion constant) and
is the anomalous exponent (
Figure 1): α close to 1 characterize Brownian motions, α < 1 corresponds to subdiffusion, α > 1 corresponds to superdiffusion. A log-log plot of MSD versus time is commonly used, where α is the slope of the curve in such plot [
28].
Estimates of D (from a linear fit) and α are often used together to assign the type of motion. For example, Jobin et al. performed an SPT study on GABA
B (γ-aminobutyric acid type B) receptors and classified the trajectories with D < 0.01 μm
2 s
−1 as immobile; the trajectories with D ≥ 0.01 μm
2 s
−1 and 0.75 ≤ α ≤ 1.25 as Brownian diffusive; the trajectories with D ≥ 0.01 μm
2 s
−1 and α < 0.75 as sub-diffusive, the trajectories with D ≥ 0.01 μm
2 s
−1 and α > 1.25 as super-diffusive [
29].
Anomalous diffusion is a phenomenon often found in SPT performed in different contexts in live cells, from membranes to cytoplasm and nucleus; this kind of motion has multiple origins, such as crowded environments, presence of interactions, different transport mechanisms; deciphering its nature and causes is very helpful in understanding fundamental cellular processes [
9,
28,
30,
31,
32,
33]. For descriptions of anomalous motion models, we refer to [
9,
34,
35,
36].
One area where MSD analysis of single-molecule trajectories has been widely applied concerns the study of membrane receptors in living cells. Since the work of Kusumi et al. [
20] this approach has been used to characterize the kind of motion of different receptor families, e.g., epidermal growth factor receptors [
20,
37], transferrin receptors [
20], neurotrophin receptors [
21], G protein-coupled receptors [
38,
39,
40]. Typically, four types of motion (stationary, Brownian, directed, confined) have been identified thanks to MSD analysis, and the changes in their population under different conditions have provided insight into the correlation between the type of motion and the function performed by the molecule [
21,
38,
39,
40]. The presence of different types of motion played a fundamental role in defining models of plasma membrane organization, including e.g., the existence of membrane compartments particularly associated with the cytoskeleton network [
20].
[
37,
41] However, caution must be taken in MSD analysis: some papers have pointed out the influence of experimental conditions (especially those related to uncertainties) on the accuracy and reliability of the MSD analysis and the extracted motion parameters.
It has been reported that, in the case of Brownian motion, the localization uncertainty (the so-called static error) produces a positive offset in the MSD curve, the finite camera time exposure (the so-called dynamic error) produces a negative offset and in the end, the MSD function for a 2-D trajectory has the following form [
41,
42,
43]:
where
is the diffusion coefficient,
is the standard deviation of the position of an immobile particle,
is the motion blur coefficient (which depends on the temporal illumination profile during the integration time, e.g., it is equal to 1/6 in the case of continuous illumination), and the other quantities are as defined above.
Static and dynamic errors must also be considered for other kinds of motion, but the formalization of their contributions is more controversial than in the case of the Brownian one [
44]. However, Backlund et al. investigated the effects of the two kinds of errors on anomalous motion while Destainville et al. studied the effects of the dynamic error in the case of confined motion [
44,
45].
One question that may arise concerns the optimal number of points of the MSD function to be used to extract the estimates of the motion parameters, i.e., how many and which points to use. Indeed, at large time lags, the MSD points are less averaged and thus subject to greater statistical fluctuations, and can less and less be considered independent measurements (there is correlation for the MSD at different lag times); at the same time, at shorter time lags, the MSD points are more affected by the contribution of the localization error [
46,
47].
The number of points used to estimate the short-term diffusion coefficient may vary between groups, depending on the specific conditions [
48,
49]. Two points (D
12, time lag 1 and 2) are typically used by some groups [
2,
22,
50,
51], three points (D
2-4, time lags 2, 3 and 4) are used by others [
52,
53].
It has been suggested to define a parameter called reduced localization error, given by the ratio
(where
is the frame time). Michalet found, through theory and simulations, that when
(uncertainty dominated by diffusion), the best estimate of the Brownian diffusion coefficient is obtained by considering the first and second points of the calculated MSD (without considering a (0,0) point); when
(uncertainty dominated by localization uncertainty), a larger number of points are needed to reduce the impact of localization uncertainty and their optimal number depends on
and on the number of time steps in the trajectory [
47]. Ernst et al. found on their experimental data (SPT on fluorescent beads) an optimal number of points of 4, i.e., from the second to the fifth point of the MSD (the first point extracted from the data was excluded for an influence of the orbital tracking method used to record trajectories), since this condition minimizes the ratio of the standard deviation over the estimated value of the diffusion coefficient [
46]. In a case with very long trajectories and r≈0.8, they also confirmed a good agreement of their results with the theoretical prevision of Michalet based on the reduced localization error, even though they found that in general the optimal number of MSD points does not seem to depend on the length of the tracks [
46].
Kepten et al. published a guideline for MSD analysis in the case of anomalous diffusion [
14]. They analyzed the precision and accuracy of the fitted MSD by simulating different conditions in terms of trajectory length, measurement error and anomalous exponent. In each case, they identified a maximum lag time to be used in the MSD fit that depends on the three parameters. Thus, they concluded that estimating beforehand the measurement uncertainty and the range of the anomalous exponent can significantly help to improve the accuracy of the results by selecting the optimal number of points for MSD analysis and fitting. Under some conditions (e.g., in the case of short trajectories and no estimation of the measurement uncertainty or relatively large measurement uncertainty), the MSD results were highly inaccurate.
We also have previously investigated the effects of different uncertainties and errors on MSD-based estimates [
54]. In particular, we considered the effects of different temporal resolutions on the distribution of the short-term diffusion coefficient in the case of pure Brownian motion (
Figure 2). We observed that variations in temporal resolution can cause shifts and broadening in such distribution. This type of effects is related to both localization uncertainty (static and dynamic, the latter producing motion blur effects) and tracking errors, which have different impacts on the diffusivity estimates depending on the interplay of temporal sampling with conditions such as diffusivity, density, signal-to-noise ratio. We showed that the observed effects cannot be fully explained by the available theory; e.g., the additive offset introduced in the MSD function to include static and dynamic errors [
41,
42,
43], discussed above, is not enough to take the effects we observed into account. Also in this case, preliminary evaluations of the impact of measurement and analysis uncertainties are helpful in choosing the proper experimental conditions, such as the temporal resolution, and to improve the accuracy and precision of the results. In addition to these caveats and necessary precautions, it is also important to note that a simple MSD analysis is not sufficient in the case of multiple modes of motion within a single trajectory, as we will discuss in more detail below.
3. Alternatives to MSD Based on Classical Statistics
An often-used alternative to the MSD analysis is the Moment Scaling Spectrum (MSS): while MSD focuses only on the second moment of the particle position distribution, the MSS considers more moments [
55]:
The moments of order
(typically from 1 to 6 [
56,
57]) are calculated for different time lags. The case
is the MSD. Each moment is assumed to depend on the time lag in the form
(eventually with an addictive offset to include the noise effects explained above). The plot of
versus
is called the moment scaling spectrum (MSS). By analyzing how these moments scale with time, one can understand if there could be distinct modes of motion in a single trajectory: for a self-similar trajectory, the MSS is a straight line passing through the origin and its slope reflects the type of motion. A slope close to zero characterizes stationary particles; a slope between 0 and 0.5 indicates a sub-diffusive/confinement regime; a slope close to 0.5 indicates pure diffusion; a slope higher than 0.5 characterizes the super-diffusive regime [
56,
57,
58]. Because MSS is based on more moments than MSD, it can provide a more complete characterization of the motion. MSS is more sensitive than MSD in detecting inhomogeneities in particle motion and is therefore more useful in analyzing complex and dynamic motion patterns [
59]; often it is followed by a segmentation of trajectories classified as inhomogeneous, based on parameters of (possibily various) MMS fits, and on their uncertainties. This kind of analysis has been used in many SPT studies; examples can be found in [
25,
57,
58,
60,
61,
62]. Marchetti et al. used the MSS analysis to distinguish between self-similar and multimodal trajectories in an SPT study on the neurotrophic receptor TrkA [
25,
61]. By combining MSS, MSD and TAD (Transient Arrest of Diffusion, detected by transient confinement analysis as described in the following, from [
63]) analyses, they obtained a classification of trajectories and subtrajectories into immobile, slow, fast, drifted, and TAD events. They moreover used bidimensional histograms of
versus D
12 (the short-term diffusion coefficient obtained by MSD,
Figure 3) measured for the receptor under different conditions (stimulation by different ligands or administration of different drugs to cells), identifying different dynamic regions. The population of these regions was variable depending on the stimulation conditions; an analysis of these changes allowed associating each region with a behavior of the receptor (such as free-diffusion, formation of signaling platform, assembly of signaling endosomes precursors). The authors concluded that the receptor dynamics on the plasma membrane is a specific signature of its activation state and is ligand-dependent (different ligands produce different dynamics patterns, called “ligand fingerprinting” effect); therefore, the receptor dynamics induced by ligand binding has a strict correlation with the induced biological response.
An important type of motion for biological SPT applications is the one caused by confinement; considering for example membrane proteins, this can be due to clustering, to cytoskeleton-mediated compartmentalization or to the presence of lipid domains, and is often associated with the activation of receptors or the regulation of their density in specific areas [
25,
64,
65,
66]. Indeed, one of the first methods to detect changes in motion within trajectories focused on the detection of transient confinement. Such periods of motion were identified by calculating the probability of remaining in a region for a period of time significantly longer compared to the case of a pure Brownian motion; the analysis was performed by testing the average displacement per frame of a particle in a moving window, and allowed separating a track in different segments [
63]. This method has been widely applied and continues to be so [
25,
67,
68,
69]. There are also other more general approaches based on segmentation and classification applied on different rolling windows within tracks, for example using MSD analysis in each window [
70,
71] or considering velocity statistics (e.g., average, median, and/or maximum, considering also its direction) in each window [
72,
73,
74,
75]. One challenge with this type of strategy is the choice of the window length; indeed smaller windows are required to increase the sensitivity for the detection of motion switches, but larger windows are required to increase the accuracy of motion classification [
70,
76]. Thus, in some cases, this type of approach fails to detect short-lived motion types or gives variable results depending on the window size [
25,
77].
The group of Jaqaman developed a method called “divide-and-conquer moment scaling spectrum” (DC-MSS), which uncouples the detection of motion switches from motion classification, overcoming the main limitation of traditional rolling-window-based approaches [
76]. After a first segmentation based on a motion parameter (“local movement descriptor”), the motion classification is performed via the MSS analysis, which is also exploited to refine the initial track segmentation. One reported limitation was the requirement of relatively long periods (at least 20 frames) characterized by the same type of motion for correct identification. However, the method was tested on the cell membrane protein CD44 tracked in macrophages, and was able to distinguish changes in mobility due to interactions with the actin cortex [
76].
Several other approaches, typically based on the distribution of parameters other than displacements, have been proposed to increase the amount of information extracted from trajectories, in particular to gain more insight into the underlying mechanisms governing the observed system [
15]. A relevant example is the distribution of directional changes within trajectories [
15,
78,
79]. Burov et al. considered relative angles at different time scales: the trajectories were sampled with a certain time resolution, the resulting points were connected by vectors, and the angles between successive vectors were measured. Relative angles have been shown to have different distributions for different types of motions, and by using variable time intervals it is possible to capture the time scale over which the kind of dynamic is relevant [
78]. Distribution of angular displacements can be plotted using radial histograms (
Figure 4). As observed by Harrison
et al., the angular distribution and the MSD analysis can be used as complementary approaches; in this way, the authors analyzed the motion of lipid droplets in living cells and were able to identify different motion regimes (characterized by different angular persistence and different features of sub/super-diffusivity) at different time scales [
79], with transition points between the regimes that coincide in the two types of analysis. The determination of directional persistence and associated time-dependent directional changes have been performed by analyzing the distribution of a turning angle between segments of the track as a function of the lag time [
15]. Directional persistence may be indicative of an active transport mechanism driven, for example, by molecular motors [
15].
Another quantity used in the case of the study of transport mechanisms is velocity. The distribution of velocity within individual particle trajectories has been used, for example, to characterize the motion of the motor protein Myosin V in the cytoplasm of live cells [
80], the axonal transport of vesicles carrying neurotrophins in neurons [
72,
73] or the endocytic pathway of the influenza virus [
81].Tejedor et al. proposed a method based on a mean maximal excursion statistic (MMEs) for the analysis of different kinds of anomalous subdiffusion motions [
82]. The maximal excursion is the largest distance covered by a particle in a time interval; an average is performed over all the trajectories. They showed that MSD analysis is not able to discriminate between two models used to simulate different kinds of anomalous diffusion, which give rise to the same anomalous exponent; instead, the additional observables they introduced, based on the moments of the MME, allowed them to distinguish different subdiffusion mechanisms and to obtain more precise estimates of the anomalous exponent [
82,
83].
A different approach developed for investigating subdiffusion motion is based on “first-passage observables”, explained in the following. Condamin et al. [
84] also observed that subdiffusion can arise from different mechanisms leading to similar scaling laws, which can therefore not be distinguished by MSD analysis alone. The authors considered two different microscopic models of subdiffusion: (i) continuous time random walks (CTRWs) with a heavy-tailed distribution of waiting time, considered dynamic in nature, where the high waiting time could be caused by a crowded environment or to metastable chemical binding (in “traps”); (ii) motion with fixed fractal obstacles. These models are considered relevant for anomalous subdiffusive motion in living cells, where crowding can be due to the high density of molecules, aggregates or “traps” e.g., in the plasma membrane or in the cytoplasm, and fixed obstacles can be created by membranes in the cytoplasm or cytoskeletal elements in both plasma membrane and cytoplasm. They based the analysis on three “first-passage” observables: 1) the time it takes a particle from a given site to reach a target for the first time (first-passage time); 2) the probability of reaching a given target before reaching another target (first-passage splitting probability); 3) the time a particle spends at a given site (occupation time). The distribution of these variables are different in the two subdiffusive mechanisms, providing information on the underlying nature of the processes, and, e.g., allowing explaining the kinetics of reactions especially when transport-limited (i.e., in conditions of low concentrations of reactants).
Izeddin et al. studied the target-search strategies of two different transcription factors in the nucleus using SPT [
31]. The two molecules were the proto-oncogene c-Myc and the positive transcription elongation factor P-TEFb. The authors showed that these two transcription factors adopt different strategies to explore the nuclear space and search for their targets: c-Myc showed free diffusion, more specifically c-Myc uses a “non-compact” strategy, moving everywhere in the nucleus with an equal chance of reaching any target regardless of its position; P-TEFb showed a motion confined by fractal structures, i.e., P-TEFb follows a “compact” behavior in which a specific path guides it to its potential targets. The authors used analysis based on MSD, angular distributions and first-passage variables to support their conclusions, providing an interesting example of how different methods of analysis can be exploited together and complemented to gain more insight into the observed trajectories and to construct more informative descriptions of the processes under investigation. Their results showed that the nuclear architecture is protein-specific and that the geometry of the exploration paths is important for regulating transcription factor function and controlling gene expression.
Other approaches introduced especially to investigate anomalous motions, e.g., to distinguish between different types of subdiffusion, are based on p-variation (i.e., on how p-variation statistics change with the exponent p and with time, see [
83,
85,
86] for details), on the velocity autocorrelation function [
34,
87,
88], on the displacement correlation function [
89,
90], the power spectral density of the displacement [
91,
92,
93], and on the amplitude scatter of the MSD when considering multiple trajectories/particles [
94,
95,
96].
Weron et al. recently published a study comparing different analysis approaches on SPT data obtained on G proteins and G-coupled receptors with the aim of classifying their dynamics [
97]. First, they used a standard MSD approach based on the use of thresholds for the anomalous coefficient to define the regions of sub, normal or super diffusion. Due to the strong influence of the cutoff values on the results, they turned to statistical hypothesis testing. They used the anomalous exponent as the test statistic in a two-sided approach, testing the null hypothesis of free diffusion against the two alternatives, and calculating the critical regions after choosing the significance level. They also applied the statistical testing approach using statistics derived from the maximum excursion (similar to the MMEs) and from p-variations. The classification of trajectories leads to different results depending on the used method. The standard MSD approach underclassifies free diffusion and overclassifies superdiffusion; p-variations is the method that gives the highest percentage of subdiffusion. With the help of simulations, they showed that the results depend indeed on the type of process. The statistical approach is more robust than the standard MSD based on cutoffs; but none of the tested statistics perform at best in all the cases and each one gives the best result in specific situations (kind of motion, length of tracks). They suggest combining the approaches by trying to apply all the methods. If the classification results are compatible, any of the methods can be used; if not, they propose either to calculate an average of the results, or to use other methods to identify the process driving the movement and applying the most accurate test in such a case.
4. Markov Modelling
The task of trajectory segmentation and classification in SPT can be approached by methods based on Markov Modelling, which is used to describe systems that can exist in different states with certain transition probabilities between the different states, where these probabilities depend only on the current state and not on previous ones. In SPT, Hidden Markov Models (HMMs) are typically used, in which the Markov process is assumed to have unobserved (hidden) states and the observations made provide indirect information about the state. In the end, HMMs allow characterizing different diffusive states and estimating the probabilities of transition between them.
An example is the work by Cairo et al. on the mobility of CD45 in T lymphocytes investigated by SPT [
18]. They applied both MSD analysis and HMM analysis. Only HMM could detect heterogeneities and motion transitions within a single trajectory (masked in MSD analysis), extracting the switching kinetics between the two identified states. HMM allows obtaining association and dissociation rates of the interactions between CD45 and the cytoskeleton, and designing therefore a model for these interactions.
In several cases, the HMM analysis is applied considering a random switching between (typically two) diffusive states distinguished by different diffusion coefficients [
98,
99,
100]. However, some studies showed that in certain cases, when there are also non-pure Brownian motion types, such a description of states is not able to detect changes in the kind of motion. Monnier et al. observed this problem when considering trajectories that included switching between directed and random motion. To detect direct transport, they had to develop an HMM analysis that explicitly included models of both diffusive and directed motion, with Bayesian selection inferring the one consistent with the displacement along the tracks. With this approach, they were able to detect the transient transport states from the tracks of mRNA-protein complexes in living cells [
77].
Slator et al. also found that an HMM approach that only models states with different diffusion coefficients, and thus detects transitions as changes in diffusivities, is not suitable for segmenting the trajectories of ganglioside GM1 in model membranes. A specific modeling of confinement was necessary and was incorporated into an HMM analysis based on the switching between free diffusion and confinement in a harmonic potential well, potentially diffusing slowly. The developed model allowed the detection of the two states, their switching times and their parameters, i.e., the free diffusion coefficient, the strength of the potential well, the position of the potential well center and its diffusivity [
101].
A HMM was developed also for the identification of states with two different directional persistences inside single particle trajectories, in order to study different intracellular transport mechanisms [
102]. The authors achieved a good separation into “active” and “inactive” states. However, they recognized that modeling with only two states (forward and backward turns) was a crude simplification to keep low the number of model parameters.
One challenge with HMM models is finding an objective way to determine the number of states. Most approaches used a fixed number of states. Instead, Persson et al. developed a method (called variational Bayes SPT, vbSPT,
Figure 5) for HMM-based motion analysis, which infers the model parameters, including the number of diffusive states, from the data [
103]. Each state is described by a diffusion constant and a state lifetime. Identifying the correct number of states requires a compromise between the goodness of fit and the complexity of the model. The approach has been applied to the intracellular diffusion of the RNA-binding protein Hfq and the different identified diffusive states have been mapped into corresponding binding states. vbSPT is freely available as a software package [
104]. This software has been used also to identify four different states in the mobility of G proteins and their receptors, observed by SPT on the cell membrane [
105]. For each state, the authors extract the diffusion coefficient, occupancy fraction, dwell time, and transition probability from the HMM analysis.
A combined MSS and HMM analysis was performed by Gormal et al. in an SPT study on β2-adrenergic receptors [
106]. The MSS revealed a classification of the trajectories into three motion modes: immobile, confined and free. The HMM confirmed a three-state model (five states allowed, best fit using vbSPT software resulted in three states), in this case characterized only by different diffusion coefficients (one so low that the state was called “immobile”). The investigation allowed the authors to identify the population of the different states and to study the activation mechanism of the receptor.
Falcao and Coombs also addressed the issue concerning a fixed number of states [
107]. They considered a method based on a so-called infinite HMM (iHMM, an extension of HMM to an infinite number of states) which exploits a non-parametric Bayesian approach. It infers from the data the number of states, the transition rates and the diffusion coefficient defining each state. The software is freely available [
108]. As observed for other HMM models cited above, only pure diffusive motion with different diffusion coefficients was considered; further improvements could include other motion kinds such as confinement or directed motion.
5. Machine Learning Analysis
Classification is one of the main tasks that machine learning (ML) can solve today in a wide range of applications, and the field of trajectory analysis in SPT also benefits from the recent explosion of this type of approaches. In the following, we give examples of the numerous methods based on this branch of artificial intelligence (AI) that have been introduced to overcome the limitations of previous strategies.
Figure 6 shows a general example of trajectory classification by supervised ML, a type of algorithm particularly used for these purposes. It involves training a model on a labeled dataset, in which each training example is paired with an output label; the model learns to make decisions based on this input-output mapping.
Wagner et al. developed a random forest classifier (an approach that exploits an ensemble of decision trees [
109]) to segment and classify particle trajectories into four main motion types (confined, normal, directed, anomalous diffusion). They used nine features to characterize the trajectories: the anomalous exponent, asymmetry, efficiency (which relates the squared net displacement to the sum of the squared displacements), fractal dimension, gaussianity, kurtosis, mean squared displacement ratio (between two different time lags), straightness (relates the net displacement to the sum of step lengths and trappedness (probability of trapping) (for more precise definitions we refer to the original publication [
110]). The method, after training on simulated tracks, was able to infer the motion type and its parameters for each segment under different experimental conditions. The developed software is freely available as an ImageJ plugin (TraJClassifier, available at [
111]).
An ML random-forest-based method was also developed by Muñoz-Gil et al. to classify tracks showing normal or anomalous (sub- or super-diffusion) motion with the calculation of the anomalous exponent [
112]. The approach also worked for short tracks and in the presence of noise, and the Authors showed its ability of transfer learning to data generated with theoretical models not included in the training set.
Another software package developed for the analysis of SPT data under difficult conditions, i.e., the presence of short trajectories and heterogeneities, is
DiffusionLab [
113] (
Figure 7). In this package, after importing the trajectories, a set of features is calculated for each of them. The software includes some predefined features, and user-defined ones can be added by creating new code files. The predefined features are: number of points (number of consecutive localizations), length (sum of all the individual displacements), radius of the minimum bounding circle (radius of the smallest enclosing circle that can be drawn around the localization coordinates, describing their spatial extension), distance between the center of the minimum bounding circle and the center of mass (describing how homogeneously points are spatially distributed), elongation (weight of the first principal component of localization coordinates), elongation angle (direction of the first principal component of localization coordinates), entropy (measurement of spatial randomness), tortuosity (the ratio of the distance between start and end points versus the length of the track, describing start-to-end directionality) (see [
113,
114]). A classification model is built either manually (by setting user-defined thresholds on the features) or by machine learning through a hierarchical classification tree (other supervised machine learning tools available in MATLAB can also be used). In the case of machine learning, the thresholds and trajectory features to be considered are selected automatically after training the model on a suitable dataset (up to 5 classes allowed). The classification model is used to group tracks into populations with similar behavior. Motion analysis is performed either on each track or on each group of tracks: time-averaged MSD analysis can be performed on each track if they are sufficiently long; however, if the tracks are too short, this measure is not accurate, so it is averaged over multiple tracks within the same group to obtain more robust estimates of motion parameters. The software requires MATLAB and is freely available from [
115]. DiffusionLab supports importing tracks from three SPT software: DoM (Image-J plug-in), Localizer (Igor Pro plug-in), and COMSOL Multiphysics (a commercial software for simulations) [
114]. Although the developers state that support for importing other formats can be requested, a useful improvement might be the ability to automatically import from some of the most widely used SPT software for life sciences, such as u-track and TrackMate [
116,
117].
Pinholt introduced a general approach for SPT analysis (called “diffusional fingerprint”) that allows the extraction of diffusional patterns of tracks independently of the underlying diffusion type [
118]. It is based on a set of 17 descriptive features, providing the advantage of not requiring a priori assumptions about the type of movement, unlike model-based analysis. Classification is achieved by a pattern recognition algorithm based on ML, which trains a logistic regression model. The Authors demonstrated that training and prediction can be performed on experimental data, eliminating the need for pre-training on simulated data and the assumption of correspondence between simulations and experiments. To train the model to understand when two processes are dissimilar, different experimental conditions were used. The label in the training on experimental data corresponded to the different observed experimental conditions. The distribution of features and the ranking of features of relevance were used in the application of the trained model to decide whether two diffusion processes were different and to infer important diffusion differences in microscopic motion. The method has been tested on a variety of simulated and experimental scenarios, showing great flexibility and applicability in many cases. It accurately assigned diffusional characteristics regardless of the type of dynamics, using the same 17 characteristics in all cases, providing a general way of mapping different phenomena in a common space.
Dosset et al. developed an approach based on a back-propagation neural network (BPNN, a type of neural network that uses the back-propagation algorithm for training, where the term “back-propagation” refers to the way the model adjusts its weights based on the error rate obtained in the previous iteration [
119]), which takes in input the MSD calculated on a sliding window along the trajectory [
120]. After training on simulated trajectories exhibiting Brownian, confined and directed diffusion, the algorithm can distinguish the three types of motion even within a single trajectory.
Kowalek et al. proposed a deep-learning convolutional neural network (CNN,
Figure 8) for track analysis in SPT [
121]. Unlike traditional ML methods, which require user-defined features, deep learning approaches require no data preprocessing and extract the relevant features automatically, without human intervention. CNN excels at tasks involving structured grid data, particularly images, by leveraging convolutional and pooling operations to extract and learn spatial hierarchies of features from input data [
122]. Their use for time series and trajectory analysis is an emerging field since trajectories can be encoded into a 2D grid (image-like structure) [
123,
124]. Kowalek et al. compared the deep learning CNN with classical ML methods (random forest and gradient boosting, both based on decision trees) for the identification of different motion modes in SPT. All models were trained on simulated data (which included normal, directed, confined and anomalous diffusion). CNN showed a slightly higher accuracy than the other considerer methods in most cases, but it required longer processing times. The authors found that, as expected, the ML methods generally failed to classify correctly motion models that were not included in the training data, even if they produced similar diffusion trends, such as different types of confined diffusion, with CNN being the worst in this respect. The study was carried out considering only homogeneous motion within tracks, i.e., a single mode of motion per track; the authors speculated that the CNN approach might show more advantages over the other ML models in the presence of heterogeneities within single tracks, since most of the human-defined features typically used relate to MSD estimates, which are worst for short track segments.
Based on the observation made in the last discussed work, in particular that i) all models failed to generalize the knowledge to unseen (not present in the training data) types of motion and ii) that traditional ML is cheaper in terms of time and computational resources and more interpretable than deep-learning, Janczura et al. improved the random forest and gradient boosting methods, proposing a new set of features that characterize the tracks and can be used for their classification [
125]. For the choice of the features, they considered the study by Weron et al. ([
97], discussed above), in which the author compared classical hypothesis testing on statistics obtained by MSD, maximum excursion and p-variations, and found that none was best in all cases, thus suggesting a combined approach. Following this advice, the new set of features chosen by Janczura et al. includes quantities based on the diffusion coefficient, the anomalous exponents, the maximum distance, the p-variations. This new set is used to train the ML models and classify trajectories. They showed improved knowledge transferability, and, very importantly, they showed that choosing the right features is a crucial point, greatly affecting the accuracy of the results. Moreover, knowing the most and least important features helps to retain only those that actually drive the results and to omit the less informative ones, in order to reduce dimensionality and make the model faster. p-variation turned out to be the most informative feature, the diffusion coefficient the least, with the removal of the latter changing the relative importance of the remaining features.
Granik et al. were interested in resolving processes that produce different kinds of anomalous diffusion, even starting from short trajectories. They presented a deep-learning approach, based on a set of convolutional neural networks (trained on simulated data), to classify trajectories by three types of diffusion (Brownian and two kinds of anomalous diffusion, i.e., fractional Brownian motion (FBM) and continuous time random walk (CTRW)) and to infer motion parameters. Compared to traditional MSD-based analysis, they demonstrate greater accuracy, the need for much less data to achieve the same precision, greater ability to extract information from even very short tracks and increased robustness to noise [
126].
In 2021, a competition was held to evaluate methods for detecting and characterizing anomalous diffusion, the Anomalous Diffusion (AnDi) Challenge [
127]. The participating teams used their approaches to solve three tasks: determination of the anomalous diffusion exponent; classification of the diffusion mode; track segmentation with the identification of change points within tracks, where the anomalous exponent and the diffusion mode change, and the determination of the motion model and the anomalous exponent for each segment. The third task turned out to be the most difficult, even in the relatively simple condition of a single change-point per track, and, in fact, far fewer teams participated for this task than for the first two. The author observed that the traditional MSD approach has several limitations, especially in the case of short and noisy trajectories, heterogeneous motion, non-stationarity, non-ergodicity; furthermore, distinct kinds of physical processes produce the same scaling exponent, making it impossible to distinguish the underlying model. Most of the models used in the competition outperform the performance of traditional MSD analysis. The best models are based on machine learning (ML) approaches, showing that ML can extract more information than classical statistics. However, the authors noted that classical statistics can still be helpful, especially because of the black-box nature of ML.
Starting from the results of the AnDi competition, Seckler et al. published a recent perspective focusing on the improvement of the explainability and interpretability of ML approaches that were successful in the challenge. In particular they investigated the possibility to extract a set of statistical features to be used in the ML algorithms instead of the raw position data and to determine uncertainty estimates. [
128]. Anyway, the authors observed that the analysis with classical statistical methods is still necessary to assess the validity of ML results and to prepare training data as possible models, and that noise not included in the training data can lead to inaccurate predictions.
The results of the AnDi challenge were also exploited as a starting point by Manzo to develop a method based on Extreme Learning Machine (ELM) applied to a set of engineered features (AnDi-ELM [
129]). ELM is an algorithm for single hidden layer feedforward neural network with a much faster learning speed than other ML methods [
130,
131]. The features used in AnDi-ELM are based on estimators from classical statistics, i.e., the scaling exponent of the moments of the displacement distribution, the time-correlation of displacements, the cumulative sum of squared displacements. With minor modifications to the entry features, the method is able to cope with the three tasks of the challenge, although it performed better on the classification task than on the regression with anomalous exponent estimation. The overall performance was considered satisfactory, especially as the method allowed for reduced training time, low computational cost and undemanding implementation compared to other ML approaches. Thus, the author suggested that it may be particularly useful for preliminary screening prior to further and more time-consuming evaluations.
The AnDi challenge stimulated the development of another ML-based method for characterizing anomalous diffusion, called WADNet and published by Li et al. [
132]. This is a deep neural network based on a modified version of WaveNet (the latter being a deep neural network developed for the generation of raw audio) combined with Long Short-Term Memory (LSTM) networks, a particular type of recurrent neural network used specifically for the classification of sequential data thanks to its ability to capture historical information. The method was applied to the AnDi challenge dataset, and proved to outperform the best methods found in the original AnDi challenge for the first and the second tasks (inference of the anomalous exponent and classification of the diffusion model). The main architectures of WADNet are almost the same for the different tasks, with modifications concerning the input and output dimensions of the network.
Verdier et al. proposed the use of graph neural networks, a class of deep learning methods that use descriptions of data by graphs [
133]. In their approach, a vector of features is associated with each position in the trajectory and a graph is associated with each trajectory. Training is performed on simulated data. The strategy was tested on the anomalous motion modes proposed in the AnDi challenge and it showed the ability to infer the motion model and the anomalous exponent.
A new competition has just been launched from the same group at the beginning of 2024: The 2nd Anomalous Diffusion Challenge, which aims to compare methods for detecting motion changes in single-particle motion [
134]. It proposes an analysis of both raw videos and trajectories; the aim is to identify transient changes of the generalized diffusion coefficient and/or anomalous exponent, transient interactions, transient confinement, transient immobilization. The challenge is organized into two tasks. The first task is to infer the simulated model, the number of states, the time spent in each state, and for each state, the mean and standard deviation of the distribution of the generalized diffusion coefficients and of the anomalous diffusion exponent. The second task is to find the change points for each trajectory and then, for each segment, to find the type of motion, the generalized diffusion coefficient and the anomalous diffusion exponent. The competition is still open [
135].
6. Discussion and Conclusions
We reviewed the different analysis methods available today for the analysis of trajectories obtained by SPT, along with the different advantages, potentials and open challenges for each of them.
Traditional MSD-based analysis remains a well-established and valid tool, important for historical reasons and easy to apply and interpret. Its use in SPT allowed obtaining important insights into molecular dynamics in different types of applications. However, as we have discussed, its application requires careful consideration of some aspects that have emerged over time and that have not always been taken into account, particularly in relation to the impact of experimental uncertainties. Several studies have highlighted the limitations of analyses based on MSD, related to the presence of heterogeneities within tracks and to data consisting of few, noisy and short trajectories. Furthermore, conventional MSD analysis often fails to characterize anomalous motion types because different physical mechanisms can give rise to the same scaling behavior, making MSD incapable of distinguishing the different underlying physical diffusion processes.
Many alternative approaches have been developed for the analysis of SPT trajectories. Classical statistics have been based on parameters other than displacements, such as angular or velocities distributions, mean-maximum travelled distances, passage times.
A class of methods has been introduced for the analysis of transient behaviors that are masked in MSD analysis, e.g., approaches based on the detection of transient confinement or analysis within sliding windows. Indeed, amongst the important objectives in SPT trajectory analysis there are segmentation and classification, as trajectories typically show heterogeneous motions, reflecting the heterogeneity of the environment, the presence of interactions or other processes. Hidden Markov Models have been exploited quite widely for this task; the state is usually described by a diffusion coefficient and transitions between states with different diffusion coefficients are investigated. A challenge in this area concerns the transitions between states with motions other than pure Brownian, for which a description of states based simply on a diffusion coefficient is not sufficient. The a priori selection of a fixed number of states has also been a limitation of HMM models in some cases, and this has been addressed and overcome in some of the studies we discussed above.
Finally, we described methods based on machine learning. Classification is one of the main tasks nowadays approached with this kind of methods. Therefore, in recent years, many algorithms based on machine learning have been applied to trajectory classification. They use different strategies, from more traditional ML, such as decision trees, to deep learning. Most of these methods are based on a set of track features to be used as input of the ML algorithm that classify the tracks. Different features have been exploited by the different methods: they can be defined in the development of the model, can be expanded by the user in some cases or can be automatically extracted in the case of deep learning. Classification is performed amongst different motion types or in a feature space. Identifying the best set of features is a critical issue and can greatly influence the results. Typically, features consisting of variables calculated with MSD and classical statistics have been used. Also in the field of machine learning, several challenges have been reported, such as lack of interpretability, high time and computational costs for training, difficulties in transfer learning due to the needed or possible training on simulated data. We also reported some freely available software for analyses based on machine learning, exploitable even by users without programming experience.
Despite the numerous available analysis methods, the characterization of anomalous diffusion and track segmentation are active fields, as we have highlighted in the discussion of recently launched competitions. Challenges such as AnDi are useful to compare available methods, to stimulate the development of new ones and to establish common datasets.
In conclusion, the analysis of SPT trajectories encompasses a wide range of methods that are constantly evolving. The traditional MSD is still used and useful and no other one is as widespread and established; however, alternative methods can now help to overcome its limitations. A successful strategy may use a combined approach that takes advantage of more methods, such as a combination of classical statistics and machine learning. This review can be a useful guide to identify methods suitable for the situation of interest, and an inspiration for new ideas and improvements.