3.1. Seismic data
In this research, we gathered comprehensive seismic data to construct a high-resolution aftershock catalog of the 2014 Ludian earthquake. The dataset comprises continuous waveforms and manually picked phase arrivals from the Sichuan Seismic Network, the Yunnan Seismic Network, and the Qiaojia campaign seismic station network.
The first step in data handling was a rigorous assessment of data quality at all seismic stations: we evaluated the continuity of the waveform data and excluded stations with consistently poor records. We then selected 12 campaign stations and 12 regional permanent stations to locate the Ludian aftershocks, balancing data quality against geographic coverage to maximize the effectiveness of aftershock location. The observation period for these 24 stations spanned August 1, 2014, to December 31, 2015, with data continuity exceeding 90%. To verify the consistency of instrument responses, particularly at the campaign stations, we compared recordings of surface waves from distant global earthquakes with magnitudes greater than 7.0: for each such event, the surface-wave recordings at different stations were examined against one another. This check confirmed that the instrument responses were consistent across stations, which was essential for accurate magnitude determination.
The construction of the seismic AI training dataset was a crucial step in the development of a reliable and effective deep learning model for seismic event detection and phase picking. Our dataset was structured into three distinct sets: seismic event samples (positive), noise samples (negative), and seismic phase-picking samples. The Convolutional Neural Network (CNN) model was tailored to learn from the event and noise samples, while the Recurrent Neural Network (RNN) model was specifically trained using the seismic phase-picking samples.
For the generation of phase-picking samples, 55 regional permanent stations from the Sichuan Seismic Network were selected to build the training dataset for the RNN model. These records, comprising continuous waveforms and P & S phase data, were collected from January 1, 2008, to December 31, 2012. The P & S arrival times were verified against the theoretical travel-time versus epicentral distance relationship. Earthquake events were extracted from the 24-hour continuous waveforms based on the P & S arrival times and represented as 30-second waveform segments, with the P & S arrivals annotated on each segment. To ensure data quality, the signal-to-noise ratio (SNR) was calculated for all samples, and samples with exceptionally high or low SNR were excluded. This process yielded 121507 event samples with manual phase labels, providing a reliable foundation for the subsequent deep learning training and aftershock cataloging.
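As an illustration of this segment-extraction and SNR-screening step, the following sketch uses ObsPy to cut 30-second windows around manually picked arrivals and to compute a simple amplitude-ratio SNR; the file name, pick times, and SNR thresholds are hypothetical, and the exact screening criteria used in this study may differ.

```python
# Sketch of event-segment extraction and SNR screening (illustrative only;
# file names, pick times, and SNR thresholds are assumptions).
import numpy as np
from obspy import read, UTCDateTime

def cut_event_segment(day_stream, p_time, s_time, length=30.0, pre_p=5.0):
    """Cut a 30-s, three-component segment around a manually picked P arrival."""
    t0 = UTCDateTime(p_time) - pre_p
    seg = day_stream.slice(t0, t0 + length)
    # Annotate the relative P and S arrival times within the segment
    labels = {"tp": UTCDateTime(p_time) - t0, "ts": UTCDateTime(s_time) - t0}
    return seg, labels

def snr(trace, tp_rel, noise_win=5.0, signal_win=5.0):
    """Simple amplitude-ratio SNR around the P arrival."""
    sr = trace.stats.sampling_rate
    ip = int(tp_rel * sr)
    noise = trace.data[max(ip - int(noise_win * sr), 0):ip]
    signal = trace.data[ip:ip + int(signal_win * sr)]
    return np.std(signal) / (np.std(noise) + 1e-12)

# Example usage: keep only samples within an assumed SNR range
st = read("YN.QIJ.2014-08-04.mseed")          # hypothetical daily file
seg, labels = cut_event_segment(st, "2014-08-04T10:12:03", "2014-08-04T10:12:07")
if 2.0 < snr(seg.select(component="Z")[0], labels["tp"]) < 200.0:
    pass  # accept the sample for the training set
```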
For the generation of positive (seismic event) and negative (noise) samples, we employed a multi-faceted approach integrating the STA/LTA algorithm, Kurtosis algorithm [
23], and a seismic association algorithm embedded within the PALM framework. This approach was applied to waveform data from the Ludian region to extract event and noise samples efficiently. P and S arrival pairs for seismic events were detected using both the STA/LTA picker and the Kurtosis picker; once detected, the association algorithm linked the phase arrivals across all stations to form distinct seismic events, which constituted the event sample set. Noise samples were randomly extracted from the 24-hour continuous waveform recordings: a 30-second window was deemed a noise sample if it contained no P or S phase arrivals. This procedure ensured that diverse noise characteristics were included in the training dataset.
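A minimal sketch of this picking and noise-extraction logic is given below, using the classical STA/LTA trigger from ObsPy and a sliding-window kurtosis; the trigger thresholds, window lengths, and file name are illustrative assumptions rather than the parameters used in the PALM workflow.

```python
# Sketch of candidate picking with STA/LTA plus a kurtosis characteristic
# function, and of random noise-window extraction (thresholds are assumed).
import numpy as np
from scipy.stats import kurtosis
from obspy import read
from obspy.signal.trigger import classic_sta_lta, trigger_onset

st = read("day_file.mseed")                        # hypothetical daily file
st.filter("bandpass", freqmin=1.0, freqmax=20.0)
tr = st.select(component="Z")[0]
sr = tr.stats.sampling_rate

# STA/LTA candidate onsets (1-s STA, 10-s LTA, and trigger thresholds are assumed)
cft = classic_sta_lta(tr.data, int(1.0 * sr), int(10.0 * sr))
onsets = trigger_onset(cft, 5.0, 1.0)              # rows of [on, off] sample indices

# Sliding-window kurtosis: an impulsive arrival produces a sharp kurtosis
# increase, which can be used to refine an STA/LTA onset time
def kurtosis_series(data, win, step):
    return np.array([kurtosis(data[i:i + win])
                     for i in range(0, len(data) - win, step)])

kt = kurtosis_series(tr.data[:int(600 * sr)], win=int(3.0 * sr), step=int(0.5 * sr))

# Random 30-s windows containing no triggered onset are kept as noise samples
rng = np.random.default_rng(0)
nwin = int(30 * sr)
def is_noise(start):
    return not any(start <= on <= start + nwin for on, _ in onsets)
noise_starts = [int(i) for i in rng.integers(0, len(tr.data) - nwin, 200) if is_noise(i)]
```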
After assembling all sample sets, we analyzed several attributes of the dataset, including the distributions of signal-to-noise ratio, epicentral distance, and azimuth, and the relationship between travel time and epicentral distance (Figure 2). Such an analysis is essential for understanding the dataset's diversity and for ensuring that the deep learning models are trained on a representative and comprehensive sample of seismic data. Figure 2 illustrates the range and variability of these attributes, offering insight into the data's complexity and the challenges it poses for AI-based seismic analysis. This preparation and analysis of the training dataset lay the groundwork for training deep learning models that can accurately discriminate seismic events from noise and reliably identify seismic phases.
3.2. Methods
The comprehensive AI detection workflow is depicted in Figure 3. Unlike the procedural framework of Zhou et al. (2021)[24], in which the sample dataset was obtained by the PAL-picker, the seismic phase data used to train the RNN model in this study were manually picked by experts. The AI model integrates a hybrid CNN & RNN structure, an approach previously developed by Zhou et al. (2019)[4]. In this combined model, the CNN consists of eight convolutional layers with Rectified Linear Unit (ReLU) activation functions, max-pooling layers, and fully connected layers; the loss function is based on the L1 norm and is minimized with the Adam optimizer during backpropagation. The RNN comprises two bidirectional Gated Recurrent Unit (GRU) layers, which process the data both forward and backward in time, so that each layer's current state depends on both the input at the present time step and the hidden state from the preceding time step.
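The following PyTorch sketch illustrates the kind of architecture described above: an eight-layer convolutional classifier with ReLU activations, max pooling, and fully connected layers, and a two-layer bidirectional GRU picker, trained with an L1 loss and the Adam optimizer. The layer widths, kernel sizes, and feature dimensions are assumptions, not the exact configuration of the model used here.

```python
# Minimal sketch of the combined detector/picker architecture (channel counts,
# kernel sizes, and time-step features are assumed placeholder values).
import torch
import torch.nn as nn

class EventCNN(nn.Module):
    """Event/noise classifier: 8 conv layers + ReLU + max pooling + FC layers."""
    def __init__(self, n_ch=3):
        super().__init__()
        layers, c_in = [], n_ch
        for c_out in (8, 8, 16, 16, 32, 32, 64, 64):      # eight convolutional layers
            layers += [nn.Conv1d(c_in, c_out, kernel_size=3, padding=1),
                       nn.ReLU(),
                       nn.MaxPool1d(2)]
            c_in = c_out
        self.features = nn.Sequential(*layers)
        self.classifier = nn.Sequential(nn.Flatten(),
                                        nn.LazyLinear(128), nn.ReLU(),
                                        nn.Linear(128, 2))  # event vs. noise
    def forward(self, x):                                   # x: (batch, 3, nsamp)
        return self.classifier(self.features(x))

class PhaseRNN(nn.Module):
    """Phase picker: two bidirectional GRU layers over discretized time steps."""
    def __init__(self, n_feat=64, hidden=64):
        super().__init__()
        self.gru = nn.GRU(n_feat, hidden, num_layers=2,
                          batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, 3)                 # P / S / none per step
    def forward(self, x):                                    # x: (batch, steps, feat)
        h, _ = self.gru(x)
        return self.out(h)

# Training setup as described in the text: L1 loss and the Adam optimizer
cnn = EventCNN()
loss_fn = nn.L1Loss()
optimizer = torch.optim.Adam(cnn.parameters(), lr=0.001)
```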
Phase association incorporates clustering analysis in both the time and space domains[18]. Temporal association is achieved by searching for clusters of earthquake occurrence times, and spatial association by grid searching for the hypocenter position with the minimum travel-time residual. The association procedure requires that the same seismic signal be detected by at least four stations, which reduces the likelihood of misidentified signals. Magnitude is estimated as a body-wave magnitude computed from the S-wave amplitude. Earthquake location uses Hypoinverse (Klein, 2002)[25] for absolute location and HypoDD (Waldhauser, 2000)[26, 27] for relative relocation.
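To make the association logic concrete, the sketch below clusters candidate picks in time, grid-searches a trial hypocenter for the minimum travel-time residual, and accepts an event only if at least four stations fit; the velocity, grid, and thresholds are placeholder values, and the actual PALM association algorithm is more elaborate.

```python
# Illustrative time/space association: cluster P picks in time, then grid-search
# a trial hypocenter and keep the event only if >= 4 stations fit within the
# residual threshold. All numerical values are assumed placeholders.
import numpy as np

VP = 6.0  # assumed average P-wave velocity (km/s)

def associate(picks, stations, grid, ot_window=2.0, min_sta=4, res_max=2.0):
    """picks: list of (station, p_arrival_time_s); stations: dict name -> (x, y) in km;
    grid: iterable of trial (x, y, z) hypocenters in km."""
    if not picks:
        return []
    # 1) Temporal association: group P picks whose arrival times fall within
    #    ot_window of the previous pick in the sorted sequence.
    picks = sorted(picks, key=lambda p: p[1])
    clusters, current = [], [picks[0]]
    for sta, t in picks[1:]:
        if t - current[-1][1] <= ot_window:
            current.append((sta, t))
        else:
            clusters.append(current)
            current = [(sta, t)]
    clusters.append(current)

    events = []
    for cl in clusters:
        if len(cl) < min_sta:              # require at least four stations
            continue
        # 2) Spatial association: grid search for the minimum RMS travel-time residual
        best = None
        for x, y, z in grid:
            tt = {s: np.sqrt((x - sx) ** 2 + (y - sy) ** 2 + z ** 2) / VP
                  for s, (sx, sy) in stations.items()}
            ot = np.median([t - tt[s] for s, t in cl])
            rms = np.sqrt(np.mean([(t - ot - tt[s]) ** 2 for s, t in cl]))
            if best is None or rms < best[0]:
                best = (rms, (x, y, z), ot)
        if best[0] <= res_max:
            events.append({"hypocenter": best[1], "origin_time": best[2],
                           "n_sta": len(cl), "rms": best[0]})
    return events
```

In the actual workflow S arrivals are also used and the origin-time clustering is more sophisticated, but the acceptance criteria sketched here, at least four stations and a bounded travel-time residual, follow the description above.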
3.3. AI model training
To improve the performance of the AI model under the specific tectonic and observation conditions, we rapidly constructed a sample dataset for the local region and retrained the AI model. The methodology for curating the training dataset is described in the preceding section. Before the samples were fed into the AI model, data augmentation was applied to the original dataset to enhance its robustness: the sampling window was shifted in time and varying levels of white noise were added. Specifically, with the P-wave arrival as the temporal anchor, each sample was shifted five times at random within a 15-second interval, and white noise ranging from 0 to 40% was added to each shifted copy.
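The sketch below illustrates this augmentation scheme, generating five randomly shifted copies of each 30-second window anchored on the P arrival and adding 0-40% white noise; the array layout and the bookkeeping of the relative P index are assumptions.

```python
# Sketch of the augmentation described above: five random window shifts within
# a 15-s interval around the P arrival, plus 0-40% white noise (relative to the
# segment's standard deviation, an assumption of this sketch).
import numpy as np

def augment(day_data, p_idx, sr, n_copies=5, win_s=30.0, shift_s=15.0,
            noise_max=0.4, rng=np.random.default_rng(0)):
    """day_data: (3, nsamp) array; p_idx: P-arrival sample index; sr: samples/s."""
    win = int(win_s * sr)
    out = []
    for _ in range(n_copies):
        shift = int(rng.uniform(0.0, shift_s) * sr)   # random offset of the window start
        start = max(p_idx - shift, 0)
        seg = day_data[:, start:start + win].copy()
        noise_level = rng.uniform(0.0, noise_max)     # 0-40% white noise
        seg = seg + noise_level * np.std(seg) * rng.standard_normal(seg.shape)
        out.append((seg, p_idx - start))              # keep the new relative P index
    return out
```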
The hyperparameters, notably the learning rate and batch size, are pivotal in the training process, influencing model convergence, the risk of overfitting, and computational efficiency. Too high a learning rate can prevent convergence, while too low a learning rate makes convergence unnecessarily slow. Similarly, too small a batch size may not average out the stochastic noise in the gradient estimates, whereas too large a batch size prolongs each iteration. Training was performed on a GeForce RTX 3090 GPU with 24 GB of memory. A learning rate of 0.001 was used for both the CNN and RNN, with a batch size of 512. The dataset was divided into training, validation, and test sets in a 7:2:1 ratio.
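A minimal sketch of this training configuration (7:2:1 split, batch size 512, learning rate 0.001) is shown below, with placeholder tensors standing in for the real samples.

```python
# Sketch of the dataset split and loaders; the tensors are synthetic stand-ins.
import torch
from torch.utils.data import TensorDataset, DataLoader, random_split

# Placeholder samples: 3-component, 30-s windows at an assumed 100 Hz sampling rate
X = torch.randn(1000, 3, 3000)
y = torch.randint(0, 2, (1000,))
dataset = TensorDataset(X, y)

n = len(dataset)
n_train, n_val = int(0.7 * n), int(0.2 * n)            # 7:2:1 split
train_set, val_set, test_set = random_split(
    dataset, [n_train, n_val, n - n_train - n_val],
    generator=torch.Generator().manual_seed(0))

train_loader = DataLoader(train_set, batch_size=512, shuffle=True)   # batch size 512
val_loader = DataLoader(val_set, batch_size=512)
test_loader = DataLoader(test_set, batch_size=512)
# The optimizer uses the stated learning rate, e.g.
# torch.optim.Adam(model.parameters(), lr=0.001)
```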
Detection accuracy was quantified as the proportion of correct predictions on the training dataset during each CNN training epoch, and validation accuracy as the corresponding proportion on the validation dataset. An increase in detection accuracy accompanied by a decrease in validation accuracy typically indicates overfitting. For the RNN, the 30-second sampling window was divided into time steps of 0.5 second; time-step accuracy and the corresponding validation accuracy were defined as the probability of correctly predicting the P & S phases within these increments, and the picking uncertainty as the temporal precision of the P & S arrivals identified in the training and validation datasets. The evolution of detection/validation accuracy, picking uncertainty, and time-step accuracy during training is shown in
Figure 4.
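The sketch below shows how these three quantities could be computed, assuming per-window class predictions for the CNN and per-step (0.5-second) phase labels plus pick times for the RNN; the array layouts are assumptions.

```python
# Sketch of the accuracy and picking-uncertainty metrics described above
# (input layouts are assumed: per-window labels, per-step labels, pick times in s).
import numpy as np

def detection_accuracy(pred_labels, true_labels):
    """Fraction of windows classified correctly (event vs. noise)."""
    return np.mean(np.asarray(pred_labels) == np.asarray(true_labels))

def time_step_accuracy(pred_steps, true_steps):
    """Fraction of 0.5-s time steps whose P/S/none label is predicted correctly."""
    return np.mean(np.asarray(pred_steps) == np.asarray(true_steps))

def picking_uncertainty(pred_arrivals, true_arrivals):
    """Mean and standard deviation of the pick-time error in seconds."""
    err = np.asarray(pred_arrivals) - np.asarray(true_arrivals)
    return err.mean(), err.std()
```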
3.4. Earthquake detection, phase picking, association and location
Applying the retrained CNN & RNN models to continuous waveform data, we performed earthquake detection in 30-second windows with a 15-second sliding step, after applying a 1-20 Hz bandpass filter to the 24-hour three-component waveforms. P & S phases were picked concurrently, and surface-wave amplitudes were measured within a window extending from 1 second before the P-wave to 5 seconds after the S-wave. After the P & S phases were obtained for all stations, phase arrival times were clustered using a grid-search travel-time residual threshold of 2.0 seconds and the requirement that at least 4 stations record the same seismic signal. This process yielded initial detections of 3624 seismic events and 25125 P & S phase arrivals. Comparison of the AI-determined arrival times with the manual picks gave an average difference of 0.02 second for P-waves (standard deviation of 0.32 second) and 0.11 second for S-waves (standard deviation of 0.44 second).
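A simplified version of this detection loop is sketched below: the daily three-component record is bandpass filtered to 1-20 Hz and scanned in 30-second windows with a 15-second step. The model call is a placeholder stub and the file name is hypothetical.

```python
# Sketch of the sliding-window detection loop (model call and file are placeholders).
import numpy as np
from obspy import read

def run_cnn_rnn(window):
    """Placeholder stub for the trained CNN (detection) and RNN (P & S picking)."""
    return False, 0.0, 0.0

st = read("station_day.mseed")                        # hypothetical 24-h, 3-component file
st.detrend("demean").filter("bandpass", freqmin=1.0, freqmax=20.0)
sr = st[0].stats.sampling_rate
data = np.stack([tr.data for tr in st[:3]])           # (3, nsamp)

win, step = int(30 * sr), int(15 * sr)                # 30-s window, 15-s sliding step
detections = []
for i0 in range(0, data.shape[1] - win, step):
    is_event, tp_rel, ts_rel = run_cnn_rnn(data[:, i0:i0 + win])
    if is_event:
        detections.append((i0 / sr + tp_rel, i0 / sr + ts_rel))  # P & S times in the day (s)
```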
For absolute earthquake location, the Hypoinverse software was used. As the initial velocity model, we adopted a three-layer average of the model of Fang et al. (2014, Figure 5)[19]. During the iterative inversion, station weights were adjusted according to the root-mean-square travel-time residual and the epicentral distance. A residual cutoff was applied: stations with residuals below 0.3 second received full weight, residuals exceeding three times the cutoff (0.9 second) received zero weight, and intermediate values were weighted by cosine interpolation. A distance cutoff was set at 40 km, with weights tapering over the 40-120 km range.
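The residual weighting described here can be expressed as a simple cosine taper, as in the sketch below; this mirrors the textual description rather than the Hypoinverse implementation itself.

```python
# Residual weighting as described in the text: full weight below 0.3 s, zero
# weight above 0.9 s (three times the cutoff), cosine taper in between.
import numpy as np

def residual_weight(residual, cutoff=0.3, factor=3.0):
    r = abs(residual)
    if r <= cutoff:
        return 1.0
    if r >= factor * cutoff:
        return 0.0
    # cosine interpolation between full and zero weight
    frac = (r - cutoff) / (factor * cutoff - cutoff)
    return 0.5 * (1.0 + np.cos(np.pi * frac))

# Example: weights for residuals of 0.2 s, 0.5 s, and 1.0 s
print([round(residual_weight(r), 2) for r in (0.2, 0.5, 1.0)])  # [1.0, 0.75, 0.0]
```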
Following the absolute location, relative relocation was performed with the HypoDD algorithm. The double-difference method starts from initial relative locations derived from catalog travel-time differences and refines them with cross-correlation differential times, thereby improving the precision of the relative locations. The parameters included a maximum station-event distance of 150 kilometers, a maximum event-pair separation of 6 kilometers, and a minimum of 8 phases per event pair. After two cycles and eight iterations, the inversion parameters, including the travel-time residuals and the horizontal and vertical errors, had stabilized. For the cross-correlation differential-time (cc) calculations, the maximum event-pair separation was set to 4 kilometers and the maximum epicentral distance for stations to 120 kilometers. Correlation windows extended from 0.5 second before to 3.5 seconds after the P arrival and from 0.3 second before to 4.5 seconds after the S arrival, with a 2-15 Hz bandpass filter. The velocity model was the same as that used by Fang et al. (2014,
Figure 5). Finally, we obtained high-precision location results for 3286 events.
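As an illustration of the cross-correlation step that supplies the differential travel times, the sketch below correlates the P windows of two events at a common station using the stated window and filter settings; the file handling, picks, and maximum allowed shift are assumptions, and the S-wave case follows analogously with its own window.

```python
# Sketch of waveform cross-correlation for differential travel times:
# 2-15 Hz bandpass, P window from 0.5 s before to 3.5 s after the pick.
from obspy import UTCDateTime
from obspy.signal.cross_correlation import correlate, xcorr_max

def cc_delay(tr1, pick1, tr2, pick2, pre=0.5, post=3.5, max_shift_s=1.0):
    """Return the cross-correlation time shift (s) and coefficient between the
    P windows of two events recorded at the same station."""
    for tr in (tr1, tr2):
        tr.detrend("demean").filter("bandpass", freqmin=2.0, freqmax=15.0)
    w1 = tr1.slice(UTCDateTime(pick1) - pre, UTCDateTime(pick1) + post)
    w2 = tr2.slice(UTCDateTime(pick2) - pre, UTCDateTime(pick2) + post)
    shift = int(max_shift_s * tr1.stats.sampling_rate)   # assumed search range
    cc = correlate(w1.data, w2.data, shift)
    lag_samples, coeff = xcorr_max(cc)
    return lag_samples / tr1.stats.sampling_rate, coeff
```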
The frequency-magnitude distribution (FMD) of the AI catalog from this study was compared with those of the catalogs compiled by Fang et al. (2014) and the China Earthquake Network Center (CENC) (Figure 6). This comparison indicates that the AI-generated catalog has a superior detection capability relative to the Fang et al. (2014) and CENC catalogs.
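For reference, a frequency-magnitude comparison of this kind can be computed as in the sketch below; the magnitudes are synthetic placeholders, not the catalogs compared in Figure 6.

```python
# Sketch of a cumulative frequency-magnitude comparison for two catalogs
# (magnitudes are synthetic stand-ins, not the real catalogs).
import numpy as np

def fmd(mags, dm=0.1):
    """Cumulative number of events with magnitude >= M on a 0.1-unit grid."""
    mags = np.asarray(mags)
    bins = np.arange(mags.min(), mags.max() + dm, dm)
    return bins, np.array([(mags >= m).sum() for m in bins])

rng = np.random.default_rng(0)
m_ai = rng.exponential(0.45, 3286)            # synthetic, Gutenberg-Richter-like
m_ref = rng.exponential(0.45, 1200) + 0.5     # fewer, larger-magnitude events
for name, mags in (("AI catalog", m_ai), ("reference catalog", m_ref)):
    bins, counts = fmd(mags)
    print(f"{name}: {len(mags)} events, minimum magnitude {bins[0]:.1f}")
```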