Artificial Intelligence in Environmental Monitoring: Application of Artificial Neural Networks and Machine Learning for Pollution Prevention and Toxicity Measurements

Preprint

Review

Katarzyna Szramowiat-Sala^*

This version is not peer-reviewed

Submitted:

17 July 2023

Posted:

19 July 2023

Abstract

Environmental monitoring systems play a crucial role in assessing environmental quality, detecting exceedances of limits, and predicting potential ecological episodes. These systems rely on the measurement of various variables at specific locations and time intervals over an extended period. The concept of environmental monitoring encompasses the assessment of health and safety issues for public and environmental health purposes. Pollution of the atmosphere and water, climate change, and natural disasters are among the consequences of continuous industrial and municipal development and human interference in natural ecosystems. To address these challenges and to protect human lives and the environment, with particular concern for mitigating the ecological effects of industrial development, advanced technical solutions have been developed, including technologies associated with artificial intelligence (artificial neural networks, ANNs, and machine learning, ML). These technologies offer powerful tools for analysing the vast amount of data collected by monitoring systems and extracting valuable insights. By applying ANNs and machine learning algorithms, environmental monitoring systems can effectively process and interpret the measured variables to assess environmental quality. Despite challenges and limitations, such as data quality and interpretability of AI models, ongoing research and interdisciplinary collaboration are paving the way for the successful implementation of AI in environmental monitoring, ultimately supporting informed decision-making and sustainable resource management. While several review papers have explored the theory of artificial intelligence (AI), here I aim to review the application of ANNs and ML in environmental aspects, specifically in automotive and industrial emissions toxicity measurements, as well as atmospheric pollution prevention. By examining the potential of AI in these domains, the paper contributes to understanding the role of advanced technologies in environmental monitoring and protection.

Keywords:

Subject: Environmental and Earth Sciences - Pollution

1. Introduction

Environmental protection issues are attracting growing attention from governments, scientists, and society. Atmospheric and water pollution, climate change, and natural disasters are effects of continuous industrial and municipal development and of human interference in natural ecosystems. Progress in science and technology is observed almost everywhere in the world, and most new advanced technical solutions now aim to protect human lives and the environment and to address the ecological effects of industrial development.

Environmental monitoring systems are set up to define procedures for tracking environmental quality and for alerting when limits are exceeded or ecological episodes are expected. From a technical point of view, monitoring systems are based on measurements of a series of variables repeated at one or more locations under prearranged conditions in space and time over a long period [1]. The concept of environmental monitoring is based on the assessment of health and safety problems for public and environmental health purposes. The entire system consists of: (1) inspection and corrective action, i.e., measurements and data collection; (2) planning, i.e., setting targets and objectives for environmental protection; (3) implementation and operation, i.e., setting up procedures in response to emergencies and alerts; (4) management, i.e., an expert advisory board with knowledge of the environment and of processes in ecosystems (Figure 1).

Artificial intelligence has already found application in almost every area of technology. At the same time, it is continuously being improved with the aim of replacing humans in the near future in activities such as cooking, telemarketing, or driving a car. The first mentions of the application of neural networks, the best-known representative of AI, appeared in the scientific literature in the middle of the 20th century. However, neural networks ran into many technical limitations in the form and manner in which they were applied at that time. Interest in AI rose again in the 1980s, when a novel concept of non-linearity between input and output signals was introduced [2]. This model, commonly known as the Hopfield model (HM), was introduced by Hopfield in 1982 [3]. The HM consists of specific states, called neurones, which are fully connected by synaptic weights [4] and which in some way reflect the neurobiological theory of the processing of neural signals in the brain. This new approach created a wave of interest, not only among AI researchers but also in most other scientific areas, such as economics or environmental protection.

Many review papers on the theory of artificial intelligence can be found in the literature, e.g. [2,4,5,6,7,8,9,10]. This paper instead reviews the potential of artificial intelligence in selected environmental aspects. I focus on artificial neural networks (ANN) and machine learning (ML) and on the application of their different forms in systems for automotive and industrial emissions toxicity measurements and for atmospheric pollution prevention. The aim of this paper is to collect and combine procedures described by other authors and to extract the most important elements of the modelling process. This will make it possible to define a consistent path for applying AI-augmented models to other problem-solving mechanisms.

2. Basics of Artificial Neural Networks and Machine Learning

2.1. Artificial Neural Networks

As mentioned above, artificial neural networks were designed to mimic the function of a biological neurone, whose task is to process signals from the whole organism, received through the dendrites (Figure 2) with a specific synaptic strength and passed on through the axons to the brain.

The output signal is a function of the input signals from several neurones, which form nets or layers. Each input signal is characterized by a weight, which is a real number (Figure 3). The greater the magnitude of the weight, the stronger the effect of that input signal. The weight values of the neurones in a network are significant because they determine the computational properties of the network, and network training is achieved by modifying these weights appropriately [2].
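The weighted-input idea described above can be illustrated with a minimal sketch of a single artificial neurone. The logistic sigmoid activation, the bias term, and the function name `neuron_output` are assumptions made for illustration; the text itself does not fix a particular activation function.

```python
import math


def neuron_output(inputs, weights, bias=0.0):
    """Weighted sum of the inputs passed through a logistic sigmoid (assumed activation)."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))


signals = [0.5, 1.0, -1.0]
# Same input signals, but the first input carries a much larger weight in the second call.
small = neuron_output(signals, [0.1, 0.1, 0.1])
large = neuron_output(signals, [2.0, 0.1, 0.1])
print(small, large)
```

Increasing the weight of the first input raises the output, which is the sense in which a larger weight gives that signal a stronger effect. Training a network amounts to adjusting such weights.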

Hopfield networks are the simplest structures. They are fully connected neural networks: every neurone is connected to every other, so the network resembles a spin-glass model and evolves towards a local energy minimum. Feedforward neural networks (FFNN) were the first layered neural networks. In their basic form they consist of one input layer, one hidden layer, and one output layer, and the signal moves only forward. Multilayered backpropagation neural networks (Figure 4) were developed as a further step. Networks of this kind are trained by propagating the signal forward and the error backward.
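To make the notion of a fully connected network settling into a local energy minimum more concrete, the following is a minimal sketch of a Hopfield network. It assumes bipolar states (+1/-1), Hebbian weights, and asynchronous updates; these are standard textbook choices rather than details given in this paper, and all function names are hypothetical.

```python
def train_hopfield(patterns):
    """Hebbian weights for bipolar (+1/-1) patterns; no self-connections."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j] / n
    return w


def energy(w, state):
    """Hopfield energy E = -1/2 * sum_ij w_ij * s_i * s_j."""
    n = len(state)
    return -0.5 * sum(w[i][j] * state[i] * state[j]
                      for i in range(n) for j in range(n))


def recall(w, state, sweeps=5):
    """Asynchronous updates: each neurone takes the sign of its weighted input."""
    s = list(state)
    for _ in range(sweeps):
        for i in range(len(s)):
            h = sum(w[i][j] * s[j] for j in range(len(s)))
            s[i] = 1 if h >= 0 else -1
    return s


stored = [1, -1, 1, -1, 1, -1]
w = train_hopfield([stored])
noisy = [1, -1, 1, -1, 1, 1]      # last element flipped
restored = recall(w, noisy)
print(restored, energy(w, noisy), energy(w, restored))
```

In this small example the corrupted pattern is pulled back to the stored one, and the energy of the restored state is lower than that of the noisy state.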

Typical multilayered ANNs consist of several layers of neurones, where the neurones in each layer are fully connected to the neurones of the next layer. The output signal of the first layer is the input signal of the second layer, and so on. ANNs of this type have good generalization and classification abilities, and their learning performance is therefore more effective as well. The quality of classification and prediction in the networks is determined by the learning algorithms. Importantly, a complex network topology does not guarantee optimal final performance: the more complex the architecture, the higher the costs in computational time and energy [4]. Thus, it is recommended to replace fully connected (FC) structures of ANNs with sparsely connected graphs such as small-world (SW), scale-free (SF), and random networks.

Because ANNs do not require detailed knowledge of the processes occurring in the atmosphere or other environmental compartments, they are a suitable alternative to commonly used computational models, such as real-time air quality forecasting models (RT-AQFs) [11,12]. Models of that kind are sensitive to many factors, such as the scale and quality of the parameters involved, are computationally expensive, and depend on large databases of input parameters, some of which may not be available [5,13]. However, Russo et al., 2013 [14] advise implementing models with an optimal amount of input data: minimising the data in the input layer improves the predictive power of ANNs. On the other hand, according to Antanasijević et al., 2013 [15], the wide availability of the input parameters used in ANNs can overcome the lack of data and basic environmental indicators in many countries, which can otherwise prevent or seriously hinder the forecasting of particulate matter (PM) emissions.

Prediction by a well-trained ANN is usually much faster than with conventional simulation programs or mathematical models, as no lengthy iterative calculations are required to solve differential equations numerically. However, the selection of an appropriate neural network topology is important for both model accuracy and model simplicity. Gürgen et al., 2018 [16] pointed out that one problem to consider in the training procedure is overfitting, in which the ANN memorises the training examples and does not learn to generalise to unseen data. Early stopping is one practical solution to this problem.
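Early stopping can be sketched as a simple rule applied to the validation loss recorded after each training epoch: training is halted once the loss has not improved for a given number of epochs, and the weights from the best epoch are kept. The `patience` parameter, the function name, and the loss values below are illustrative assumptions, not taken from the cited work.

```python
def early_stopping(val_losses, patience=3):
    """Return (best_epoch, stop_epoch) for a sequence of validation losses.

    Training stops at the first epoch that lies `patience` epochs past the
    best one without any improvement of the validation loss.
    """
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return best_epoch, epoch
    return best_epoch, len(val_losses) - 1


# Hypothetical validation losses: they fall, then rise as the network starts to overfit.
losses = [0.90, 0.60, 0.45, 0.40, 0.42, 0.47, 0.55, 0.70]
best, stopped = early_stopping(losses, patience=3)
print(best, stopped)
```

In practice the loss would come from a held-out validation set that is not used to update the weights.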

Cabaneros et al., 2019 [5] noted that the literature offers no clear instruction on how to build neural networks, what kind of ANN model should be applied, or what topology and learning algorithm should be designed for specific goals such as chemical or ecological data. This limits the application of ANNs, even though software for neural network design is continuously being improved and developed. Jakeman et al., 2007 [17] introduced general guidelines to help with the development of environmental models. Adoption of these guidelines by modellers, through fuller execution and reporting of the steps, benefits the model-building community and decision-making about model recommendations. Finally, Cabaneros et al., 2019 [5] outlined procedures for designing artificial neural architectures for the prediction and forecasting of pollutant concentrations. They point out that each step of the overall model development process should be justified explicitly. Moreover, the development of artificial neural networks is problem-specific, and no single set of instructions can be given. However, the general protocols summarised in Figure 5 may be useful to modellers who work on other environmental aspects.

2.2. Machine Learning

Machine learning is a set of algorithms, supported by dedicated systems, for processing data of different kinds. With machine learning methods, a computer can be trained to learn from data. ML has been explored intensively in recent decades, and three main technology trends have fuelled this development [9]:

Sensing and Internet of Things (IoT) technologies have advanced rapidly, so larger amounts of data are collected;
Access to powerful and affordable computational resources has improved with the design of machine-learning-oriented chips such as GPUs (Graphics Processing Units) and TPUs (Tensor Processing Units);
Advanced machine learning algorithms have been developed and validated.

According to Crisci et al., 2012 [8], ML is a classical but widely studied statistical approach to data science. It covers the standard tasks of data processing such as classification, regression, clustering, and density estimation. What makes ML unique, however, are its tools: techniques and strategies built on massive algorithms and computational resources, applied in big data science to large numbers of variables and complex data structures.

Generally speaking, the sample is split into two subsamples: a learning (training) sample, on which the model is fitted, and a test (evaluation) sample, on which it is assessed. The model is fitted on the training sample, and its performance is studied using the evaluation sample. To reduce any bias due to the random choice of the evaluation sample, the loss is averaged over several random splits of the data (Figure 6) [8].
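The split-and-average procedure can be sketched as follows. The example assumes a simple least-squares line as the model and mean squared error as the loss. The synthetic data, the 30% test fraction, the number of splits, and all function names are illustrative assumptions.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def mse(model, xs, ys):
    """Mean squared error of the fitted line on the given sample."""
    a, b = model
    return sum((a + b * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def averaged_test_loss(xs, ys, n_splits=20, test_fraction=0.3, seed=0):
    """Average the test loss over several random train/test splits."""
    rng = random.Random(seed)
    idx = list(range(len(xs)))
    n_test = max(1, int(len(xs) * test_fraction))
    losses = []
    for _ in range(n_splits):
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        model = fit_line([xs[i] for i in train], [ys[i] for i in train])   # fit on training sample
        losses.append(mse(model, [xs[i] for i in test], [ys[i] for i in test]))  # assess on test sample
    return sum(losses) / len(losses)


# Synthetic data: a linear trend with Gaussian noise.
noise = random.Random(1)
xs = [float(i) for i in range(40)]
ys = [2.0 + 0.5 * x + noise.gauss(0.0, 1.0) for x in xs]
loss = averaged_test_loss(xs, ys)
print(loss)
```

Because the test sample is never used for fitting, the averaged loss estimates how well the model generalises rather than how well it reproduces the training data.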

Machine learning offers many computational tools that can be applied to ecological data. Crisci et al., 2012 [8] reviewed different algorithms and their mathematical basis and tested them in their own research. The performances of ML approaches, such as generalized additive models or classification and regression trees (CART), were presented and the results compared. Moreover, some possible extensions of CART were proposed, such as bagging, random forest, boosting, support vector machines, projection pursuit, and nearest neighbours. These extensions were developed mainly because the CART model is unstable with respect to changes in the training sample. The aggregated models created with these extensions are more stable in training and learning. However, which extension to apply depends on the specific situation and on what kind of information is needed. Generally, ML is a fully developed statistical data analysis method that allows for prediction and forecasting of specific extremes based on ecological data of any kind. Machine learning has found application in building as well; however, the model algorithms still need to be improved [7,9].
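The idea behind bagging, one of the CART extensions mentioned above, can be illustrated with a minimal sketch: many trees are fitted to bootstrap resamples of the training sample and their predictions are averaged. For brevity the sketch uses one-split regression trees ("stumps") and synthetic step data; the function names and parameters are hypothetical, and this is not the implementation used by Crisci et al.

```python
import random


def fit_stump(xs, ys):
    """One-split regression tree chosen by minimum squared error.

    Returns (threshold, left_mean, right_mean).
    """
    pairs = sorted(zip(xs, ys))
    best = None
    for k in range(1, len(pairs)):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # cannot split between identical x values
        left = [y for _, y in pairs[:k]]
        right = [y for _, y in pairs[k:]]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, (pairs[k - 1][0] + pairs[k][0]) / 2, ml, mr)
    if best is None:  # all x identical: predict the overall mean everywhere
        m = sum(ys) / len(ys)
        return (float("inf"), m, m)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_bagged(xs, ys, n_models=25, seed=0):
    """Fit one stump per bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        sample = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in sample], [ys[i] for i in sample]))
    return models


def predict_bagged(models, x):
    """Aggregate by averaging the predictions of all stumps."""
    return sum(predict_stump(m, x) for m in models) / len(models)


xs = [float(i) for i in range(20)]
ys = [0.0 if x < 10 else 1.0 for x in xs]
single = fit_stump(xs, ys)
models = fit_bagged(xs, ys)
print(single, predict_bagged(models, 0.0), predict_bagged(models, 19.0))
```

Each bootstrap resample places the split at a slightly different threshold, which reflects the sensitivity of a single tree to the training sample. Averaging over the resamples smooths out that variation.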

3. AI-Augmented Environmental Monitoring Systems vs. Data Security

The utilization of artificial intelligence for automated decision making and predictive analytics, along with the remarkable advancements in sensor technology and robotics, will probably revolutionize the perspectives and responses of individuals, communities, governments, and private entities towards climate and ecological transformations. The pandemic accelerated the trend towards digitization and automation in many areas in order to secure supply chains. For many years, countries around the world have invested heavily in technology associated with artificial intelligence, robotics, and sensors. China, the United States, and France are global pioneers in the development of AI-based technological applications, including in aquaculture, forestry, and agriculture. In the period from 2007 to 2020 they invested over 7 billion USD in the implementation of AI-augmented technology [18]. Smart cities rely on AI-associated technology including traffic management systems, smart policing, lighting control, facial recognition, and smart waste disposal systems. Moreover, agriculture and forestry have turned out to be prominent sectors in the development and deployment of AI-based systems, for instance for the management and control of irrigation systems of farms and plants [18]. Figure 7 shows how an environmental monitoring system works when AI-augmented technology is incorporated. The flowchart provides a structured framework for environmental monitoring, utilizing artificial intelligence techniques to collect, process, analyse, and interpret data. It allows for informed decision-making, proactive actions, and preventive measures to safeguard the environment and address potential risks or challenges identified through the monitoring process. The data processing stage is crucial for extracting meaningful information from the collected environmental data using artificial intelligence techniques. It involves preprocessing the data, selecting an appropriate model, training and optimizing the model, and utilizing it for prediction and inference tasks. This stage lays the foundation for the subsequent analysis and interpretation of the environmental data to support decision-making and management in environmental monitoring systems.
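As one concrete example of the preprocessing step, several of the studies reviewed in this paper standardized their input data to zero mean and unit deviation. A minimal sketch of that transformation is given below. The use of the population standard deviation, the function names, and the sample values are assumptions made for illustration.

```python
import math


def fit_standardizer(values):
    """Return the mean and population standard deviation of a training sample."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0.0:
        raise ValueError("cannot standardize a constant variable")
    return mean, std


def standardize(values, mean, std):
    """Scale values to zero mean and unit deviation using the given statistics."""
    return [(v - mean) / std for v in values]


# Hypothetical training and test measurements of one input variable.
train = [12.0, 18.0, 25.0, 31.0, 44.0]
test = [20.0, 50.0]

mean, std = fit_standardizer(train)      # statistics come from the training sample only
train_z = standardize(train, mean, std)
test_z = standardize(test, mean, std)    # test data reuse the training statistics
print(train_z, test_z)
```

Reusing the training statistics for the test sample keeps information from the test data out of the fitted model.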

According to The Royal Society [19], in the near future every control system will be fully digitized, creating so-called “digital twins” built on big data from AI-augmented analysis of ecological data. This digital infrastructure will make it possible to establish “control loops” across sectors and, as a result, to perform real-data simulation, exploration, optimisation, and risk identification. The trend towards digitization and automation forces innovation in IT solutions because of the increasing amount of data that needs to be stored. Therefore, higher computational capacity and larger virtual storage space are needed. Like many authors before, I identify here a major risk related to data security. In today’s world, personal data protection is a crucial aspect of new technology implementation. Both personal data and ecological data must be secured to prevent their unauthorized release. In the application of artificial intelligence in environmental monitoring systems, several threats related to data security have been identified. These threats include:

Proper interpretation of bias in AI algorithms: AI algorithms are designed to learn from data, but if the data used for training is biased, the algorithms can perpetuate and amplify that bias. This can lead to inaccurate or unfair decision-making in environmental monitoring, affecting the integrity and effectiveness of the systems.
Full automation of systems and substitution of humans: While automation can enhance efficiency, relying solely on machines to replace human involvement in environmental monitoring may lead to the neglect of critical human insights, intuition, and judgment. Over-reliance on machines for cost reduction purposes could potentially overlook important contextual factors and compromise the accuracy and effectiveness of monitoring efforts.
Cyberattacks: As AI systems in environmental monitoring become more interconnected and reliant on networked infrastructure, they become vulnerable to cyberattacks. Malicious actors could exploit security vulnerabilities to manipulate or sabotage data, compromise the integrity of monitoring systems, or gain unauthorized access to sensitive information.
Cascading failures: AI systems are complex and interconnected, which means that a failure or error in one component can potentially propagate and lead to widespread consequences. In environmental monitoring systems, a cascading failure in AI algorithms or infrastructure could compromise the accuracy of data analysis, decision-making processes, and ultimately, the effectiveness of monitoring efforts.
Ethical and legal aspects of data storage: The use of AI in environmental monitoring generates vast amounts of data, including potentially sensitive information. The ethical and legal aspects of storing, managing, and securing this data pose challenges. Ensuring privacy, consent, data ownership, and compliance with relevant regulations and frameworks are critical to maintaining trust and safeguarding the rights of individuals and organizations involved.
Overall, addressing these identified threats is crucial for the responsible and effective deployment of AI in environmental monitoring systems, promoting data security, fairness, accountability, and ethical considerations.

4. Application of Artificial Intelligence in Environmental Data Management

4.1. Atmospheric Pollution

Both neural networks and machine learning have found application in predicting the distribution of pollutants in the atmosphere and forecasting the concentration levels of pollutants such as PM10 or ozone in ambient air. Many different algorithms have been applied by different authors. Table 1 summarizes the AI models and data preprocessing methods applied in several selected papers, as well as the general conclusions that the authors reached while solving specific problems with artificial neural networks or machine learning algorithms. In short, all the reviewed studies concluded that artificial neural networks and machine learning are very promising methods for tracking changes in atmospheric pollution and, thanks to that, for alerting society about possible emergencies. Perez and Reyes, 2006 [20] compared neural networks representing linear and nonlinear models. In their work, a three-layer feedforward neural network was applied as a nonlinear model and a special neural network with no hidden layer as a linear model. Nonlinear models gave better results than linear models in forecasting air pollution by PM10 24 h in advance. Antanasijević et al., 2013 [15] came to similar conclusions. Their ANN model showed very good performance and demonstrated that PM10 emissions can be forecast successfully and accurately up to two years ahead; its predictions were three times better than those obtained from conventional multilinear regression and principal component regression models trained and tested using the same datasets and input variables. The advantages of combining several predictors (the multilayer perceptron, radial basis function, Elman network, and support vector machine) in an ensemble were also presented by Siwek and Osowski, 2012 [21]. An important advantage of the proposed approach is that it does not require very exhaustive information about air pollutants, reaction mechanisms, and meteorological pollutant sources, and that it allows nonlinear relationships between very different predictor variables.

Interesting results were obtained by applying both artificial intelligence and the statistical methods commonly used for identifying PM10 sources. Singh et al., 2013 [22] used principal component analysis (PCA) for PM10 source identification and tree-based ensemble learning models (single decision tree, decision tree forest, decision tree boost) to predict the urban air quality in a city in India. The models successfully predicted the urban ambient air quality and, according to the authors, can be used as effective tools supporting atmosphere management systems. Feng et al., 2015 [23] presented a novel hybrid model that combines air mass trajectory analysis and wavelet transformation to improve the accuracy of an artificial neural network forecasting daily average concentrations of PM2.5 two days in advance. The respective pollutant predictors were used as input to a multilayer perceptron (MLP) type of backpropagation neural network. The significant advantage of this hybrid model was its ability to predict high peaks of PM2.5 concentrations, which are considered critical in air pollution prediction systems.
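A wavelet transformation splits a concentration time series into a smooth component and a detail component, which can then be fed to a forecasting model separately. The sketch below shows one level of the Haar transform, the simplest wavelet. The choice of the Haar wavelet, the function names, and the sample values are assumptions for illustration; the text above does not state which wavelet Feng et al. used.

```python
import math


def haar_step(signal):
    """One level of the Haar wavelet transform: (approximation, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Rebuild the original signal from one level of Haar coefficients."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


# Hypothetical daily PM2.5 concentrations.
series = [35.0, 41.0, 38.0, 90.0, 85.0, 40.0, 36.0, 34.0]
approx, detail = haar_step(series)
rebuilt = haar_inverse(approx, detail)
print(approx, detail)
```

The approximation coefficients follow the slow trend, while large detail coefficients mark abrupt changes such as concentration peaks.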

The paper of Rutherford et al., 2021 [24] is an interesting example of combining AI with advanced instrumental analytical techniques to identify the sources of combustion-generated PM. A hybrid algorithm of excitation-emission matrix (EEM) fluorescence spectroscopy and machine learning was developed. To train this model, the PMF source apportionment technique was applied. The EEM-ML approach was successful in predicting vegetative burning and moderately successful in predicting mobile sources. However, using PMF for model training did not resolve source categories that would likely be valuable on a global scale, such as forest fires, various cookstove and home heating fuels (e.g. biomass, kerosene, LPG, and coal), or diesel versus gasoline exhaust. Using source attribution data that resolved these sources to train an EEM-ML model could allow EEM-ML to apportion these important sources of PM pollution. This was partially achieved by Song et al., 2001 [25], who proposed multivariate calibration based on single-particle mass spectral data to apportion gasoline- and diesel-generated emissions.

The useful properties of artificial intelligence in air quality prediction and in development of cost-effective control strategies were also confirmed by other authors [20,26,27,28,29].

Table 1. The review of computational methods applied in several chosen previous papers to solve specific problems related to atmospheric pollution.

Specific topic of a reviewed paper	Model applied	Compilation with and/or comparison to other numerical model	Data pre-processing method	Activation function / training algorithm	Loss function and error	General conclusions
Forecasting of PM10 hourly concentration [29]	Multiple-layered perceptrons (MLP): 1) MLP_F (full set of inputs: PM10 conc. + meteorological variables) 2) GA-MLP (inputs selected by a genetic algorithm optimisation procedure) 3) MLP_N-METEO (inputs excluding meteorological variables)	No approach	Input data were standardized to zero mean and unit deviation	Training algorithm: quasi-Newton	Sum of squared errors (SSE) 1) MLP_F: R² ~ 0.7 2) GA-MLP: R² ~ 0.6 3) MLP_N-METEO: R² < 0.4	A genetic algorithm optimisation procedure selects the variables which need to be taken as inputs into the NN model. This saves computational time.
Forecasting of PM10 maximum episodes in ambient air [20]	1) 3-layered FFNN (as a non-linear model) 2) NN without any hidden layer (as a linear model)	No approach	No information	A sigmoid activation function	Absolute percent errors: 1) 3.6% 2) 5.4%	The neural model gave better results than the linear model
Forecasting of annual PM10 emissions [15]	General Regression Neural Network (GRNN); inputs were selected using a general algorithm based on a smoothing factor	Compared to conventional models such as multiple linear regression (MLR) and principal components regression (PCR)	No information	Supervised training	The root mean squared error (RMSE), the normalized mean squared error (NMSE), the mean absolute error (MAE), the correlation coefficient (R²), the index of agreement (IA), the fractional bias (FB); R²_GRNN (0.91-0.94); R²_PCR (0.55-0.94)	A smoothing factor was applied as a sensitivity analysis tool (the larger the factor for a given input is, the more important the input is to the model); better forecast performance for artificial neural networks than for classic statistical methods
Forecasting of the daily average concentration of PM10 [21]	MLP, radial basis function network (RBF), Elman recurrent network (EN), support vector machine working in regression mode (SVR)	ARX model	No information	No information	The root mean squared error (RMSE), the mean absolute error (MAE), the correlation coefficient (R²), the index of agreement (IA); R²: 0.52 for ARX; R²: 0.82-0.95 for NN models compiled with wavelet transformation	Due to the complex relation between PM10 concentration and the basic atmospheric parameters influencing the mechanisms of creation and spreading of the pollution, PM10 prediction is a nonlinear problem, and a nonlinear model should be applied to obtain the highest prediction accuracy.
Identifying pollution sources and predicting urban air quality [22]	Three machine learning models: single decision tree, decision tree forest, decision tree boost, support vector machines	PCA	Regression modelling	Bagging and boosting	The root mean squared error (RMSE), the mean absolute error (MAE), the correlation coefficient (R²); R² ~ 0.9 for all learning ensembles	PCA identified vehicular emissions and fuel combustion as the major air pollution sources. All models gave satisfying results and can be used as tools in air quality prediction and management.
Forecasting of PM2.5 [23]	MLP	Air mass trajectory analysis and wavelet transformation	Regression modelling	A sigmoid activation function	MAE, RMSE, IA	The hybrid model of ANN and air mass trajectory analysis and wavelet transformation was applied. The model combined with meteorological forecasted parameters and respective pollutant predictors is considered to be an effective tool to improve the forecasting accuracy of PM2.5.
Identification of sources of combustion generated PM [24]	5-fold cross-validation for fitting PCR and CNN model	No approach	Regression modelling	Machine learning with excitation-emission matrix fluorescence spectroscopy was used for model training	Mean squared error as a loss function; R² = 0.745 for mobile sources; R² = 0.908 for vegetative sources	The EEM-ML approach was mostly successful in predicting vegetative burning and mobile sources
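The error measures listed in Table 1 can be computed as in the sketch below. The formulas follow commonly used definitions: Willmott's index of agreement, fractional bias as twice the difference of the observed and predicted means over their sum, and R² as the squared Pearson correlation. The reviewed papers may use slightly different conventions, so the function name, the sign convention for FB, and the sample values are assumptions.

```python
import math


def forecast_metrics(observed, predicted):
    """Common forecast error measures for paired observed/predicted values."""
    n = len(observed)
    pairs = list(zip(observed, predicted))
    mo = sum(observed) / n
    mp = sum(predicted) / n
    sq_err = sum((p - o) ** 2 for o, p in pairs)
    cov = sum((o - mo) * (p - mp) for o, p in pairs)
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in predicted)
    ia_den = sum((abs(p - mo) + abs(o - mo)) ** 2 for o, p in pairs)
    return {
        "RMSE": math.sqrt(sq_err / n),               # root mean squared error
        "MAE": sum(abs(p - o) for o, p in pairs) / n,  # mean absolute error
        "NMSE": (sq_err / n) / (mo * mp),            # normalized mean squared error
        "R2": cov ** 2 / (var_o * var_p),            # squared Pearson correlation
        "IA": 1.0 - sq_err / ia_den,                 # index of agreement (Willmott)
        "FB": 2.0 * (mo - mp) / (mo + mp),           # fractional bias (assumed sign convention)
    }


# Hypothetical observed and predicted PM10 concentrations.
observed = [20.0, 35.0, 50.0, 65.0]
predicted = [22.0, 33.0, 55.0, 60.0]
metrics = forecast_metrics(observed, predicted)
print(metrics)
```

RMSE and MAE are in the units of the measured concentration, whereas NMSE, R², IA, and FB are dimensionless, which makes them easier to compare across studies.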

4.2. Automotive Exhaust Toxicity

In previous work, artificial neural networks with back-propagation learning algorithms were applied to predict gasoline or diesel engine performance parameters such as torque, fuel consumption, exhaust toxicity, or exhaust emission, depending on the fuel mixture applied or on the specific material coating the elements of a combustion chamber [30]. In some cases, artificial neural networks served as an optimisation tool to adjust thermal or mechanical engine parameters for proper exhaust performance, e.g., [31,32,33]. Table 2 summarizes the AI models and data preprocessing methods applied in several selected papers, as well as the general conclusions that the authors reached while solving specific automotive-related problems with artificial neural networks or machine learning algorithms.

Vlad et al., 2001 [32] applied Hinging Hyperplane Trees as NN learning algorithms, which made it possible to find an acceptable compromise between fuel consumption and emissions. The paper of Nagendra and Khare, 2006 [34] is helpful for young modellers, as it presents step-by-step procedures for designing ANN backpropagation learning algorithms, applying several meteorological and traffic characteristic variables to model NO₂ from vehicular exhaust emission. Several configurations of MLP were tested. The results were satisfactory and demonstrated that both meteorological and traffic-related parameters should be included in the ANN algorithms. Canakci et al., 2006 [35] applied the backpropagation learning algorithm with three different variants, a single layer, and a logistic sigmoid transfer function to the performance and exhaust emission values of a diesel engine powered by biodiesel. To train the network, the average molecular weight, net heat of combustion, specific gravity, kinematic viscosity, C/H ratio, and cetane number of each fuel were used as the input layer, while the outputs were brake specific fuel consumption (BSFC), exhaust temperature, and exhaust emissions. Similar methods were applied, and similar results obtained, by Parlak et al., 2006 [36], by Togun and Baysec, 2010 [37], by Cay, 2013 [38], by Mehra et al., 2018 [39], and by Gürgen et al., 2018 [16]. Yusaf et al., 2010 [37] used a multilayer perceptron network for non-linear mapping between input and output parameters to predict the brake power, torque, BSFC, and exhaust emissions of a diesel engine modified to operate with a combination of compressed natural gas (CNG) and diesel fuels. Rezaei et al., 2015 [41] tested two types of ANN, radial basis function (RBF) and feedforward networks, to predict different engine performance metrics. Liu et al., 2021 [42] tested different learning algorithms (ANN, random forest, support vector regression, and gradient-boosting regression trees) to predict the exhaust temperature in a heavy-duty natural gas spark ignition engine. Despite the need to fine-tune its hyperparameters, the artificial neural network algorithm proved to be the most suitable. The outcomes revealed that properly trained machine learning models can enhance engine performance, minimize emissions, and extend lifespan, in addition to complementing a sophisticated physical model.

Just like scientists studying air pollution modelling, automotive researchers have also reached similar findings, affirming that a well-trained neural network model delivers prompt and consistent outcomes, rendering it a user-friendly tool for initial investigations into engineering concerns. As a substitute for conventional modelling methods, the utilization of the ANN approach enables precise predictions of internal combustion engine performance, temperature, and various other parameters.

Table 2. The review of computational methods applied in several chosen previous papers to solve specific problems related to automotive exhaust toxicity.

Specific topic of a reviewed paper	Model applied	Compilation with and/or comparison to other numerical model	Data pre-processing method	Activation function / training algorithm	Loss function and error	General conclusions
Identification and optimisation of diesel engine emissions [32]	Hinging Hyperplane Trees (HHT)	No approach	Regression modelling	The Levenberg-Marquardt algorithm	Normalized Root Mean Square Error	Incorporation of the proposed model made it possible to find an acceptable compromise between fuel consumption and emissions.
Modelling of nitrogen dioxide dispersion from vehicular exhaust emissions [34]	Multilayered NN with different configurations, including and excluding meteorological data together with traffic-related data	No approach	No information	Hyperbolic tangent function, training using the supervised algorithm	RMSE, descriptive statistics	Better model performance when both traffic-related and meteorological data were included in the modelling process
Performance and exhaust emissions of a biodiesel engine [35]	Multilayered NN	No approach	Input data were standardized to zero mean and unity deviation	Back propagation, scaled conjugate gradient, Levenberg-Marquardt algorithms were applied for model training	RMS, R²; R² = 0.99 for all emission parameters	The relationship between fuel properties and emitted pollutants can be successfully determined with the use of artificial neural networks.
Performance and exhaust emissions of a CNG-diesel engine [40]	MLP	No approach	No information	Back propagation for model training, a sigmoid activation function	R² = (0.87-0.99)	Emission performance of an engine was modelled against engine speed (rpm) and the compressed natural gas-to-diesel ratio. The applied model led to the conclusion that a dual-fuel CNG-diesel engine gives better brake thermal efficiency and lower emissions than a diesel engine. ANNs can provide accurate analysis and simulation of the engine performance.
Prediction of emission performance of HCCI engines with oxygenated fuels [41]	FFNN, RBF	No approach	No information	Different training functions were used for model training, i.e. scaled conjugate gradient, Levenberg-Marquardt algorithms and others	RMSE, R²; R² = 0.99 for FFNN	Both FFNN and RBF were found to be capable of extracting the relationship between inputs and outputs to predict HCCI engine parameters. FFNN gave better performance; however, the computational time of RBF was shorter.
Prediction of exhaust gas temperature of a heavy-duty natural gas spark ignition engine [42]	ANN, random forest (RF), support vector regression (SVR), gradient boosting regression trees (GBRT)	1D CFD model	Input data were standardized to zero mean and unity deviation	A sigmoid activation function, a back propagation training algorithm	RMSE, R²; R² = 0.90 for ANN, R² = 0.84 for RF, R² = 0.92 for SVR, R² = 0.98 for GBRT	All four machine learning algorithms were able to correctly capture the relationship between key engine control variables and exhaust gas temperature (EGT). The computations resulted in an acceptable level of EGT prediction.
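
The pre-processing and error measures listed in Table 2 can be illustrated with a short sketch. The code below is only an illustration under stated assumptions, not the procedure of any reviewed paper: it standardizes a variable to zero mean and unit deviation and computes RMSE and R² for a pair of observed and predicted series. The function names and the temperature values are hypothetical.

```python
import math


def standardize(values):
    """Scale a list of numbers to zero mean and unit (population) standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:
        raise ValueError("a constant series cannot be standardized")
    return [(v - mean) / std for v in values]


def rmse(observed, predicted):
    """Root mean square error between two equally long series."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical exhaust gas temperatures (deg C) and hypothetical model predictions.
observed = [410.0, 455.0, 498.0, 540.0, 585.0]
predicted = [415.0, 450.0, 505.0, 535.0, 580.0]

scaled_inputs = standardize(observed)
print("RMSE:", rmse(observed, predicted))
print("R2:", r_squared(observed, predicted))
```

Standardization is applied to the network inputs before training, whereas RMSE and R² are normally reported on data held out from training, so that the scores reflect generalisation rather than memorisation.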

4.3. Combustion Processes

The foundation of numerous propulsion and power generation applications lies in the combustion process. One significant issue that impacts combustion is the occurrence and amplification of oscillations in both the heat release of the flame front and the pressure field within the combustion chamber. This phenomenon, known as thermo-acoustic instability, becomes self-sustaining when there is a phase coupling between the two oscillations, resulting in the transfer of energy to the resonant modes of the combustion chamber [43]. Due to the non-linear nature of the phenomenon, difficulties arise when model-based control systems have to be defined. Both Cammarata et al., 2002 [43] and Fichera and Pagano, 2006 [44] presented the application of artificial neural networks for the prediction of combustion instabilities. Satisfactory agreement between the simulated and experimental data was found, and the results show that the model successfully predicted the temporal evolution of thermo-acoustic combustion instabilities. Combustion stability was also the subject of the article by Zhang et al., 2021 [45], who investigated and modelled the combustion process of industrial gases. They applied both neural networks and the NARMAX model, which is a popular non-linear computational model. Applied together, the two models gave the most satisfactory results, especially in long-term predictions.
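
To make the autoregressive structure behind such models more concrete, the sketch below assembles lagged regressors for a one-step-ahead predictor of the NARX type, i.e. the NARMAX structure without its moving-average noise terms. It is a minimal illustration under assumptions: the function name, the lag orders and the two short series are hypothetical and are not taken from the reviewed papers.

```python
def build_lagged_dataset(u, y, n_u=2, n_y=2):
    """Build (X, targets) for a one-step-ahead NARX-type predictor.

    Each row of X holds [y[t-1], ..., y[t-n_y], u[t-1], ..., u[t-n_u]]
    and the matching target is y[t].
    """
    if len(u) != len(y):
        raise ValueError("input and output series must have the same length")
    start = max(n_u, n_y)  # first time index with a full history
    X, targets = [], []
    for t in range(start, len(y)):
        past_y = [y[t - k] for k in range(1, n_y + 1)]
        past_u = [u[t - k] for k in range(1, n_u + 1)]
        X.append(past_y + past_u)
        targets.append(y[t])
    return X, targets


# Hypothetical excitation signal u and measured response y (arbitrary units).
u = [0.0, 1.0, 0.5, -0.5, -1.0, 0.0]
y = [0.0, 0.2, 0.6, 0.5, 0.1, -0.4]

X, targets = build_lagged_dataset(u, y)
for row, target in zip(X, targets):
    print(row, "->", target)
```

Any regressor, including a neural network, can then be trained on the rows of X. For long-term prediction the model is run in closed loop, feeding its own past outputs back in place of the measured values.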

Romeo and Gareta, 2006 [46] and Nikpey et al., 2013 [47] successfully applied a multilayer feedforward neural network with a backpropagation algorithm as a monitoring tool for biomass boiler parameters and for micro gas turbine operation, respectively. Smrekar et al., 2013 [48] used linear and nonlinear models for the prediction of NO_x emission from coal-fired boilers. The optimization was accomplished by adjusting the operating conditions of the boiler through excess air control, fine-tuning the boiler, and balancing the flow of fuel and air to different burners. Model predictive control (MPC), an advanced control technique, utilized a model that connects operational parameters with NO_x formation as the foundation for reducing NO_x emissions. The use of nonlinear models did not improve the boiler performance. Thus, the authors concluded that static linear models (specifically ARX) are satisfactory for predicting NO_x emission. Oko et al., 2015 [49] developed a dynamic model based on NARX neural networks to track the drum boiler parameters, which change dynamically during the operation of a coal-fired power plant. Sun et al., 2016 [50] modelled the pyrolysis products from industrial waste biomass. An optimized three-layer ANN model with a logarithm sigmoid transfer function at the hidden layer and a linear function at the output layer was trained by means of the Levenberg-Marquardt (LM) algorithm. The major gas products of biomass pyrolysis are CO, CO₂, H₂ and CH₄ as well as benzene and light-weight PAHs, occurring in different ratios depending on the process temperature, which was also simulated. Sunphorka et al., 2017 [51], in their study, utilized artificial neural networks (ANN) to establish a connection between biomass components and the kinetic parameters (activation energy, preexponential factor, reaction order) of biomass pyrolysis. They developed three distinct ANN models, each dedicated to one of the kinetic parameters. 
The relationships between the primary biomass components and the resultant parameters were found to be nonlinear and could potentially be predicted with high accuracy using the selected ANN models (R² > 0.9). Luo et al., 2018 [52] applied ANN to predict product distributions in coal devolatilization and compared the results with those obtained using the FG-CPD model (chemical percolation devolatilization coupled with the functional group). The study revealed that the proposed ANN model successfully predicted the detailed product distribution of coal devolatilization, in strong agreement with experimental data. Moreover, compared to the FG-CPD model, the ANN model delivered better accuracy in predicting both the yield of individual components and their evolution. Additionally, the study evaluated the relative influence of input parameters on the evolution of each devolatilization product. The coal composition was identified as the primary factor, accounting for over 60% of the impact on the distribution of products. Sakiewicz et al., 2020 [53] used artificial neural networks to predict three biomass ash fusion temperatures (initial deformation temperature, hemispherical temperature and flow temperature) based on the chemical composition of the ash. The applicability of 400 neural network configurations was statistically verified. The multilayer perceptron with 12 inputs representing fractions of the ash compounds, 11 hidden neurones, and three outputs proved to be the optimal neural model configuration for the purposes targeted in that paper.
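
The 12-11-3 perceptron topology mentioned above can be sketched as a plain forward pass. This is a structural illustration only. The weights are random and untrained, the hyperbolic tangent hidden activation and the linear output layer are assumptions, and the input vector is hypothetical, so the printed numbers carry no physical meaning.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Create random weights (n_out rows of n_in values) and zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(inputs, weights, biases, activation=None):
    """Fully connected layer: weighted sum plus bias, then optional activation."""
    outputs = []
    for w_row, b in zip(weights, biases):
        z = sum(w * x for w, x in zip(w_row, inputs)) + b
        outputs.append(activation(z) if activation else z)
    return outputs


def mlp_forward(x, hidden, output):
    """Forward pass: tanh hidden layer (assumed) followed by a linear output layer."""
    h = dense(x, hidden[0], hidden[1], activation=math.tanh)
    return dense(h, output[0], output[1])


rng = random.Random(0)              # fixed seed for reproducibility
hidden = make_layer(12, 11, rng)    # 12 inputs -> 11 hidden neurones
output = make_layer(11, 3, rng)     # 11 hidden neurones -> 3 outputs

# Hypothetical ash composition: 12 equal mass fractions summing to 1.
ash_fractions = [1.0 / 12] * 12
temperatures = mlp_forward(ash_fractions, hidden, output)
print(temperatures)  # three untrained outputs, one per fusion temperature
```

With biases included, this topology has 12 × 11 + 11 + 11 × 3 + 3 = 179 trainable parameters, which gives a sense of how much data is needed to fit such a model reliably.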

A very helpful resource for young modellers is the paper of Tuttle et al., 2021 [54], which presents both a concise literature review, with examples of topic areas where the considered modelling methods have been applied, and step-by-step procedures for creating learning algorithms. Ten established, data-driven dynamic algorithms were surveyed in this study and the GRU neural network was identified as the best method for modelling combustion emission rate. Li et al., 2021 [55] applied ANNs to the online dynamic prediction of potassium concentration in biomass fuels. They applied a basic recurrent neural network (RNN) and its variants, i.e., long short-term memory neural network (LSTM-NN) and deep recurrent neural network (DRNN). The relative error of the K concentration predicted with these models was lower than 10%. Table 3 summarizes the AI models and data preprocessing methods applied in several chosen papers, as well as the general conclusions the authors reached while solving specific combustion-related problems by applying artificial neural networks or machine learning algorithms.

5. Conclusions and Challenges

The range of applications of artificial intelligence is very wide. Most of the reviewed papers presented successful results of the application of ANN and ML, also in combination with other models. Many authors pointed out that overtraining is the most critical issue in the entire neural network approach. The choice of the right topology and the number of variables in the learning algorithms are also key factors in obtaining satisfactory results.

The continuous development of artificial intelligence and learning algorithms is visible in the reviewed papers. The more recent the work, the newer the neural network models and the more complex the combinations of computational methods that are proposed. This reflects the fact that AI is a versatile tool which may find application almost everywhere.

The prediction of the performance of engines or boilers seems to be much easier than the forecasting of pollutants in the atmosphere. An internal combustion chamber may be considered a closed space, and fewer factors may influence the training procedure when artificial intelligence is applied. In the atmosphere, the number of chemical and physical reactions, different transformations, and unpredictable weather phenomena is considerable. For these reasons, as many authors proposed in their work, hybrid models combining artificial intelligence with commonly used computational or statistical methods could give more satisfactory results in the prediction of the distribution of pollutants in ambient air.

Based on the results presented in previous works, I may conclude that:

AI may be successfully applied as a tool supporting environmental monitoring systems.
Multi-layered perceptron networks with backpropagation training (the Levenberg-Marquardt algorithm) seem to be the most frequently used model for training and short-time predictions. Moreover, a sigmoid-type (logistic or hyperbolic tangent) activation function is mostly used, as it maps the nonlinearity among the hidden layer neurones faster and more efficiently than other functions.
Designing an ANN topology with the highest possible performance requires many trials and extensive testing. Overtraining (overfitting) and underfitting of neural networks are frequent problems when developing AI-based models. To avoid both overfitting and underfitting, it is necessary to apply appropriate regularization techniques, such as L1/L2 regularization, early stopping, using proper cross-validation techniques, and adjusting neural network parameters to find the right balance between a model that is too complex and one that is too simple.
Training the network on a given data set requires a separate training algorithm. The trained network can then be applied to process prediction and optimisation; however, additional algorithms (based on machine learning) should be applied for that purpose.
Artificial neural networks and machine learning algorithms together may be utilized for the optimisation of currently uncontrolled residential stoves fuelled by biomass in order to limit pollutant emissions, with special attention to emissions that are not regulated by any directive. This reveals a gap to be filled and will be the subject of the Author's future work.
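
Early stopping, one of the remedies for overtraining listed above, can be sketched as follows. This is a minimal, library-independent illustration: the function name, the patience value and the toy validation curve are hypothetical, and a real implementation would also store the network weights from the best epoch.

```python
def train_with_early_stopping(train_step, validation_loss, max_epochs=200, patience=10):
    """Run training epochs until the validation loss stops improving.

    train_step(epoch) performs one training epoch; validation_loss(epoch)
    returns the loss on held-out data after that epoch. Training stops once
    `patience` consecutive epochs bring no improvement.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch in range(max_epochs):
        train_step(epoch)
        loss = validation_loss(epoch)
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0  # in practice: save the weights here
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break
    return best_epoch, best_loss


def toy_validation_loss(epoch):
    """Hypothetical validation curve: improves until epoch 30, then degrades."""
    return (epoch - 30) ** 2 / 900.0 + 0.1


best_epoch, best_loss = train_with_early_stopping(lambda epoch: None, toy_validation_loss)
print(best_epoch, best_loss)
```

The validation set used for stopping must be kept separate from the test set used for the final error estimate; otherwise the reported accuracy is optimistically biased.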

In order to continue research on the optimisation and automation of biomass stoves, three significant barriers to overcome have been identified:

How much data is required to train the model properly and effectively?
How to incorporate the existing knowledge about fuel combustion in a stove chamber into the learning process of a model in order to improve the algorithm effectiveness and to protect the system from possible failures?
How to sensitize the trained AI-based model to changes of the many parameters of the combustion process, which are sometimes naturally unstable, in order to avoid a situation in which an occurring failure is treated as a true input to the network?

The prospects for the development of artificial intelligence (AI) applications in environmental monitoring are highly promising. AI has the potential to revolutionize how environmental data are collected, analysed, and interpreted, leading to more effective and efficient monitoring systems. By leveraging AI algorithms, it is possible to improve data collection from various sources, enhance data integration and fusion techniques, and enable advanced data analysis, pattern recognition, and anomaly detection. AI can also facilitate predictive modelling, forecasting, and risk assessment in environmental monitoring, aiding in the identification and assessment of environmental risks. However, when it comes to modelling with artificial neural networks (ANNs), there are several key challenges that need to be addressed. These challenges include:

Data Privacy: ANN models often require access to large amounts of data, including sensitive information. Ensuring data privacy and protection is crucial to maintain the confidentiality and integrity of the data. Robust data anonymization techniques, secure data storage, and compliance with relevant privacy regulations are essential to address this challenge.
Expert Interpretation of Results: ANNs can be complex and opaque models, making it challenging for domain experts to interpret and understand the underlying factors driving the model's predictions. The lack of interpretability can hinder trust and acceptance of ANN models in practical applications. Efforts are underway to develop techniques for interpreting and explaining the decisions made by ANNs, such as feature importance analysis and model visualization.
Data Standardization: ANNs rely on high-quality and standardized data for training and validation. In environmental monitoring, data may come from diverse sources, with variations in formats, units, and quality. Ensuring data standardization and normalization is crucial to achieve reliable and accurate ANN models. Establishing data standards, data preprocessing techniques, and quality control measures are necessary to address this challenge.
Limited Data Availability: ANN models require large amounts of labelled data for training, which may not always be readily available in environmental monitoring applications. Limited data can lead to overfitting or underfitting issues, resulting in suboptimal model performance. Techniques such as data augmentation, transfer learning, and active learning can help mitigate this challenge by making the most of the available data and optimizing the training process.
Computational Resources: Training and optimizing ANNs can be computationally demanding, especially for large-scale environmental monitoring applications. Access to sufficient computational resources, such as high-performance computing clusters or cloud-based solutions, is essential to handle the complexity of ANN models efficiently.
Ethical Considerations: The ethical implications of using ANNs in environmental monitoring should be carefully considered. Ensuring fairness, avoiding biases, and addressing potential discriminatory effects of ANN models are crucial. Regular audits, transparency in model development, and ongoing evaluation of the social and environmental impact of ANN applications are necessary to mitigate ethical concerns.

Addressing these challenges requires a multidisciplinary approach, involving experts from the fields of data science, environmental science, ethics, and policy-making. Collaborative efforts are essential to develop robust frameworks, guidelines, and best practices for modelling with ANNs in the context of environmental monitoring, ensuring data privacy, interpretability, standardization, and ethical considerations are appropriately addressed.

Acknowledgments

Research project supported by the programme ‘Excellence initiative – research university’ for the AGH University of Science and Technology. The authors also gratefully acknowledge the funding support given by AGH UST in Krakow within the subvention project no. 16.16.210.476.

Nomenclature

AI	artificial intelligence
ANN	artificial neural network
ARX	auto-regressive model with exogenous inputs
BSFC	brake specific fuel-consumption
CART	classification and regression trees
C/H ratio	carbon-to-hydrogen mass ratio
DRNN	deep recurrent neural network
EEM	excitation-emission matrix
FC	fully-connected
FFNN	feed forward neural network
FG-CPD	chemical percolation devolatilization coupled with the functional group
GPUs	graphics processing units
GRU	gated recurrent unit
HM	Hopfield model
IoT	Internet of Things
LM	the Levenberg-Marquardt algorithm
LPG	liquefied petroleum gas
LSTM-NN	long short-term memory neural network
ML	machine learning
MLP	multi-layer perceptron
NARMAX	nonlinear autoregressive moving average with exogenous input
NARX	nonlinear autoregressive with exogenous inputs
NN	neural network
PAHs	polycyclic aromatic hydrocarbons
PCA	principal component analysis
PM	particulate matter
PMF	positive matrix factorisation
RBF	radial basis function
RNN	recurrent neural network
RT-AQF	real-time air quality forecasting model
SF	scale-free network
SW	small world network
TPUs	tensor processing units

References

X. Zhang, K. Shu, S. Rajkumar, and V. Sivakumar, “Research on deep integration of application of artificial intelligence in environmental monitoring system and real economy,” Environ. Impact Assess. Rev., vol. 86, p. 106499, Jan. 2021.
J. Zupan and J. Gasteiger, “Neural networks: A new method for solving chemical problems or just a passing phase?,” Anal. Chim. Acta, vol. 248, no. 1, pp. 1–30, Jul. 1991.
J. J. Hopfield, “Neural networks and physical systems with emergent collective computational abilities,” Proc. Natl. Acad. Sci., vol. 79, no. 8, pp. 2554–2558, Apr. 1982.
S. Kaviani and I. Sohn, “Application of complex systems topologies in artificial neural networks optimization: An overview,” Expert Syst. Appl., vol. 180, p. 115073, Oct. 2021.
S. M. Cabaneros, J. K. Calautit, and B. R. Hughes, “A review of artificial neural network models for ambient air pollution prediction,” Environ. Model. Softw., vol. 119, pp. 285–304, Sep. 2019.
D. Lakshmi et al., “Artificial intelligence (AI) applications in adsorption of heavy metals using modified biochar,” Sci. Total Environ., vol. 801, p. 149623, Dec. 2021.
L. Zhang et al., “A review of machine learning in building load prediction,” Appl. Energy, vol. 285, p. 116452, Mar. 2021.
C. Crisci, B. Ghattas, and G. Perera, “A review of supervised machine learning algorithms and their applications to ecological data,” Ecol. Modell., vol. 240, pp. 113–122, Aug. 2012.
T. Hong, Z. Wang, X. Luo, and W. Zhang, “State-of-the-art on research and applications of machine learning in the building life cycle,” Energy Build., vol. 212, p. 109831, Apr. 2020.
T. Zarra, M. G. Galang, F. Ballesteros, V. Belgiorno, and V. Naddeo, “Environmental odour management by artificial neural network – A review,” Environ. Int., vol. 133, p. 105189, Dec. 2019.
Y. Zhang, M. Bocquet, V. Mallet, C. Seigneur, and A. Baklanov, “Real-time air quality forecasting, part I: History, techniques, and current status,” Atmos. Environ., vol. 60, pp. 632–655, Dec. 2012.
Y. Zhang, M. Bocquet, V. Mallet, C. Seigneur, and A. Baklanov, “Real-time air quality forecasting, part II: State of the science, current research needs, and future prospects,” Atmos. Environ., vol. 60, pp. 656–676, Dec. 2012.
A. L. Dutot, J. Rynkiewicz, F. E. Steiner, and J. Rude, “A 24-h forecast of ozone peaks and exceedance levels using neural classifiers and weather predictions,” Environ. Model. Softw., vol. 22, no. 9, pp. 1261–1269, Sep. 2007.
A. Russo, F. Raischel, and P. G. Lind, “Air quality prediction using optimal neural networks with stochastic variables,” Atmos. Environ., vol. 79, pp. 822–830, Nov. 2013.
D. Z. Antanasijević, V. V. Pocajt, D. S. Povrenović, M. D. Ristić, and A. A. Perić-Grujić, “PM10 emission forecasting using artificial neural networks and genetic algorithm input variable optimization,” Sci. Total Environ., vol. 443, pp. 511–519, Jan. 2013.
S. Gürgen, B. Ünver, and İ. Altın, “Prediction of cyclic variability in a diesel engine fueled with n-butanol and diesel fuel blends using artificial neural network,” Renew. Energy, vol. 117, pp. 538–544, Mar. 2018.
A. J. Jakeman, R. A. Letcher, and J. P. Norton, “Ten iterative steps in development and evaluation of environmental models,” Environ. Model. Softw., vol. 21, no. 5, pp. 602–614, May 2006.
V. Galaz et al., “Artificial intelligence, systemic risks, and sustainability,” Technol. Soc., vol. 67, p. 101741, Nov. 2021.
The Royal Society, Digital technology and the planet: Harnessing computing to achieve net zero. 2020.
P. Perez and J. Reyes, “An integrated neural network model for PM10 forecasting,” Atmos. Environ., vol. 40, no. 16, pp. 2845–2851, May 2006.
K. Siwek and S. Osowski, “Improving the accuracy of prediction of PM10 pollution by the wavelet transformation and an ensemble of neural predictors,” Eng. Appl. Artif. Intell., vol. 25, no. 6, pp. 1246–1258, Sep. 2012.
K. P. Singh, S. Gupta, and P. Rai, “Identifying pollution sources and predicting urban air quality using ensemble learning methods,” Atmos. Environ., vol. 80, pp. 426–437, Dec. 2013.
X. Feng, Q. Li, Y. Zhu, J. Hou, L. Jin, and J. Wang, “Artificial neural networks forecasting of PM2.5 pollution using air mass trajectory based geographic model and wavelet transformation,” Atmos. Environ., vol. 107, pp. 118–128, Apr. 2015.
J. W. Rutherford, T. V. Larson, T. Gould, E. Seto, I. V. Novosselov, and J. D. Posner, “Source apportionment of environmental combustion sources using excitation emission matrix fluorescence spectroscopy and machine learning,” Atmos. Environ., vol. 259, p. 118501, Aug. 2021.
X. H. Song et al., “Source apportionment of gasoline and diesel by multivariate calibration based on single particle mass spectral data,” Anal. Chim. Acta, vol. 446, no. 1–2, pp. 327–341, Nov. 2001.
G. De Gennaro et al., “Neural network model for the prediction of PM10 daily concentrations in two sites in the Western Mediterranean,” Sci. Total Environ., vol. 463–464, pp. 875–883, Oct. 2013.
A. Donnelly, B. Misstear, and B. Broderick, “Real time air quality forecasting using integrated parametric and non-parametric regression techniques,” Atmos. Environ., vol. 103, pp. 53–65, Feb. 2015.
H. J. S. Fernando et al., “Forecasting PM10 in metropolitan areas: Efficacy of neural networks,” Environ. Pollut., vol. 163, pp. 62–67, Apr. 2012.
G. Grivas and A. Chaloulakou, “Artificial neural network models for prediction of PM10 hourly concentrations, in the Greater Area of Athens, Greece,” Atmos. Environ., vol. 40, no. 7, pp. 1216–1229, Mar. 2006.
D. Vinay Kumar, P. Ravi Kumar, and M. S. Kumari, “Prediction of Performance and Emissions of a Biodiesel Fueled Lanthanum Zirconate Coated Direct Injection Diesel Engine Using Artificial Neural Networks,” Procedia Eng., vol. 64, pp. 993–1002, Jan. 2013.
S. Lotfan, R. A. Ghiasi, M. Fallah, and M. H. Sadeghi, “ANN-based modeling and reducing dual-fuel engine’s challenging emissions by multi-objective evolutionary algorithm NSGA-II,” Appl. Energy, vol. 175, pp. 91–99, Aug. 2016.
C. Vlad, S. Töpfer, M. Hafner, and R. Isermann, “A Hinge Neural Network Approach for the Identification and Optimization of Diesel Engine Emissions,” IFAC Proc. Vol., vol. 34, no. 1, pp. 399–404, Mar. 2001.
M. Kapusuz, H. Ozcan, and J. A. Yamin, “Research of performance on a spark ignition engine fueled by alcohol–gasoline blends using artificial neural networks,” Appl. Therm. Eng., vol. 91, pp. 525–534, Dec. 2015.
S. M. S. Nagendra and M. Khare, “Artificial neural network approach for modelling nitrogen dioxide dispersion from vehicular exhaust emissions,” Ecol. Modell., vol. 190, no. 1–2, pp. 99–115, Jan. 2006.
M. Canakci, A. Erdil, and E. Arcaklioǧlu, “Performance and exhaust emissions of a biodiesel engine,” Appl. Energy, vol. 83, no. 6, pp. 594–605, Jun. 2006.
A. Parlak, Y. Islamoglu, H. Yasar, and A. Egrisogut, “Application of artificial neural network to predict specific fuel consumption and exhaust temperature for a Diesel engine,” Appl. Therm. Eng., vol. 26, no. 8–9, pp. 824–828, Jun. 2006.
N. Kara Togun and S. Baysec, “Prediction of torque and specific fuel consumption of a gasoline engine by using artificial neural networks,” Appl. Energy, vol. 87, no. 1, pp. 349–355, Jan. 2010.
Y. Cay, “Prediction of a gasoline engine performance with artificial neural network,” Fuel, vol. 111, pp. 324–331, Sep. 2013.
R. K. Mehra, H. Duan, S. Luo, A. Rao, and F. Ma, “Experimental and artificial neural network (ANN) study of hydrogen enriched compressed natural gas (HCNG) engine under various ignition timings and excess air ratios,” Appl. Energy, vol. 228, pp. 736–754, Oct. 2018.
T. F. Yusaf, D. R. Buttsworth, K. H. Saleh, and B. F. Yousif, “CNG-diesel engine performance and exhaust emission analysis with the aid of artificial neural network,” Appl. Energy, vol. 87, no. 5, pp. 1661–1669, May 2010.
J. Rezaei, M. Shahbakhti, B. Bahri, and A. A. Aziz, “Performance prediction of HCCI engines with oxygenated fuels using artificial neural networks,” Appl. Energy, vol. 138, pp. 460–473, Jan. 2015.
J. Liu, Q. Huang, C. Ulishney, and C. E. Dumitrescu, “Machine learning assisted prediction of exhaust gas temperature of a heavy-duty natural gas spark ignition engine,” Appl. Energy, vol. 300, p. 117413, Oct. 2021.
L. Cammarata, A. Fichera, and A. Pagano, “Neural prediction of combustion instability,” Appl. Energy, vol. 72, no. 2, pp. 513–528, Jun. 2002.
A. Fichera and A. Pagano, “Application of neural dynamic optimization to combustion-instability control,” Appl. Energy, vol. 83, no. 3, pp. 253–264, Mar. 2006.
L. Zhang, Y. Xue, Q. Xie, and Z. Ren, “Analysis and neural network prediction of combustion stability for industrial gases,” Fuel, vol. 287, p. 119507, Mar. 2021.
L. M. Romeo and R. Gareta, “Neural network for evaluating boiler behaviour,” Appl. Therm. Eng., vol. 26, no. 14–15, pp. 1530–1536, Oct. 2006.
H. Nikpey, M. Assadi, and P. Breuhaus, “Development of an optimized artificial neural network model for combined heat and power micro gas turbines,” Appl. Energy, vol. 108, pp. 137–148, Aug. 2013.
J. Smrekar, P. Potočnik, and A. Senegačnik, “Multi-step-ahead prediction of NOx emissions for a coal-based boiler,” Appl. Energy, vol. 106, pp. 89–99, Jun. 2013.
E. Oko, M. Wang, and J. Zhang, “Neural network approach for predicting drum pressure and level in coal-fired subcritical power plant,” Fuel, vol. 151, pp. 139–145, Jul. 2015.
Y. Sun, L. Liu, Q. Wang, X. Yang, and X. Tu, “Pyrolysis products from industrial waste biomass based on a neural network model,” J. Anal. Appl. Pyrolysis, vol. 120, pp. 94–102, Jul. 2016.
S. Sunphorka, B. Chalermsinsuwan, and P. Piumsomboon, “Artificial neural network model for the prediction of kinetic parameters of biomass pyrolysis from its constituents,” Fuel, vol. 193, pp. 142–158, Apr. 2017.
K. Luo, J. Xing, Y. Bai, and J. Fan, “Prediction of product distributions in coal devolatilization by an artificial neural network model,” Combust. Flame, vol. 193, pp. 283–294, Jul. 2018.
P. Sakiewicz, K. Piotrowski, and S. Kalisz, “Neural network prediction of parameters of biomass ashes, reused within the circular economy frame,” Renew. Energy, vol. 162, pp. 743–753, Dec. 2020.
J. F. Tuttle, L. D. Blackburn, K. Andersson, and K. M. Powell, “A systematic comparison of machine learning methods for modeling of dynamic processes applied to combustion emission rate modeling,” Appl. Energy, vol. 292, p. 116886, Jun. 2021.
X. Li, C. Han, G. Lu, and Y. Yan, “Online dynamic prediction of potassium concentration in biomass fuels through flame spectroscopic analysis and recurrent neural network modelling,” Fuel, vol. 304, p. 121376, Nov. 2021.
L. Zhang, Y. Xue, Q. Xie, and Z. Ren, “Analysis and neural network prediction of combustion stability for industrial gases,” Fuel, vol. 287, p. 119507, Mar. 2021.

Figure 1. The general mechanism of environmental monitoring systems (source: own preparation based on [1]).

Figure 2. The anatomy of human neural cells (source: a photo from www.freepik.com, descriptions – own preparation).

Figure 3. The simplified scheme with mathematical relations of artificial neural network (source: own preparation).

Figure 4. The scheme of multi-layered neural network with backpropagation (source: own preparation).

Figure 5. Decision-making route in designing of artificial neural networks (source: own preparation).

Figure 6. Optimal ML model for training and evaluation (source: own preparation based on [8]).

Figure 7. The process of environmental monitoring system with incorporated AI-augmented mechanisms of data processing (source: own preparation).

Table 3. The review of computational methods applied in several chosen previous papers to solve specific problems related to stationary combustion processes.

Specific topic of a reviewed paper	Model applied	Compilation with and/or comparison to other numerical model	Data pre-processing method	Activation function / training algorithm	Loss function and error	General conclusions
Combustion instability [43,44,56]	MLP	NARMAX as a testing set	No information	Back propagation (a Levenberg-Marquardt algorithm) was applied for model training	No information	The NARMAX model implemented using a neural network was effective in long-term predictions. This can be applied as a suitable control solution ensuring the stability of he combustion process basing on the control three main parameters.
Evaluation of boiler behaviour and combined heat and power micro gas turbines [46,47]	FFNN	No approach	No information	Back propagation	real and equation-based monitoring data	The NN can predict a set of operational variables and the fouling state of the boiler. It is also pointed out the NN is a stronger tool for monitoring than equation-based monitoring.
Prediction of NO_x emissions for a coal-based boiler [48]	Random-walk (RW), Auto-regressive with exogenous inputs (ARX), Auto-regressive moving-average with exogenous inputsguifen(ARMAX), Neural networks (NNs), Support vector regression (SVR)	No approach	4-fold cross-validation of the models, data with zero mean and standard deviation	Back propagation	MAE	Results show that the adaptive modelling approach does not significantly improve the NOx prediction. Hence, the recommended model structure for multi-step NOx prediction is a static ARX model with occasional retrainings.
Prediction of drum pressure and level in coal-fired subcritical power plant [49]	NARX NN	No approach	mapminmax and removeconstantrows processing functions	A sigmoid transfer activation function, open loop function	MSE	The results of the validation and testing showed good agreement.
Prediction of pyrolysis products from industrial waste biomass [50]	A 3-layer ANN	No approach	No information	A logarithm sigmoid transfer function at the hidden layer and a linear function at the output layer. Back propagation: the Levenberg-Marquardt (LM) algorithm for model training	MSE, R² = 0.99	Three processing parameters, space velocity, reaction temperature, and particle size, were identified as input variables in the model, while the target output variables include selectivity of the four gas products (H₂, CO, CH₄, CO₂). There was fairly good agreement between the experimental results and simulated data for the biomass pyrolysis process.
Prediction of kinetic parameters of biomass pyrolysis from its constituents [51]	ANN	No approach	No information	A hyperbolic tangent sigmoid function (tansig) was used in the hidden layer whilst a linear transfer function (purelin) was used in the output layer. The Levenberg-Marquardt backpropagation algorithm was used for model training.	MSE, R² > 0.90	This study applied ANN for constructing the correlation between biomass constituents and the kinetic parameters (activation energy ‘E_a’, pre-exponential factor ‘k₀’ and reaction order ‘n’) of biomass pyrolysis. Three ANN models were developed, one for each of the three kinetic parameters. The relationships between the main biomass components and the output parameters were non-linear and could potentially be predicted by the selected ANN models. The combination of tansig/purelin transfer functions provided the lowest mean square error (MSE) in many cases.
Prediction of product distributions in coal devolatilization [52]	FFNN	FG-CPD	No information	A tangent sigmoid activation function was used.	MSE	The results show that the detailed product distributions of coal devolatilization predicted by the proposed ANN model are in good agreement with the experimental data for both the training and validation databases, and that the ANN model gives a more accurate prediction of both the yield of each component and its evolution than the FG-CPD model.
Online dynamic prediction of potassium concentration in biomass fuels [55]	Long short-term memory neural network (LSTM-NN) and deep recurrent neural network (DRNN)	No approach	No information	The activation functions of the hidden layer and the fully connected layer are the tanh function and the Leaky ReLU function, respectively	RMSE, MAE, MAPE	The DRNN and LSTM-NN models were found to have a longer computational time than the RNN. This is attributed to the architecture of the DRNN model being more complex than those of the RNN and LSTM-NN models.
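
The error measures named in the loss-function column of Table 3 (MSE, MAE, MAPE, R², and the root-mean-square error) can be illustrated with a minimal Python sketch. The function names and the sample values below are assumptions made purely for illustration; they are not taken from any of the reviewed papers, which do not report their implementations here.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error: average of squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root-mean-square error: square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))


def mae(y_true, y_pred):
    """Mean absolute error: average of absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent.

    Undefined when any true value is zero, so that case raises an error.
    """
    if any(t == 0 for t in y_true):
        raise ValueError("MAPE is undefined when a true value equals zero")
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and predicted values (illustrative only).
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]

print("MSE :", mse(measured, predicted))
print("RMSE:", rmse(measured, predicted))
print("MAE :", mae(measured, predicted))
print("MAPE:", mape(measured, predicted))
print("R2  :", r2(measured, predicted))
```

MSE and RMSE penalise large deviations more strongly than MAE, MAPE expresses the error relative to the measured value, and R² relates the residual variance to the total variance of the measurements. This may explain why several of the reviewed studies report more than one of these measures.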

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.


