In this section, we propose a raw material proportioning method for rotary hearth furnaces based on online clustering algorithms. The structure of the proposed online clustering proportioning method is shown in Figure 1. First, we use the local outlier factor (LOF) algorithm and the Kalman filtering algorithm to process the historical raw material composition data. Next, the principle of the affinity propagation (AP) clustering algorithm is described in detail, and the algorithm is applied offline to the preprocessed data to generate a machine learning formula library. Then, the online AP clustering algorithm clusters the current data together with the historical data in order to update the machine learning formula library. Based on the current raw material composition, a preliminary formula is recommended from the updated machine learning formula library. Finally, an expert experience-based formula fine-tuning module adjusts the preliminary formula to ensure that the final formula meets the performance requirements of the rotary hearth furnace.
3.1. Data Preprocessing
During the continuous operation of the rotary hearth furnace system, new data are continually generated. These data mainly comprise the raw material ratios and their corresponding composition contents. The elemental composition of the raw materials mainly includes Carbon (C), Zinc (Zn), and Iron (Fe). Iron oxide is the material component of greatest concern. Differences in the sources of raw materials may lead to differences in composition; for example, Iron ore from different regions or suppliers may contain different impurities or element contents. In addition, seasonal fluctuations, such as weather changes, may alter the humidity and temperature during mining or transportation and thereby affect the properties of the raw materials. Furthermore, laboratory errors may also cause data fluctuations, as different laboratories or testing methods may yield different measurement results. In view of these factors, it is particularly important to preprocess the raw material composition data.
First, the element content data of the raw material components are preprocessed. The local outlier factor (LOF) algorithm is used to identify outliers, and linear interpolation is used to fill in missing values. Kalman filtering is then employed to filter out measurement noise. Additionally, inferior formulas must be removed from the historical data.
To illustrate the effectiveness of LOF outlier detection, we take three elements (C, Zn, Fe) in the blast furnace secondary ash as an example and visualize the detection results in Figure 2, where the circles represent normal data and the pentagons represent the anomalous values detected by the algorithm.
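The LOF step can be sketched in a few lines of plain Python. This is a minimal illustration, not the implementation used in this work: the function name `lof_scores`, the neighborhood size `k`, the threshold of 2.0, and the sample values are all assumptions, and ties in the k-distance neighborhood are ignored for brevity.

```python
import math

def lof_scores(points, k=3):
    """Return the local outlier factor of every point (larger = more anomalous)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # k nearest neighbours of each point, excluding the point itself
    neighbors = [sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
                 for i in range(n)]
    # k-distance: distance from a point to its k-th nearest neighbour
    k_dist = [dist[i][neighbors[i][-1]] for i in range(n)]

    def local_reachability_density(i):
        # inverse of the mean reachability distance to the neighbours
        reach = [max(k_dist[j], dist[i][j]) for j in neighbors[i]]
        return 1.0 / max(sum(reach) / len(reach), 1e-12)

    density = [local_reachability_density(i) for i in range(n)]
    # LOF: mean neighbour density divided by the point's own density
    return [sum(density[j] for j in neighbors[i]) / (k * density[i]) for i in range(n)]

# Hypothetical (C, Zn, Fe) contents in percent; the last sample is deliberately far off.
samples = [(20.1, 5.0, 30.2), (19.8, 5.1, 30.0), (20.3, 4.9, 29.8),
           (20.0, 5.2, 30.1), (19.9, 5.0, 29.9), (35.0, 1.0, 12.0)]
scores = lof_scores(samples, k=3)
outliers = [i for i, s in enumerate(scores) if s > 2.0]  # assumed threshold
print(outliers)
```

Scores close to 1 indicate a point whose local density matches that of its neighbors; the threshold would need tuning on real assay data.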
To demonstrate the effectiveness of the filtering algorithm in data preprocessing, we take the C element in blast furnace secondary ash as an example, as shown in Figure 3. Figure 3 shows that the filtering algorithm greatly reduces the abrupt fluctuations in the data, making the series smoother.
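The interpolation and filtering steps can be illustrated with the following sketch. It assumes a scalar Kalman filter with a random-walk state model; the noise variances `process_var` and `meas_var`, the function names, and the sample series are hypothetical, and `fill_missing` assumes that the first and last values of the series are known.

```python
def fill_missing(series):
    """Linearly interpolate None entries that lie between two known values."""
    filled = list(series)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        for i in range(left + 1, right):
            frac = (i - left) / (right - left)
            filled[i] = filled[left] + frac * (filled[right] - filled[left])
    return filled

def kalman_smooth(measurements, process_var=1e-2, meas_var=1.0):
    """Scalar Kalman filter; the state is modelled as a slowly drifting constant."""
    estimate, error = measurements[0], 1.0
    out = [estimate]
    for z in measurements[1:]:
        error += process_var               # predict: uncertainty grows
        gain = error / (error + meas_var)  # Kalman gain
        estimate += gain * (z - estimate)  # correct with the innovation
        error *= (1.0 - gain)              # posterior uncertainty
        out.append(estimate)
    return out

# Hypothetical C contents (percent) with one missing assay value.
raw = [20.0, 22.0, None, 21.0, 19.0]
filled = fill_missing(raw)
smoothed = kalman_smooth(filled)
print(filled)
print(smoothed)
```

A larger `meas_var` relative to `process_var` gives a smoother but slower-reacting estimate.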
In industrial environments, if a raw material proportioning scheme can be used for more than h (h>0) hours, it is considered an excellent scheme; otherwise, it is considered inferior. In addition, the sum of the percentages of raw materials (excluding Environmental Ash) must meet the condition in Equation (2); otherwise, the formula is also considered inferior and should be removed.
where
represents the return material quantity;
represents the total quantity of raw materials, exclusive of Environmental Ash;
stands for the Environmental Ash raw material quantity;
indicates the total set material quantity;
represents the material percentage of
in
;
denotes the material percentage of
in
;
indicates the material percentage of
in
, and
represents the material percentage of
in
.
After data preprocessing, a dataset about is constructed that includes information on ratio and ingredient content, where, , , , , , are the corresponding proportions of the mixture ; C is a row vector, specifically represented as C=[, , , , , ], is the element content of C in the material; is a row vector, specifically represented as =[, , , , , ], is the content of element in material ; is a row vector, specifically represented as = [, , , , , ], is the element content of in material ; .
3.2. AP Offline Formula Learning
Historical data contain the valuable operational experience of engineers. Using the AP clustering algorithm, we can extract the empirical knowledge embedded in manually set formulas from these historical data.
The basic idea of the AP clustering algorithm is to treat the element contents corresponding to the N historical ratio data as nodes in a network, and then to determine the cluster center of each sample through message passing along the edges of the network. During the clustering process, two types of messages are passed between nodes, namely the attractiveness $r(i, j)$ and the belonging $a(i, j)$. $R$ is the attractiveness matrix, and $A$ is the belonging matrix. The AP clustering algorithm iteratively updates the attractiveness and belonging values of each point until m high-quality cluster centers (exemplars) are generated, where $m \le N$. At the same time, the remaining ratios are assigned to the corresponding clusters, and R and A are obtained.
The iterative equations for attractiveness and belonging are as follows:
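For reference, the standard affinity propagation updates are written below in the $(i, j, k)$ indexing used in the definitions that follow. This is the textbook formulation, given on the assumption that it matches the intended equations; the symbols $r$, $a$, and $s$ are assumed names for the attractiveness, the belonging, and the similarity.

```latex
\begin{align}
r(i,j) &\leftarrow s(i,j) - \max_{k \neq j}\bigl\{a(i,k) + s(i,k)\bigr\}, \\
a(i,j) &\leftarrow \min\Bigl\{0,\; r(j,j) + \sum_{k \notin \{i,j\}} \max\bigl(0,\, r(k,j)\bigr)\Bigr\}, \quad i \neq j, \\
a(j,j) &\leftarrow \sum_{k \neq j} \max\bigl(0,\, r(k,j)\bigr).
\end{align}
```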
where $r(i, j)$ is the attractiveness, which describes how suitable the element content of ratio $j$ is as the cluster center of the element content of ratio $i$; $a(i, j)$ is the belonging (attribution degree), which describes how appropriate it is for the element content of ratio $i$ to select the element content of ratio $j$ as its cluster center; and $s(i, k)$ is the similarity, which represents the ability of the element content of ratio $k$ to serve as the cluster center for the element content of ratio $i$.
Generally, the negative Euclidean distance is used, so the larger $s(i, k)$ is, the closer the two points are. To avoid oscillations during the iteration process, a soft (damped) update strategy is adopted, and the update equations are as follows:
where $\lambda$ is the damping factor, which lies between 0 and 1 and is generally taken as 0.5.
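The message-passing loop with damping can be sketched as follows. This is a minimal, unoptimized illustration of standard affinity propagation under stated assumptions: the diagonal of the similarity matrix (the preference) is set to the median similarity, which is a common default rather than a choice stated here; the damped update is taken as `damping * old + (1 - damping) * new`; and the function names and toy points are hypothetical.

```python
import math
import statistics

def similarity_matrix(points):
    """Negative Euclidean distance; the diagonal (preference) is the median similarity."""
    n = len(points)
    S = [[-math.dist(p, q) for q in points] for p in points]
    pref = statistics.median(S[i][j] for i in range(n) for j in range(n) if i != j)
    for i in range(n):
        S[i][i] = pref
    return S

def affinity_propagation(S, damping=0.5, iterations=200):
    """Run damped AP on S and return, for each point, the index of its exemplar."""
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # attractiveness (responsibility)
    A = [[0.0] * n for _ in range(n)]  # belonging (availability)
    for _ in range(iterations):
        # attractiveness: r(i,j) = s(i,j) - max_{k != j} (a(i,k) + s(i,k))
        for i in range(n):
            for j in range(n):
                best = max(A[i][k] + S[i][k] for k in range(n) if k != j)
                R[i][j] = damping * R[i][j] + (1 - damping) * (S[i][j] - best)
        # belonging: built from the positive attractiveness values sent to candidate j
        for j in range(n):
            pos = [max(0.0, R[k][j]) for k in range(n)]
            for i in range(n):
                if i == j:
                    new = sum(pos) - pos[j]
                else:
                    new = min(0.0, R[j][j] + sum(pos) - pos[i] - pos[j])
                A[i][j] = damping * A[i][j] + (1 - damping) * new
    # each point selects the candidate that maximizes a(i,j) + r(i,j)
    return [max(range(n), key=lambda j: A[i][j] + R[i][j]) for i in range(n)]

# Two well-separated toy groups standing in for element-content vectors.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = affinity_propagation(similarity_matrix(points))
print(labels)
```

The triple loop costs O(N^3) per iteration and is written for readability; a production implementation would vectorize the updates and add a convergence check.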
After the AP offline formula learning is completed, the formula data can be divided into n clusters based on the mapping from ingredient content to ratio data. The centers of the clusters are denoted as , where . The system records the key information of these clusters, including the cluster centers, similarity matrices, and attraction matrices. This information forms the foundation of the machine learning formula library and provides a basis for optimizing and adjusting future formulas.
3.3. AP Online Formula Updating
In practical rotary hearth furnace systems, data are constantly updated. There are two main ways to handle such updates: re-clustering and incremental clustering. Because the rotary hearth furnace system runs continuously, the amount of data keeps growing and becomes very large. Re-clustering all of the data after every update is therefore costly, time-consuming, and inefficient. Incremental clustering, in contrast, avoids these problems by processing only the newly added data against the existing clustering result.
In this study, incremental AP clustering [23] is realized by combining affinity propagation clustering with the nearest neighbor technique. The nearest neighbor establishes a connection between the newly added formula data and the existing clustered data sets.
The nearest neighbor technique initializes the information of a newly added formula from its closest existing formula. The strategy rests on the following consideration: if two formulas have similar compositions, their ratios are similar, so the two formulas should belong to the same class and share the same information. If the newly added formula data do not resemble any of the known clusters, a new cluster is created.
Given an
dimensional dataset, where the similarity matrix is
, and the corresponding matrices for membership and attractiveness are
and
respectively, the membership and attractiveness values for the newly added formula are expanded according to Equation (
7).
The above
represents the initial number of formulas, where
=
, and similar memberships are expanded according to Equation (
8).
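One way to picture this expansion is the sketch below, which grows the similarity, attractiveness, and belonging matrices by one row and one column when a formula arrives. It assumes that the new point simply copies the message values of its nearest neighbor and inherits that neighbor's preference; the function name, the in-place update, and the numeric values are hypothetical and are not taken from Equations (7) and (8).

```python
import math

def extend_with_nearest_neighbor(points, S, R, A, new_point):
    """Append new_point and seed its messages from its nearest existing point."""
    n = len(points)
    nn = min(range(n), key=lambda i: math.dist(points[i], new_point))
    sims = [-math.dist(p, new_point) for p in points]
    # grow the similarity matrix; the new point inherits the neighbour's preference
    for i in range(n):
        S[i].append(sims[i])
    S.append(sims + [S[nn][nn]])
    for M in (R, A):
        for i in range(n):
            M[i].append(M[i][nn])          # new column copies the neighbour's column
        M.append(M[nn][:n] + [M[nn][nn]])  # new row copies the neighbour's row
    points.append(new_point)
    return nn

# Hypothetical state after offline learning on two formulas.
points = [(0.0, 0.0), (5.0, 5.0)]
S = [[-1.0, -7.07], [-7.07, -1.0]]
R = [[0.1, -0.2], [-0.3, 0.4]]
A = [[0.5, -0.6], [-0.7, 0.8]]
nearest = extend_with_nearest_neighbor(points, S, R, A, (5.0, 6.0))
print(nearest, R[2], A[2])
```

After the expansion, message passing would typically be resumed for a few iterations so that the enlarged matrices settle, possibly producing a new cluster center.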
Using the AP online formula learning, the element contents in the current formula data and the historical formula data are clustered together. The centers of the clusters are then re-recorded as , where . The system stores information about these clusters, including the cluster centers, similarity matrix, and attraction matrix, to construct the machine learning formula library. The formula recommended from the library is the one corresponding to the center of the cluster to which the current formula data belong.
To aid understanding, Figure 4 shows the process of online AP clustering, where each small circle represents a formula data point, the gray lines connect formula data belonging to the same cluster, and the solid circles represent the cluster centers. The leftmost panel of Figure 4 shows several clusters formed after offline learning using the AP clustering algorithm. The middle panel shows the latest formula data collected at time t, represented by red circles. If these new data are similar to a certain cluster, they are assigned to that cluster, and the red solid circle represents the newly formed cluster center. If the current formula is not a new cluster center, the formula corresponding to the center of its cluster is recommended; if the current formula is itself the cluster center, the formula itself is recommended. The rightmost panel shows the updated results after the online learning process of the AP clustering algorithm at time
t + 1. If the newly introduced data differ greatly from the existing data, a new cluster is formed. In that case, the formula corresponding to the cluster center of the newly introduced data is recommended.
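The recommendation rule described for Figure 4 reduces to a lookup once each formula is labeled with the index of its cluster center. The sketch below assumes such a labeling (`exemplar_of`) and a list of stored formulas; both names and all proportions are hypothetical.

```python
def recommend_formula(exemplar_of, formulas, current):
    """Return the stored formula of the cluster center that `current` belongs to."""
    return formulas[exemplar_of[current]]

# exemplar_of[i] is the index of the cluster center of formula i.
# Formula 6 is new data that formed its own cluster, so it is its own center.
exemplar_of = [0, 0, 0, 3, 3, 3, 6]
formulas = [{"CDQ": 5.0 + i, "BF_secondary_ash": 30.0 - i} for i in range(7)]

print(recommend_formula(exemplar_of, formulas, 5))  # center of an existing cluster
print(recommend_formula(exemplar_of, formulas, 6))  # the new formula itself
```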
3.4. Expert Experience-Based Formula Fine Adjustment
Through cluster analysis, a machine learning formula library is constructed, and the recommended formula is the one represented by the cluster center of the current formula cluster. This formula usually fits the operating habits of workers and the actual production conditions well. However, it may not fully meet the performance requirements of the rotary hearth furnace. To ensure that the formula meets all performance requirements, a formula fine adjustment module based on expert experience is introduced, which adjusts the recommended formula accordingly. First, the parameters used are listed in Table 2.
Due to the lack of separate assay values for and returned materials, we use the average value of the total raw material quantity "" (excluding Environmental Ash) of its composition for calculation.
The dry basis composition of each element in
is calculated according to the following Equations (
9) and (
10):
where
represents the mass percentage of
in
,
in
,
=(
C,
,
,
,
,
,
O),
represents the mass percentage of material
.
represents the mass percentage of
in material
, where
.
represents the mass percentage of oxygen in
,
denotes
in
and
represents the mass percentage of oxygen in
.
The dry basis composition of each element in
is calculated according to the following Equation (
11):
where
represents the mass percentage of
in
,
in
,
=(
C,
,
,
,
,
,
O),
represents the mass percentage of oxygen in
.
represents
in
, and
represents the mass percentage of oxygen in
.
The dry basis components of each element in returned material
are calculated as shown in the following Equation (
12):
where
represents the mass percentage of
in
,
=(
C,
,
,
,
,
,
O),
represents the mass percentage of oxygen in
.
represents
in
, and
represents the mass percentage of oxygen in
.
Combining Equations (
9)–(
12), we obtain the contents of
C,
,
, and
O in the overall mixture:
where
denotes the content of elements
C,
,
, and
O in the overall mixture, represented as
.
is the mass percentage of total dry basis Carbon,
is the mass percentage of total dry basis oxygen, and
is the total
ratio.
However, due to practical on-site operations and product manufacturing requirements, certain elemental contents need to meet specific upper and lower limits:
In the equation, represent the lower and upper bounds of the mass percentage of Carbon respectively, is the upper bound of the mass percentage of Zinc, is the upper bound of the mass percentage of Chlorine, and is the upper bound of the mass percentage of ratio.
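A check of these bounds can be sketched as below. The limit values are placeholders for illustration only, and the key names (`"C"`, `"Zn"`, `"Cl"`, `"ratio"`) and the function name are assumptions; `None` marks a bound that is not imposed.

```python
def violated_constraints(content, limits):
    """Return the names of quantities that fall outside their (lower, upper) limits."""
    violations = []
    for name, (lower, upper) in limits.items():
        value = content[name]
        if (lower is not None and value < lower) or (upper is not None and value > upper):
            violations.append(name)
    return violations

# Placeholder limits: Carbon has both bounds, the other quantities only an upper bound.
limits = {"C": (10.0, 15.0), "Zn": (None, 3.0), "Cl": (None, 0.5), "ratio": (None, 1.2)}
mixture = {"C": 12.0, "Zn": 3.5, "Cl": 0.4, "ratio": 1.1}
print(violated_constraints(mixture, limits))
```

An empty list means the recommended formula already satisfies every limit and needs no fine adjustment.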
According to expert experience, CDQ has the highest Carbon content, usually reaching 84.4%, followed by blast furnace secondary ash, in which the content of "C" is generally 20% and the content of "
" is 5%. Therefore, in practice, the proportion of these two materials is mainly adjusted to meet the production constraints. In order to meet the requirements of actual production operation, the proportion of materials changed each time is
. In this study, we choose
as 1%. The fine adjustment process is illustrated in
Figure 5.
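The stepwise adjustment can be pictured with the following sketch, which is an assumed simplification and not the procedure of Figure 5. It shifts one step (1 percentage point) at a time between CDQ and blast furnace secondary ash until the Carbon content of the mixture falls within its bounds. The mixture Carbon is approximated as a proportion-weighted average, which ignores the dry basis corrections; the Carbon bounds, the `other` material, and its Carbon content are hypothetical, while the 84.4% and 20% values come from the text.

```python
def fine_tune(props, carbon, c_low, c_high, step=1.0, max_steps=50):
    """Shift `step` percentage points between CDQ and ash until Carbon is in range."""
    props = dict(props)  # work on a copy of the recommended formula
    for _ in range(max_steps):
        total_c = sum(props[m] * carbon[m] for m in props) / 100.0
        if total_c < c_low and props["BF_secondary_ash"] >= step:
            props["CDQ"] += step               # more of the Carbon-rich material
            props["BF_secondary_ash"] -= step
        elif total_c > c_high and props["CDQ"] >= step:
            props["CDQ"] -= step               # less of the Carbon-rich material
            props["BF_secondary_ash"] += step
        else:
            break                              # in range, or no room left to adjust
    return props

carbon = {"CDQ": 84.4, "BF_secondary_ash": 20.0, "other": 2.0}    # Carbon content, %
recommended = {"CDQ": 5.0, "BF_secondary_ash": 30.0, "other": 65.0}  # proportions, %
adjusted = fine_tune(recommended, carbon, c_low=13.0, c_high=15.0)
print(adjusted)
```

Keeping the total proportion constant while exchanging only these two materials mirrors the statement that their proportions are the ones mainly adjusted; the remaining limits would be checked in the same loop in a fuller version.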