Preprint
Article

Reverse Engineering of Radical Polymerizations by Multi-objective Optimization

Altmetrics

Downloads

101

Views

75

Comments

0

A peer-reviewed article of this preprint also exists.

Submitted:

02 February 2024

Posted:

05 February 2024

You are already at the latest version

Alerts
Abstract
Reverse engineering is applied to identify optimum polymerization conditions for the synthesis of polymer with pre-defined properties. The proposed approach uses multi-objective optimization (MOO) and provides multiple candidate polymerization procedures to reach the targeted polymer property. The objectives for optimization include maximal similarity of molar mass distributions (MMD) compared to the target MMD, minimal reaction time, and maximal monomer conversion. The method is tested for vinyl acetate radical polymerizations and can be adopted to other monomers. The data for the optimization procedure is generated by an in-house developed kinetic Monte-Carlo (kMC) simulator for a selected recipe search space. The proposed reverse engineering algorithm comprises several steps: kMC simulations for the selected recipe search space to derive initial data, performing MOO for a targeted MMD, and identification of the Pareto optimal space. The last step uses a weighted sum optimization function to calculate the weighted score of each candidate polymerization conditions. To decrease the execution time clustering of the search space based on MMDs is applied. The performance of the proposed approach was tested for various target MMDs. The suggested MOO-based reverse engineering provides multiple recipe candidates depending on competing objectives.
Keywords: 
Subject: Chemistry and Materials Science  -   Polymers and Plastics

1. Introduction

Radical polymerizations are known to be very robust and to provide access to a wide range of polymers with largely differing properties, which are defined by the process conditions. The strong correlation between production process and material properties is due to the complex reaction mechanism consisting of a large number of elemental reactions even for homopolymerizations with a single monomer [1]. The kinetics of the elemental reactions are strongly dependent on the process conditions. Therefore, the prediction of a suitable radical polymerization process to obtain a polymer with targeted properties is challenging. To allow for on demand polymer synthesis, at first sight it appears highly attractive to apply simulations of polymerization processes, e.g., employing differential equations [2,3,4] or kinetic Monte-Carlo (kMC) methods [5,6,7,8]. Simulations are particularly valuable, because detailed information on polymer microstructure at each time moment is available, which is not accessible from polymerization processes. However, this type of simulations cannot be run backwards and on demand suggestion of polymerization conditions to obtain a pre-defined polymer is not feasible. To overcome this issue, reverse engineering has the potential to provide several solutions, as opposed to a single one to one relation between polymerization variables and microstructural properties [9]. From the input variables, a polymerization process model predicts the concentration vs. time profiles and the polymer properties. Inverse modeling, on the other hand, is more difficult and calls for optimization strategies. In order to determine the optimal input values for systems with complex reaction mechanisms that provide pre-defined reaction outputs (such as pre-set conversion, yield, and/or other product properties), it was suggested to intelligently explore the reaction condition search space [10]. Further, it was proposed to solve reverse engineering problems using machine learning (ML) based prediction [11], in which ML regression models based on the random forest algorithm and a multivariate and multi-target regression problem [12] were applied. The model took a targeted MMD and predicted the initial polymerization conditions to produce polymer with this targeted property, minimizing the errors in predicted recipes. However, the prediction of the recipe was based on the MMD only, and not optimized with respect to multiple objectives like reaction time or conversion.
Pareto or multi-objective optimization (MOO) [13] is the process of maximizing or minimizing many objective functions while taking a set of constraints into consideration. Numerous scientific domains, such as engineering [14], economics [15], and logistics [16], need the use of MOO when making optimum decisions facing trade-offs between two or more competing objectives. An increasing use of MOO has been seen in chemical engineering [17,18]. In 2009, Fiandaca et al. [19] used genetic algorithm-based MOO to optimize a pressure swing adsorption process based on the maximization of two objectives: nitrogen recovery and nitrogen purity. In 2013, Ganesan et al. [20] carried out MOO of the combined carbon dioxide reforming and partial-oxidation of methane with respect to three objectives. MOO utilized a gravitational search algorithm and particle swarm optimization to tackle the problem. With respect to technical application it appears highly important to solve the reverse engineering problem as an optimization task with multiple contradicting objectives [13]. Various ML-based optimization strategies were addressed for the purpose of reverse engineering polymerization processes [21,22,23]. A genetic algorithm-based optimizer is proposed by Mohammadi et al. [9] to generate a variety of polymerization recipes at random and to send them to the kMC simulator for error evaluation.
This study provides a MOO-based reverse engineering approach, not only ensuring that the targeted MMD is obtained by means of minimizing the mean squared error (MSE), but also providing minimal reaction time and maximal monomer conversion. Frequently, there are several recipes for obtaining similar MMDs, which are referred to as candidate recipes in the following. The solution of the MOO approach is referred to as polymerization recipe, which includes temperature, reaction time as well as initial monomer and initiator concentrations. In the proposed MOO approach, the data for the selection of the optimal recipes from the search space is based on kMC simulations, as previously reported for training ML models [11]. The simulation provides the dependence of monomer conversion and the corresponding MMD on the polymerization conditions and reaction time. The proposed reverse engineering algorithm consists of several steps. First, the kMC simulations are run for the selected recipe search space to derive the MMDs and monomer concentrations as an input for the MOO step. Then, MOO is applied for a given target MMD and the Pareto optimal space is found on the base of the search space. In the last step, a weighted sum optimization function is used to calculate the weighted score of each candidate recipe, which is used for evaluating the solutions. The best candidate has the smallest score. To accelerate the MOO procedure an additional search space clustering on the basis of MMDs is considered. The approach pursued is illustrated in Figure 1. The method is tested for the following model system: chemically initiated vinyl acetate (VAc) radical polymerization model system at 60 °C using tert-butyl peroxypivalate as initiator.

2. Reverse Engineering Modeling Approach

2.1. Model Development

This section provides a formal description of solving a reverse engineering problem with an MOO approach. State of the art MOO methods [24] like genetic algorithms require to generate and evaluate new recipes in each step. Due to the high number of required steps, it is very demanding to perform on-line kMC simulations in loop [9].
Instead, here a recipe search space R is selected, where each recipe rR consists of reaction time t, the initial monomer concentration cm,0, and the initial initiator concentration cini,0. Then, the corresponding monomer concentration cm(r) and molar mass distribution MMD(r) are obtained for each rR via kMC simulation. In the MOO approach, the input is a target MMD, MMDtarget, and the output is a set of optimal candidate recipes R*: r 1 * , r 2 * , …, r n * . R* is a subset of the recipe search space R, R*R with MMD( r i * ) being close to MMDtarget as evaluated on the basis of the MSE as well as the maximal conversion and minimal time.
The optimization variables are presented in Table 1. The lower and upper limits for the variables cmon,0, cini,0, and t are defined by the simulated data. In Table 2 for a specific recipe r, the simulated values cm(r) and MMD(r) are used to calculate the values of three optimization objective functions: objective for the reaction time, ft(r), and objective for the mean squared error (MSE) between MMDtarget and the predicted MMD(r), fMSE. fcm(r) is used to turn the maximization problem of monomer conversion fconv(r) = (cm,0cm(r)) / cm,0 into a minimization problem, in which 1 − fconv(r) is minimized (Table 2).
The final decision takes user preferences into account by assigning specific weights to the objectives applying the weighted sum method [24,25]. Thus, the multi-objective function can be represented in a single-objective way. Then, the values of this function are calculated for each candidate recipe and the recipes with minimal values of the objective function are selected as a set of optimal solutions. A weight wi is assigned to each normalized objective function fi as follows:
min r f r = min r i w i f i r
where i w i = 1 , i ∈ {MSE, cm, t}, rR, and R is a polymerization recipe space. For clarity of presentation, the weight wcm of the objective fcm is also referred to as the weight of the conversion objective. If required the number of the objectives can be increased, e.g. the conversion of initiator can be added.
The steps of the proposed algorithm are presented in Figure 2 as direct approach. First, a search space RSR is selected (for details see Section 2.2) and for each rRS MMD(r) and cm(r) are obtained via kMC simulations. Then, MOO is performed over RS. First, the objective function values are calculated as follows: the values for the objective ft(r) are already included in r as t, the values of the objective fcm(r) are calculated in advance for all possible r according to Table 2, and the values of the objective fMSE are specified by MMDtarget. Further, based on the calculated objective function values, the Pareto front points RparRS leading to MMDtarget are identified. For this, the points from RS are represented in the Pareto optimal space, with coordinates specified by the three objectives. The Pareto front points are found in this Pareto optimal space, such that one value of the objective function cannot be improved without downgrading the value of another objective function. Finally, for the Pareto front points Rpar, the weights of each objective function are defined and a set of the best recipe candidates R*Rpar according to Eq. 1 is selected.
The above-described algorithm is improved with respect to the optimization time by means of clustering the search space, which is illustrated in Figure 2 as the clustering-supported approach. Clustering divides the search space RS into a number of clusters as illustrated in Figure 3. First, the search space RS is clustered on the base of MMD(r), which allows for selecting a cluster RtargetRS containing the MMDs, which are the closest to MMDtarget. In general, a larger number of clusters leads to a smaller number of MMDs per cluster gaining higher similarity of the distributions. However, there is less space for optimization regarding other objectives, e.g., such as polymerization time and monomer conversion. For this reason, an appropriate trade-off between the number of clusters and its size has to be identified. Upon appropriate clustering a cluster for the target MMD (Figure 2, red arrows) Rtarget is found. Then, by MOO the search space is reduced to the number of Pareto front points RparRtarget. Finally, after defining objective weights, the best recipe candidates R*Rpar are found according to Eq. 1. Since MOO is applied to a single cluster RtargetRS, which is considerably smaller than RS, the optimization time is significantly reduced.
Different methods can be applied for clustering the search space on the basis of MMD. Clustering of distributions and their representation in histograms is an important topic, which attracted a lot of attention, because of specific metrics, which should be used to compare the distributions. One of the most popular and fast clustering methods is the kMeans method [26]. A modified kMeans clustering algorithm was applied to the clustering of histograms [27]. Further, a novel non-parametric clustering algorithm of empirical probability distributions was proposed [28]. Here, the classical kMeans clustering method was used. This algorithm starts with a random separation of the MMDs into clusters. At each step, it recalculates the centroids of each cluster and relocates the data points to the new centroids. The clustering process finishes when the clusters are stable or the given number of iterations is reached. In this study, the simplest Euclidean distances are used for calculation of the distances between multi-dimensional data points, while specific metrics for clustering of distributions are also available [27,29].
There are different strategies for the data generation for the MOO procedure: use of exclusively in advance generated data from kMC simulation (Figure 2, blue arrows), on demand kMC-generated data, ML-generated data, kMC-based and ML-generated hybrid data sets, etc. Currently, as a first step, the focus is exclusively on the use of kMC-simulated data.

2.2. Data acquisition and Processing

The in-house developed kMC simulator mcPolymer was used to carry out the simulations required for the generation of polymerization data according to the search space [30]. The simulator allows for exporting the concentration profiles of all reactants and products as well as microstructural data like MMD, chain composition and branching of all polymeric species involved in the process. The simulator output was adapted to be well-machine readable. The data were filtered, further abstracted, logically connected, and stored in the well-structured no-SQL database MongoDB. The kMC simulated MMDs and monomer concentrations were obtained for the selected search space RS, allowing for MOO and subsequent finding of the weighted optimal solution.
The kMC simulations were performed for radical polymerizations with VAc as monomer, tert. butyl peroxypivalate as initiator, and methanol as solvent. The simulations are based on a full kinetic model for the VAc radical polymerization containing all elemental reactions [31]. The following polymerization conditions were used: constant temperature of 60 °C, cini,0 in the range of 1.0 to 20.0 mmol·L−1, and cm,0 in the range of 2.0 to 5.0 mol·L−1 with uniformly distributed grid size of cini,0 (geometrically scaled grid points) and cm,0 (arithmetic scaled grid points), resulting in 225 simulations of the process. The geometric scale was selected for cini,0 to put more attention on the small values of this parameter. The polymerization process was simulated for a constant reaction time of 6 hours and the properties of interest were recorded every 20 minutes, thus, obtaining in total 18 data points at different time moments for each investigated property. Thus, the data set contains 4050 different MMDs.
For training the ML prediction models for single-objective reverse engineering the obtained data set was divided into training and test set in proportion of 80:20. The same data was used for MOO, again taking 80 % of the data as training set for the search space RS and 20 % of the data as test set Rtest. The test set Rtest contains 810 recipes, which correspond to a set MMDtest consisting of 810 kMC-simulated MMDs. The evaluation of all optimization approaches was performed with MMDtest with each element serving as MMDtarget. In order to test the MMO approach a single MMD is selected from MMDtest and used as MMDtarget for the MMO approach. The performance of the MMO is evaluated by testing it with every MMD from MMDtest.

3. Results and Discussion

In order to compare results obtained via the previously described ML modeling-based reverse engineering strategy [11] with data from the MOO approach introduced in this work, the technique reported for butyl acrylate was adopted for the polymerization of vinyl acetate. Previously, it was described how an ML regression model provides a recipe r = (cm,0, cini,0, T) for a fixed time for a given MMDtarget. cm,0 and cini,0 are initial concentrations of the monomer and the initiator, respectively, and T is the polymerization temperature. The prediction was performed with the random forest method for a target MMD to minimize the errors in the predicted recipe r(MMD). In the case of VAc polymerizations, the temperature is kept constant at T = 60 °C and the reaction time t was predicted by the model: r = (cm,0, cini,0, t). The reverse engineering result for a sample MMDtarget presented in Figure 4 corresponds to the recipe (cm,0; cini,0; t)/ (2.0 mol/L; 1.9 mmol/L; 140 min). Only very small differences between the target and predicted MMD as well as the predicted recipe are seen. The overall performance of the developed model was evaluated using the R2 determination coefficient, which is a measure for the quality of the prediction by the ML model. Keeping in mind that R2=1 indicates perfect prediction, it is remarkable to note that a rather low value of 0.78 for R2 is obtained while visual inspection of the MMDs for the representative example given in Figure 4 indicates only minor differences. However, this ML-based approach does not consider multiple contradicting optimization objectives and does not provide multiple alternative solutions as proposed in the Pareto optimization.

3.1. Direct Pareto Optimization

Figure 5A shows all Pareto front points with three objectives fi, i ∈ {MSE, cm, t}. After definition of objective weights, the optimal solution from the Pareto front points is selected, which satisfies the weighted score calculated with Eq. 1 best. Several combinations of objective weights are considered: time focused, conversion focused, MSE focused. For example, in the time focused case, the weight of time is higher than the other weights. To identify the optimal recipe leading to the target MMD, the MSE should have a higher weight than the other objectives. To reduce the number of Pareto front points an MSE limit is chosen. As an example, the data in Figure 3 was obtained with an MSE limit of 2∙10−3.
The optimization procedure is illustrated in Figure 5 for an example target MMD, which corresponds to the recipe (cm,0; cini,0; t) / (2.0 mol/L; 1.9 mmol/L; 140 min). Rather than presenting the objective fcm, which represents the minimized monomer concentration, the conversion is shown in Figure 5A,B. Figure 5B demonstrates the reduction of the number of Pareto front points from 340 points (A) to 46 points (B) by filtering with the MSE limit of 2∙10−3. The MMDs of all Pareto front points and of the filtered Pareto front points are given in Figure 5C,D, respectively. The filtering approach allows for considering only MMDs of similar shape. Figure 6 shows the impact of different MSE limits on the number of Pareto front points. Even with an MSE limit of 10−3 the average number of points is around 30, which is a relevant number for consideration of other objectives. The minimal number of filtered Pareto points for the MSE limit of 10−3 is about 10.
Figure 7 allows for comparison of MSEs obtained via ML prediction-based (left) and MOO-based approaches (right) for reverse engineering calculated for all elements of MMDtest. Minimal MSEs are preferable. For the ML-based prediction (left), the MSE values of almost all MMDtarget from MMDtest are less than 10−3, while most MSEs obtained by the MOO-based approach with an MSE limit of 2∙10−3 are by two orders of magnitude lower. Most values are less than 10−5.
The next step is to evaluate the candidate recipes based on the weighted sum of the objective values according to Eq. 1. Figure 8 and Table 3 present the candidate recipes for the different combinations of objective weights indicated. In Figure 8, a color code is assigned to the individual candidate recipes according to the score of the weighted sum of the values of the three objective functions fMSE, fcm, ft calculated according to Eq. 1. For the weights wMSE = 0.9, wcm = 0.02, and wt = 0.08 in Figure 8A the focus is on the MSE. As a minimization problem is solved, a minimal score value (visualized by big pink points) determines the best candidates, given by the recipes with IDs 0, 15, 16 in Table 3. Other weight combinations with focus on the monomer conversion in Figure 8B and with equal weights in Figure 8C demonstrate, which recipe candidates are selected depending on specific requirements.

3.2. Clustering Supported Pareto Optimization

The clustering-supported optimization is performed for the same MMDtarget as in the previous section. It is assumed that all points contained in the target cluster Rtarget have satisfactory MSEs. For this reason, only the two objectives fcm and ft have been chosen to be optimized. Figure 9 shows Rtarget in the coordinate space, which corresponds to the objective functions. The Pareto front points RparRtarget are indicated by colored markers. Table 4 presents all Pareto front points with the corresponding recipes and objective weights.
To find the weighted solution for a time focused result with wcm = 0.2 and wt = 0.8 the candidate with ID 1 reaching a conversion of 47 % at a reaction time of 40 min is best. If conversion is considered to be more important than reaction time as in the case of wcm = 0.8 and wt = 0.2, the candidate with ID 4 is the best solution leading to a conversion of 77 % with a reaction time of 100 min.
MMD clustering focuses only on a single objective function fMSE when identifying the most similar cluster for the target MMD. The accuracy is determined on the basis of the MSE, which represents the deviation of the selected solution from MMDtarget and depends on the cluster size, and consequently, on the number of clusters. For a small number of clusters, every cluster contains a broader spectrum of MMDs as compared to a large number of clusters and leads to an increase in the MSEs of MMDtarget and the MMDs contained in this cluster. However, the cluster covers a larger value range of the objective function, allowing for a broader range of predicted recipes. Figure 10 depicts how the MMDtarget can be reached for clustering the search space into 20, 40, and 60 clusters, respectively. Figure 10A–C shows the points of the target clusters in the objective function space. The shape of the clusters is similar. As the number of clusters grows, the number of candidates per cluster decreases, because there are fewer points fitting the required accuracy with respect to the MMD. Therefore, the size of the whole cluster shrinks. The red points in Figure 10 mark the Pareto front points, from which the optimal candidates are selected.
Figure 10D–F shows the MMDs for all Pareto front points. With increasing number of clusters, the MSE of MMDtarget and Pareto front MMDs decreases. The objective function value range also decreases as illustrated in Table 5. For the example, given fMSE of MMDtarget improves by a factor of 100 when going from a total number of 20 clusters to a total number of 60 clusters. This improvement is achieved by a loss in conversion, which is reduced from 58 % to 27 %. In this example, reaction time is also lowered with increasing number of clusters. The last two rows in Table 5 provide the scores calculated by Eq. 1 for the best recipes obtained with clustering of the search space into 20, 40, and 60 clusters for two cases with a different number of objectives. Note, that only the scores obtained with the same combination of weights (given in a single row of Table 10) are comparable. For equal weights of two objectives (wcm = 0.5, wt = 0.5) the minimal score of 0.39 was obtained with clustering into 40 clusters. Although the optimization has been carried out within one cluster using only two objectives, the results can be related to the direct Pareto optimization, including the MSE objective. It is shown that for equal weight of three objectives (wMSE = 1/3, wcm = 1/3, wt = 1/3) the minimal score of 0.30 was also obtained with the same clustering into 40 clusters.
The comparison of the direct and the clustering-supported Pareto approach is based on the MSE limit. For the direct approach, the MSE limit can be chosen by the operator. For the clustering-supported approach, the individual MSE limit for a given MMDtarget, which is evidently depending on the number of clusters, is defined as the maximal MSE value for all MMDs contained in Rtarget.
Both approaches were compared on the basis of MMDtest. Then, for the clustering supported approach, the MSE limit is defined as the maximal value of all individual MSE limits over MMDtest. Figure 11 shows the dependency of the MSE limit on the total number of clusters. For more than 30 clusters, the MSE limit is almost constant at a level of 0.003. The MSE limit increases rapidly for less than 30 clusters.
Figure 12A illustrates the average number of Pareto front points depending on the MSE limit. Increasing the MSE limit results in a reduction in the number of clusters (see Figure 11) associated with an enlargement in cluster size, and thereby leading to a higher count of Pareto front points. The clustering-supported approach is more selective considering the average number of Pareto front points. The standard deviation is used to describe the value ranges of the objective function and of the MSE function. Wider ranges are advantageous, because they provide more space for optimization. Averaging the value ranges of each objective function over MMDtest yields the average ranges illustrated in Figure 12B–D. The value ranges of the reaction time and conversion are wider (Figure 12C,D) for the clustering supported approach, while the MSE value range is more restricted (Figure 12B). The findings indicate that the clustering approach allows for a better optimization of monomer conversion and reaction time while the direct approach is better suited for optimization of the MSEs.
Table 6 presents the execution time for both approaches for the test set MMDtest consisting of 810 target MMDs. For the clustering-supported approach, the optimization time depends slightly on the number of clusters. However, it is by a factor of ten smaller than for the direct approach.
The rapidness of the MOO algorithm is especially important, for the next step, when in future genetic algorithms together with ML models will be applied to extend the search space for the polymerization parameters, because Pareto optimization should be executed many times for each evolution stage of genetic algorithms. In this situation, the clustering supported Pareto optimization is preferable.

4. Conclusions and Outlook

The study bridges the gap between multi-objective optimization (MOO) and its application for reverse engineering of polymerization processes. The proposed MOO models allow for simulation-supported determination of an optimal recipe for a targeted molar mass distribution, taking multiple objectives, e.g., minimal reaction time, maximal conversion of monomer, and similarity of the found MMD compared to the target MMD into consideration. The proposed approach can be accelerated by an additional clustering of the simulated MMDs. Moreover, it is possible to obtain a number of suitable candidates considering different weights of the selected objectives. A set of alternative optimal solutions is obtained for each specific combination of weights. First insights are provided on the formulation of a polymerization reverse engineering task as a MOO problem. In future, the understanding gained in combination with previously proposed ML models [11] for polymerizations will be applied to facilitate and to speed-up finding of optimal solutions with a limited number of kMC simulations. ML models can be used to set-up the search space and to evaluate the candidate polymerization procedures rather than using kMC simulations. The search for candidate solutions will be performed with the help of genetic algorithms. Moreover, the consideration of more complex microstructural details, e.g., such as branching in high-temperature polymerization of acrylate [32], may require to increase the number of objectives in MOO.

Data Availability Statement

The data presented in this study are openly available in Mendeley Data at doi:10.17632/hdrwmxgx27.1 with reference: https://data.mendeley.com/preview/hdrwmxgx27?a=d024c5aa-a100-473f-b641-bc0d3f11ebb5.

Acknowledgments

This work was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – 466601458 – within the Priority Programme “SPP 2331: Machine Learning in Chemical Engineering”.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Ballard, N.; Asua, J.M. Radical polymerization of acrylic monomers: An overview. Prog. Polym. Sci. 2018, 79, 40–60. [Google Scholar] [CrossRef]
  2. Meszena, Z.G.; Johnson, A.F. Modelling and simulation of polymerisation processes. Comput. Chem. Eng. 1999, 23, 375–378. [Google Scholar] [CrossRef]
  3. Saldívar-Guerra, E. Numerical Techniques for the Solution of the Molecular Weight Distribution in Polymerization Mechanisms, State of the Art. Macromol. React. Eng. 2020, 14, 2000010. [Google Scholar] [CrossRef]
  4. Gómez-Reguera, J.A.; Vivaldo-Lima, E.; Gabriel, V.A.; Dubé, M.A. Modeling of the Free Radical Copolymerization Kinetics of n-Butyl Acrylate, Methyl Methacrylate and 2-Ethylhexyl Acrylate Using PREDICI®. Processes 2019, 7, 395. [Google Scholar] [CrossRef]
  5. Trigilio, A.D.; Marien, Y.W.; Van Steenberge, P.H.M.; D’hooge, D.R. Gillespie-Driven kinetic Monte Carlo Algorithms to Model Events for Bulk or Solution (Bio)Chemical Systems Containing Elemental and Distributed Species. Ind. Eng. Chem. Res. 2020, 59, 18357–18386. [Google Scholar] [CrossRef]
  6. Peikert, P.; Pflug, K.M.; Busch, M. Modeling of High-Pressure Ethene Homo- and Copolymerization. Chem. Ing. Tech. 2019, 91, 673–677. [Google Scholar] [CrossRef]
  7. Trigilio, A.D.; Marien, Y.W.; Van Steenberge, P.H.M.; D’hooge, D.R. Toward an Automated Convergence Tool for Kinetic Monte Carlo Simulation of Conversion, Distributions, and Their Averages in Non-dispersed Phase Linear Chain-Growth Polymerization. Ind. Eng. Chem. Res. 2023, 62, 2583–2593. [Google Scholar] [CrossRef]
  8. Brandão, A.L.T.; Soares, J.B.P.; Pinto, J.C.; Alberton, A.L. When Polymer Reaction Engineers Play Dice: Applications of Monte Carlo Models in PRE. Macromol. React. Eng. 2015, 9, 141–185. [Google Scholar] [CrossRef]
  9. Mohammadi, Y.; Saeb, M.R.; Penlidis, A.; Jabbari, E.; J Stadler, F.; Zinck, P.; Matyjaszewski, K. Intelligent Machine Learning: Tailor-Making Macromolecules. Polymers 2019, 11, 579. [Google Scholar] [CrossRef]
  10. Mohammadi, Y.; Penlidis, A. Polymerization Data Mining: A Perspective. Adv. Theory Simul. 2019, 2, 1800144. [Google Scholar] [CrossRef]
  11. Fiosina, J.; Sievers, P.; Drache, M.; Beuermann, S. Polymer reaction engineering meets explainable machine learning. Comput. Chem. Eng. 2023, 177, 108356. [Google Scholar] [CrossRef]
  12. Spyromitros-Xioufis, E.; Tsoumakas, G.; Groves, W.; Vlahavas, I. Multi-target regression via input space expansion: treating targets as inputs. Mach. Learn. 2016, 104, 55–98. [Google Scholar] [CrossRef]
  13. Ngatchou, P.; Zarei, A.; El-Sharkawi, A. Pareto Multi Objective Optimization. In Proceedings of the 13th International Conference on Intelligent Systems Application to Power Systems, 2005. 13th International Conference on, Intelligent Systems Application to Power Systems, Arlington, Virginia, USA, 06–10 Nov. 2005; IEEE / Institute of Electrical and Electronics Engineers Incorporated. 2005; pp. 84–91, ISBN 1-59975-174-7. [Google Scholar]
  14. Gunantara, N. A review of multi-objective optimization: Methods and its applications. Cogent Eng. 2018, 5, 1502242. [Google Scholar] [CrossRef]
  15. Doumpos, M.; Zopounidis, C. Multi-objective optimization models in finance and investments. J. Glob. Optim. 2020, 76, 243–244. [Google Scholar] [CrossRef]
  16. Gupta, S.; Haq, A.; Ali, I.; Sarkar, B. Significance of multi-objective optimization in logistics problem for multi-product supply chain network under the intuitionistic fuzzy environment. Complex Intell. Syst. 2021, 7, 2119–2139. [Google Scholar] [CrossRef]
  17. Bhaskar, V.; Gupta, S.K.; Ray, A.K. APPLICATIONS OF MULTIOBJECTIVE OPTIMIZATION IN CHEMICAL ENGINEERING. Rev. Chem. Eng. 2000, 16, 1–54. [Google Scholar] [CrossRef]
  18. Murugan, C.; Subbaian, S. Multi-Objective Optimization for Enhanced Ethanol Production during Whey Fermentation. In 2022 International Conference on Power, Energy, Control and Transmission Systems (ICPECTS), Chennai, India, 08–09 Dec. 2022; IEEE, 2022; pp 1–7, ISBN 978-1-6654-6275-4.
  19. Fiandaca, G.; Fraga, E.S.; Brandani, S. A multi-objective genetic algorithm for the design of pressure swing adsorption. Eng. Optim. 2009, 41, 833–854. [Google Scholar] [CrossRef]
  20. Ganesan, T.; Elamvazuthi, I.; Ku Shaari, K.Z.; Vasant, P. Swarm intelligence and gravitational search algorithm for multi-objective optimization of synthesis gas production. Appl. Energy 2013, 103, 368–374. [Google Scholar] [CrossRef]
  21. Charoenpanich, T.; Anantawaraskul, S.; Soares, J.B.P. Using Artificial Intelligence Techniques to Design Ethylene/1-Olefin Copolymers. Macromol. Theory Simul. 2020, 29, 2000048. [Google Scholar] [CrossRef]
  22. Dragoi, E.N.; Curteanu, S. The use of differential evolution algorithm for solving chemical engineering problems. Rev. Chem. Eng. 2016, 32, 149–180. [Google Scholar] [CrossRef]
  23. Fernandes, F.A.N.; Lona, L.M. Neural network applications in polymerization processes. Braz. J. Chem. Eng. 2005, 22, 401–418. [Google Scholar] [CrossRef]
  24. Konak, A.; Coit, D.W.; Smith, A.E. Multi-objective optimization using genetic algorithms: A tutorial. Reliab. Eng. Syst. Saf. 2006, 91, 992–1007. [Google Scholar] [CrossRef]
  25. Marler, R.T.; Arora, J.S. The weighted sum method for multi-objective optimization: new insights. Struct. Multidisc. Optim. 2010, 41, 853–862. [Google Scholar] [CrossRef]
  26. Hartigan, J.A.; Wong, M.A. Algorithm AS 136: A K-Means Clustering Algorithm. Appl. Stat. 1979, 28, 100–108. [Google Scholar] [CrossRef]
  27. Nielsen, F.; Nock, R.; Amari, S. On Clustering Histograms with k-Means by Using Mixed α-Divergences. Entropy 2014, 16, 3273–3301. [Google Scholar] [CrossRef]
  28. Henderson, K.; Gallagher, B.; Eliassi-Rad, T. EP-MEANS: An Efficient Nonparametric Clustering of Empirical Probability Distributions. In Proceedings of the 30th Annual ACM Symposium on Applied Computing. SAC 2015: Symposium on Applied Computing, Salamanca Spain, 13–17 Apr. 2015; Wainwright, R.L., Corchado, J.M., Bechini, A., Hong, J., Eds.; ACM: New York, NY, USA, 2015; pp 893–900, ISBN 9781450331968.
  29. Banerjee, A.; Merugu, S.; Dhillon, I.S.; Ghosh, J. Clustering with Bregman Divergences. J. Mach. Learn. Res. 2005, 6, 1705–1749. [Google Scholar]
  30. Drache, M.; Drache, G. Simulating Controlled Radical Polymerizations with mcPolymer—A Monte Carlo Approach. Polymers 2012, 4, 1416–1442. [Google Scholar] [CrossRef]
  31. Feuerpfeil, A.; Drache, M.; Jantke, L.-A.; Melchin, T.; Rodríguez-Fernández, J.; Beuermann, S. Modeling Semi-Batch Vinyl Acetate Polymerization Processes. Ind. Eng. Chem. Res. 2021, 60, 18256–18267. [Google Scholar] [CrossRef]
  32. Mätzig, J.; Drache, M.; Drache, G.; Beuermann, S. Kinetic Monte Carlo Simulations as a Tool for Unraveling the Impact of Solvent and Temperature on Polymer Topology for Self-Initiated Butyl Acrylate Radical Polymerizations at High Temperatures. Macromol. Theory Simul. 2023, 32, 2300007. [Google Scholar] [CrossRef]
Figure 1. Illustration of the polymerization reverse engineering approach by means of MOO.
Figure 1. Illustration of the polymerization reverse engineering approach by means of MOO.
Preprints 98043 g001
Figure 2. Direct Pareto optimization vs. clustering supported Pareto optimization.
Figure 2. Direct Pareto optimization vs. clustering supported Pareto optimization.
Preprints 98043 g002
Figure 3. Illustration of the clustering approach of the search space. In the middle two exemplary clusters of MMDs are presented, on the right all clusters defined are given.
Figure 3. Illustration of the clustering approach of the search space. In the middle two exemplary clusters of MMDs are presented, on the right all clusters defined are given.
Preprints 98043 g003
Figure 4. An example of reverse engineering with ML modeling comparing the MMD (left) and the recipe (right) for a single MMDtarget, with MMDtarget being an element of MMDtest. The target recipe is the recipe associated with MMDtarget.
Figure 4. An example of reverse engineering with ML modeling comparing the MMD (left) and the recipe (right) for a single MMDtarget, with MMDtarget being an element of MMDtest. The target recipe is the recipe associated with MMDtarget.
Preprints 98043 g004
Figure 5. Pareto front points (A), filtered Pareto front points (B), MMDs for all Pareto front points (C), and MMDs for filtered Pareto front points (D). The target MMD is given in bold black line, the MSE limit was set to 2∙10−3.
Figure 5. Pareto front points (A), filtered Pareto front points (B), MMDs for all Pareto front points (C), and MMDs for filtered Pareto front points (D). The target MMD is given in bold black line, the MSE limit was set to 2∙10−3.
Preprints 98043 g005
Figure 6. The influence of the MSE limit on the distribution of the number of filtered Pareto front points based on MMDtest.
Figure 6. The influence of the MSE limit on the distribution of the number of filtered Pareto front points based on MMDtest.
Preprints 98043 g006
Figure 7. MSEs of the MMDs of the recipes obtained by ML prediction based (left) and MOO (right) approaches on the basis of MMDtest.
Figure 7. MSEs of the MMDs of the recipes obtained by ML prediction based (left) and MOO (right) approaches on the basis of MMDtest.
Preprints 98043 g007
Figure 8. Influence of the combination of objective weights on the selection of the optimal recipes for a sample MMDtarget: time focused (wMSE = 0.05, wcm = 0.05, wt = 0.9) (A), monomer conversion focused (wMSE = 0.01, wcm = 0.8, wt = 0.19) (B), equal weights (wMSE = 1/3, wcm = 1/3, wt = 1/3) (C), MSE focused (wMSE = 0.9, wcm = 0.09, wt = 0.01) (D). The best three solutions are shown with their IDs for each case.
Figure 8. Influence of the combination of objective weights on the selection of the optimal recipes for a sample MMDtarget: time focused (wMSE = 0.05, wcm = 0.05, wt = 0.9) (A), monomer conversion focused (wMSE = 0.01, wcm = 0.8, wt = 0.19) (B), equal weights (wMSE = 1/3, wcm = 1/3, wt = 1/3) (C), MSE focused (wMSE = 0.9, wcm = 0.09, wt = 0.01) (D). The best three solutions are shown with their IDs for each case.
Preprints 98043 g008
Figure 9. Clustering supported Pareto optimization. A: Illustration of the target cluster Rtarget. The Pareto front points RparRtarget are marked (colored markers) and are listed in Table 4. A time focused result (wcm = 0.2, wt = 0.8), an equal weighted result (wcm = 0.5, wt = 0.5) and a conversion focused result (wcm = 0.8, wt = 0.2) are highlighted (1, 2, 3). The corresponding MMDs in comparison with MMDtarget are shown in B.
Figure 9. Clustering supported Pareto optimization. A: Illustration of the target cluster Rtarget. The Pareto front points RparRtarget are marked (colored markers) and are listed in Table 4. A time focused result (wcm = 0.2, wt = 0.8), an equal weighted result (wcm = 0.5, wt = 0.5) and a conversion focused result (wcm = 0.8, wt = 0.2) are highlighted (1, 2, 3). The corresponding MMDs in comparison with MMDtarget are shown in B.
Preprints 98043 g009
Figure 10. Cluster size dependence. A-C: Clusters for MMDtarget with a total number on 20 (A), 40 (B) or 60 (C) clusters. D-F: MMDs for the Pareto front points.
Figure 10. Cluster size dependence. A-C: Clusters for MMDtarget with a total number on 20 (A), 40 (B) or 60 (C) clusters. D-F: MMDs for the Pareto front points.
Preprints 98043 g010
Figure 11. MSE limit calculated for the clustering of the search space.
Figure 11. MSE limit calculated for the clustering of the search space.
Preprints 98043 g011
Figure 12. Comparison of the direct and the clustering-supported approach considering (A) the average number of Pareto front points, (B) the average MSE range, (C) the average polymerization time range, and (D) the average conversion range.
Figure 12. Comparison of the direct and the clustering-supported approach considering (A) the average number of Pareto front points, (B) the average MSE range, (C) the average polymerization time range, and (D) the average conversion range.
Preprints 98043 g012
Table 1. Optimization variables.
Table 1. Optimization variables.
variables description restrictions
cm,0 initial monomer concentration cm,0(min)cm,0cm,0(max)
cini,0 initial initiator concentration cini,0(min)cini,0cini,0(max)
t reaction time tminttmax
r=[cm,0, cini,0, t] initial recipe
Table 2. Optimization objective functions and their calculation on the basis of simulated data.
Table 2. Optimization objective functions and their calculation on the basis of simulated data.
objectives description
m i n r  fMSE(r)= min r [ M S E ( M M D t a r g e t , M M D ( r ) ) ] minimal MSE, where MMD(r) is simulated
m i n r  fcm(r)= min r 1 f c o n v ( r ) = min r c m r c m , 0 minimal relative monomer concentration
m i n r  ft(r) = m i n r  t minimal reaction time (directly from r)
Table 3. The best recipes for each combination of objective weights (the focal weight is given in bold).
Table 3. The best recipes for each combination of objective weights (the focal weight is given in bold).
IDs of the best recipe 1 34 35 46 20 33
wi time focus (A) wi conversion focus (B)
cm,0 / mol∙L−1 2.0 2.21 2.21 2.21 2.43 2.21
cini,0 / mmol∙L−1 20 20 20 8.4 16.1 10.5
time / min 0.9 20 40 60 0.19 120 80 100
MSE /∙10−3 0.05 1.63 1.37 0.35 0.01 1.72 1.58 1.16
conversion / % 0.05 25.9 46.6 61.8 0.8 71.3 69.6 68.7
IDs of the best recipe 35 32 21 43 12 0
wi equal weights (C) wi MSE focus (D)
cm,0 / mol∙L−1 2.21 2.21 2.21 2.0 2.0 2.0
cini,0 / mmol∙L−1 20.0 10.5 13.0 12.4 10.0 8.5
time / min 1/3 60 80 80 0.01 180 200 60
MSE / ∙10−5 1/3 35 31 54 0.9 0.086 0.032 0.64
conversion / % 1/3 61.8 60.2 64.4 0.09 48.3 48.2 45.1
Table 4. Pareto front points Rpar and their recipes and objective functions. Every highlighted point (bold letters) is the best result for the corresponding example objective weights.
Table 4. Pareto front points Rpar and their recipes and objective functions. Every highlighted point (bold letters) is the best result for the corresponding example objective weights.
objective weights cand. ID cm,0 /
mol∙L−1
cini,0 /
mmol∙L−1
time /
min
MSE /
10−3
conversion /
%
0 2.00 20.0 20 1.63 25.9
time focus:
(wcm = 0.2, wt = 0.8)
1 2.21 20.0 40 1.37 46.6
2 2.43 20.0 60 1.94 62.7
equal weights:
(wcm = 0.5, wt = 0.5)
3 2.43 20.0 80 1.81 73.7
conversion focus:
(wcm = 0.8, wt = 0.2)
4 2.43 16.1 100 2.61 77.7
5 2.43 6.86 160 2.79 78.3
6 2.43 5.54 180 2.83 78.5
7 2.43 2.91 260 2.98 79.0
Table 5. Pareto optimal results with balanced weights for different number of clusters.
Table 5. Pareto optimal results with balanced weights for different number of clusters.
Number of clusters 20 40 60
cand. ID/ 3 2 1
property wi
cm,0 / mol∙L−1 3.50 3.07 2.86
cini,0 / mmol∙L−1 13.0 20.0 20.0
time / min 0.5 60.0 40.0 20.0
MSE /∙10-3 3.04 0.32 0.03
conversion / % 0.5 58.2 49.2 27.0
score (wt = 0.5, wcm = 0.5) 0.50 0.39 0.50
score (wMSE = 1/3, wcm = 1/3, wt = 1/3) 0.67 0.30 0.33
Table 6. Comparison of optimization time of direct and clustering supported approaches.
Table 6. Comparison of optimization time of direct and clustering supported approaches.
approach direct clustering supported approach
20 clusters 40 clusters 60 clusters 80 clusters 100 clusters
execution time, s 679 56 25 18 14 12
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permit the free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Prerpints.org logo

Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.

Subscribe

© 2024 MDPI (Basel, Switzerland) unless otherwise stated