1. Introduction
Rocks are the foundational constituents of the Earth and record the evolutionary history of our planet, which gives them a pivotal role across the Earth sciences. Because rocks are composed of a variety of minerals, accurate mineral identification is of paramount importance [1,2]. Traditional identification techniques rely primarily on visual observation of physical properties such as shape, color, and texture, so their precision depends on the expertise of the observer [1,2]. Methods such as chemical analysis, X-ray diffraction analysis, differential thermal analysis, and polarizing microscope analysis are more accurate, but they are expensive, time-consuming, and, above all, can damage the sample [3,4,5,6,7]. In contrast to these resource-intensive approaches, acquiring mineral images is fast, efficient, and cost-effective. Consequently, a growing body of research has turned to image-based mineral identification.
In particular, numerous studies have used deep learning to identify minerals from images, with good results [3,4,5,6,7]. However, conventional deep learning methods have an intrinsic limitation: they can identify only minerals that fall within the distribution of the training dataset. Any mineral outside this distribution is erroneously assigned to one of the predefined classes of the training datasets, which is a clear misclassification. The limitation is exacerbated by the diversity of minerals: more than 6000 mineral categories are known worldwide [1], so it is impractical to include all of them in the training datasets. Minerals outside the scope of the training datasets must first be isolated and then identified by other means, such as manual or instrumental techniques.
Current strategies for addressing this limitation, namely that conventional deep learning models recognize only the in-distribution (ID) categories of the training set, include out-of-distribution (OOD) detection [8], uncertainty estimation [9], semi-supervised learning [10], and generative models [11]. OOD detection methods have proven particularly reliable: they give accurate predictions for samples outside the training set distribution and require only in-distribution data for training [12,13,14,15,16].
For example, Jiang et al. [17] used OOD detection to distinguish between known and unknown plant diseases, and Saadati et al. [18] applied it to improve the robustness of insect classification models. OOD detection has also shown promise in medical image diagnosis [19], network security [20], and quality control [21]. In light of these precedents, the isolation and identification of out-of-distribution minerals deserve specific attention. The main contributions of this paper are as follows:
- (1) OOD detection is adopted for the identification of minerals residing outside the training set's distribution, providing an opportunity for further identification of these instances.
- (2) A machine learning model that combines One-Class Support Vector Machines (OCSVM) with ResNet is designed for mineral identification.
- (3) Comprehensive experiments show the high performance of the proposed model.
2. Datasets
In this study, we collected a dataset of 183,688 mineral images covering 36 categories of common minerals, as detailed in Table 1. The images were drawn from the datasets of Zeng et al. [6] and Wu et al. [3] and from the mineral database Mindat.org [22]. The dataset is divided into training, validation, and testing subsets in a ratio of 8:1:1. Some of the 36 categories of mineral images are shown in Figure 1. In addition to this in-distribution dataset, a separate collection of 18,368 mineral images was assembled. These images correspond to 15 categories of minerals, listed in Table 2, and were also acquired from Mindat.org. This auxiliary dataset represents out-of-distribution minerals and is used to assess the model's ability to recognize mineral types beyond the training set. Some of the out-of-distribution mineral images are shown in Figure 2.
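As an illustration of the 8:1:1 division, the following is a minimal sketch of a shuffled split. The function name `split_8_1_1`, the fixed seed, and the file names are hypothetical, and the source does not say whether the split was stratified by mineral category, so this sketch splits the pooled list only.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_8_1_1(items: Sequence[T], seed: int = 0) -> Tuple[List[T], List[T], List[T]]:
    """Shuffle items and split them into train/validation/test parts (8:1:1)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_train = len(shuffled) * 8 // 10      # integer arithmetic avoids float rounding
    n_val = len(shuffled) // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # any remainder goes to the test subset
    return train, val, test


# Hypothetical file names standing in for real image paths.
paths = [f"img_{i:03d}.jpg" for i in range(100)]
train, val, test = split_8_1_1(paths)
print(len(train), len(val), len(test))
```

In practice the same function could be applied per category to keep class proportions equal across the three subsets.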
3. Methodology
The methodology for mineral identification is illustrated in Figure 3. To detect minerals that fall outside the set of 36 known minerals, a One-Class Support Vector Machine (OCSVM) is used for out-of-distribution (OOD) detection. As in previous works [23,24,25], the process begins with feature extraction from the mineral image, which improves the effectiveness of OCSVM. A Deep Neural Network (DNN) is integrated into our model to extract mineral-specific features. The DNN is trained on the training set, which comprises the 36 known mineral categories listed in Table 1. OCSVM then takes the mineral features produced by the initial layers of the DNN as input and determines whether the mineral belongs to the 36 known in-distribution categories or is out-of-distribution. If OCSVM classifies the mineral as out-of-distribution, the model stops and informs the user that the input image shows an unknown mineral. If OCSVM classifies the mineral as in-distribution, the model runs the remaining layers of the DNN and reports the specific known mineral category to which the input image belongs.
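The two-stage decision described in this section can be sketched as plain control flow. This is a minimal illustration, not the authors' implementation: `identify_mineral` and its three callables are hypothetical stand-ins for the early DNN layers, the OCSVM decision, and the remaining DNN layers, and the toy lambdas exist only to make the example executable.

```python
from typing import Callable, List, Sequence

UNKNOWN = "unknown mineral"


def identify_mineral(
    image: Sequence[float],
    early_layers: Callable[[Sequence[float]], List[float]],
    is_in_distribution: Callable[[List[float]], bool],
    remaining_layers: Callable[[List[float]], str],
) -> str:
    """Return a known category name, or UNKNOWN if the OOD detector rejects the image."""
    features = early_layers(image)        # stage 1: features from the initial layers
    if not is_in_distribution(features):  # stage 2: one-class decision on those features
        return UNKNOWN                    # stop early, skip the classifier
    return remaining_layers(features)     # stage 3: finish the forward pass


# Toy stand-ins (assumptions for illustration only).
toy_early = lambda img: [sum(img) / len(img)]
toy_detector = lambda feats: feats[0] < 1.0
toy_classifier = lambda feats: "quartz" if feats[0] < 0.5 else "calcite"

known = identify_mineral([0.2, 0.4], toy_early, toy_detector, toy_classifier)
unknown = identify_mineral([5.0, 7.0], toy_early, toy_detector, toy_classifier)
print(known, "|", unknown)
```

Passing the stages in as callables keeps the early exit explicit: the classifier is never evaluated for an image that the detector rejects.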
3.1. Mineral Feature Extraction
Mineral feature extraction relies on ResNet, a convolutional neural network architecture with a proven track record in image classification [26]. Assume the feature map extracted by ResNet has size $W \times H \times C$ (W is the width, H is the height, and C is the number of channels of the extracted feature). To improve the performance of OCSVM in OOD detection, a dimensionality reduction step is introduced. As described by Formula (1), the individual channel values $x_1, x_2, \ldots, x_C$ are concatenated to create a more concise representation $x$. Each $x_k$ is the value derived from the kth channel of the mineral feature map, as given in Formula (2). This dimensionality reduction facilitates OOD detection and improves the overall performance of the model.
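The reduction from a W × H × C feature map to a vector with one value per channel can be sketched as follows. Formula (2) is not reproduced in this text, so the sketch assumes that each channel value is the spatial mean of that channel (global average pooling); the source does not confirm this choice, and `reduce_feature_map` is a hypothetical name.

```python
import numpy as np


def reduce_feature_map(feature_map: np.ndarray) -> np.ndarray:
    """Collapse a (W, H, C) feature map into a length-C vector.

    ASSUMPTION: each per-channel value is the mean over the spatial
    positions of that channel (global average pooling).
    """
    if feature_map.ndim != 3:
        raise ValueError("expected an array of shape (W, H, C)")
    return feature_map.mean(axis=(0, 1))  # average over width and height


# Toy feature map with W=2, H=3, C=4.
fmap = np.arange(2 * 3 * 4, dtype=float).reshape(2, 3, 4)
x = reduce_feature_map(fmap)
print(x.shape)
```

Whatever pooling is used, the result has C entries instead of W × H × C, which is what keeps the input to the one-class classifier small.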
3.2. OOD Detection by OCSVM
To determine whether an input image belongs to the 36 known in-distribution minerals, the mineral-specific features x extracted by the DNN are provided as input to the OCSVM. These features are mapped to a higher-dimensional space, as outlined in Formula (3).
The classification outcome for the input image depends on the result of Formula (3): if the result is greater than zero, the image is identified as an in-distribution mineral; if the result is less than or equal to zero, the image is classified as an out-of-distribution mineral. In Formula (3), $\mathrm{sgn}(\cdot)$ denotes the sign function and $x_i$ denotes the features derived from the ith known mineral training sample. $K(x_i, x)$ represents the Radial Basis Function (RBF) kernel, given in Formula (4), which maps the known mineral training data into a higher-dimensional space with the objective of maximizing the separation between these training data points and the origin in that space. The parameters $\alpha_i$ and $\rho$ are determined through training on the known mineral training datasets.
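For reference, the textbook one-class SVM decision function and RBF kernel take the form below. This is offered as an assumption about what Formulas (3) and (4) express, not as a reproduction of them. Here $\alpha_i$ and $\rho$ denote the coefficients and offset learned in training, $\gamma$ is the kernel parameter, and $n$ (the number of training samples) is a symbol introduced only for this sketch.

```latex
f(x) = \mathrm{sgn}\!\left( \sum_{i=1}^{n} \alpha_i \, K(x_i, x) - \rho \right)

K(x_i, x) = \exp\!\left( -\gamma \, \lVert x_i - x \rVert^{2} \right)
```

Under this form, a sample far from all training samples has every kernel value close to zero, so the sum falls below $\rho$ and the sample is flagged as out-of-distribution.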
In Formula (4), the parameter $\gamma$ is the bandwidth parameter of the Radial Basis Function (RBF), and its magnitude directly influences the classification. A larger $\gamma$ tends to classify more in-distribution samples as out-of-distribution, whereas a smaller $\gamma$ tends to classify more out-of-distribution samples as in-distribution. Following prior research and established convention, this study sets $\gamma$ to 1/|x|, where |x| denotes the feature dimension.
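A minimal sketch of this setting with scikit-learn's `OneClassSVM` follows. In scikit-learn, `gamma="auto"` sets the kernel parameter to 1 / n_features, which matches the 1/|x| convention described above. The synthetic Gaussian features, the feature dimension of 8, and `nu=0.1` are assumptions for illustration; the source does not state the value of `nu` or other solver settings.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
dim = 8  # hypothetical feature dimension |x|

# Synthetic stand-ins for pooled features (not real mineral data).
train_id = rng.normal(0.0, 1.0, size=(400, dim))  # known-mineral training features
test_id = rng.normal(0.0, 1.0, size=(200, dim))   # known-mineral test features
test_ood = rng.normal(6.0, 1.0, size=(200, dim))  # unknown-mineral test features

# gamma="auto" means gamma = 1 / n_features, i.e. 1/|x|; nu=0.1 is an assumption.
detector = OneClassSVM(kernel="rbf", gamma="auto", nu=0.1)
detector.fit(train_id)  # trained on in-distribution data only

# predict() returns +1 for inliers (ID) and -1 for outliers (OOD).
id_accuracy = float(np.mean(detector.predict(test_id) == 1))
ood_accuracy = float(np.mean(detector.predict(test_ood) == -1))
print(f"ID accuracy:  {id_accuracy:.3f}")
print(f"OOD accuracy: {ood_accuracy:.3f}")
```

Only in-distribution samples are passed to `fit`, which reflects the property noted in the Introduction that OOD detection needs no out-of-distribution data for training.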
4. Experimental Results and Analysis
The model is implemented in Python on Linux using Keras, TensorFlow, and scikit-learn (sklearn). A GPU (Graphics Processing Unit) is used to accelerate DNN training. The specifications of the experimental configuration are detailed in Table 3.
4.1. Evaluation Metrics
The model is evaluated with two metrics: OOD Detection Accuracy and Mineral Identification Accuracy. OOD Detection Accuracy is a binary classification metric that assesses how well the model distinguishes in-distribution minerals from out-of-distribution minerals. It has three components, ID Accuracy, OOD Accuracy, and Overall Accuracy, which are calculated as in Formulas (5), (6), and (7). ID Accuracy is the ratio of correctly identified in-distribution minerals to the total number of known mineral test samples. OOD Accuracy is the ratio of correctly identified out-of-distribution minerals to the total number of samples in the unknown mineral dataset. Overall Accuracy equals the average of ID Accuracy and OOD Accuracy, because the known and unknown mineral test data are kept in equal proportions in this study. Mineral Identification Accuracy is a multi-class classification metric that evaluates the model's ability to assign minerals to their correct categories. Like OOD Detection Accuracy, it comprises ID Accuracy, OOD Accuracy, and Overall Accuracy, but it measures how well the model identifies the specific categories of in-distribution and out-of-distribution minerals. Together, these metrics assess both how well the model distinguishes between mineral categories and how well it detects minerals that deviate from the training datasets.
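The three components of OOD Detection Accuracy can be sketched as follows. Formulas (5) to (7) are not reproduced in this text, so the code follows the verbal definitions only; the function name and the toy flags are hypothetical.

```python
from typing import Dict, Sequence


def ood_detection_accuracy(
    id_flags: Sequence[bool], ood_flags: Sequence[bool]
) -> Dict[str, float]:
    """Compute ID, OOD, and Overall Accuracy for the binary ID/OOD decision.

    id_flags[i]  is True if the i-th known-mineral test image was judged in-distribution.
    ood_flags[j] is True if the j-th unknown-mineral test image was judged in-distribution.
    """
    id_correct = sum(1 for flag in id_flags if flag)        # known judged as known
    ood_correct = sum(1 for flag in ood_flags if not flag)  # unknown judged as unknown
    return {
        "id": id_correct / len(id_flags),
        "ood": ood_correct / len(ood_flags),
        "overall": (id_correct + ood_correct) / (len(id_flags) + len(ood_flags)),
    }


# Toy decisions: 9 of 10 known images accepted, 7 of 10 unknown images rejected.
scores = ood_detection_accuracy([True] * 9 + [False], [False] * 7 + [True] * 3)
print(scores)
```

Because the two test sets have equal size in this example, the overall value equals the mean of the other two. The same relation holds for the figures reported in this paper: (96.4% + 67.8%) / 2 = 82.1%.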
4.2. Mineral Features Selection
As described in Section 3, mineral features are extracted by the trained ResNet before OCSVM detection. In the case of ResNet50, a total of 49 sets of mineral features can be derived, one from each of 49 layers. To determine the best features for OCSVM-based OOD detection, each of the 49 sets of features is independently passed to OCSVM, yielding 49 accuracy values. The results are shown in Figure 4, which plots the Overall Accuracy obtained with the mineral features extracted by each of the 49 layers of ResNet.
As Figure 4 shows, the mineral features extracted by the second layer of ResNet50 perform best, attaining an Overall Accuracy of 82.1%. The features from the second layer of ResNet50 are therefore chosen for OCSVM-based OOD detection.
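The per-layer selection procedure can be sketched as a loop that fits one OCSVM per candidate feature set and keeps the one with the highest Overall Accuracy. This is an illustrative sketch on synthetic data, not the authors' code: the two "layers", the helper names, and `nu=0.1` are assumptions, and real use would substitute features pooled from each ResNet50 layer.

```python
from typing import Dict, Tuple

import numpy as np
from sklearn.svm import OneClassSVM

Splits = Tuple[np.ndarray, np.ndarray, np.ndarray]  # (train_id, test_id, test_ood)


def overall_accuracy(train_id: np.ndarray, test_id: np.ndarray, test_ood: np.ndarray) -> float:
    """Fit an OCSVM on ID training features and score it on ID and OOD test features."""
    detector = OneClassSVM(kernel="rbf", gamma="auto", nu=0.1).fit(train_id)
    id_acc = np.mean(detector.predict(test_id) == 1)
    ood_acc = np.mean(detector.predict(test_ood) == -1)
    return float((id_acc + ood_acc) / 2)  # mean is valid because test sets are equal-sized


def select_best_layer(features_by_layer: Dict[str, Splits]) -> Tuple[str, Dict[str, float]]:
    """Return the layer name with the highest Overall Accuracy, plus all scores."""
    scores = {name: overall_accuracy(*splits) for name, splits in features_by_layer.items()}
    return max(scores, key=scores.get), scores


rng = np.random.default_rng(0)


def blob(mean: float, n: int = 200, dim: int = 4) -> np.ndarray:
    """Synthetic feature cloud standing in for pooled layer features."""
    return rng.normal(mean, 1.0, size=(n, dim))


features_by_layer = {
    "layer_a": (blob(0.0), blob(0.0), blob(0.0)),  # OOD overlaps ID: hard to separate
    "layer_b": (blob(0.0), blob(0.0), blob(6.0)),  # OOD far from ID: easy to separate
}
best_layer, scores = select_best_layer(features_by_layer)
print(best_layer, scores)
```

Note that this procedure uses out-of-distribution test data to choose the layer, so the chosen layer is tuned to that particular set of unknown categories.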
4.3. Performance
Table 4 presents the OOD Detection Accuracy and Mineral Identification Accuracy of the model. The model correctly identifies 82.1% of the test minerals as either known or unknown categories: 96.4% of the in-distribution test minerals are recognized as known categories, and 67.8% of the out-of-distribution test minerals are classified as unknown categories. As noted in the Introduction, existing mineral image identification methods can categorize minerals only within the training set's distribution, so they misidentify out-of-distribution minerals. In contrast, the model classifies 67.8% of out-of-distribution minerals as unknown categories. This OOD Accuracy is lower than that of the other applications in references [17,18,19,20,21] because minerals of the same category may have different colors and textures, while minerals of different categories may have the same colors and textures [6]. This makes mineral identification more challenging and likewise results in lower ID Accuracy than in other applications. Using ResNet, a state-of-the-art convolutional neural network, the model attains 74.1% accuracy in identifying in-distribution minerals. The performance for each of the 36 known mineral categories is presented in Figure 5, which gives a granular view of the model's accuracy across mineral types.
A comparison with related studies, detailed in Table 5, shows the model's advantage in OOD detection. Compared with the study of Zeng et al. [6], which used the same dataset of 36 known minerals, the model has marginally lower ID Accuracy but substantially higher OOD Accuracy. The model also surpasses the other related studies in OOD Accuracy, which highlights its ability to handle minerals beyond the training set.
5. Conclusions
This paper introduces a novel model for identifying out-of-distribution minerals that combines OCSVM with the ResNet50 network. OCSVM classifies the mineral features extracted by ResNet50, which allows the model to detect both out-of-distribution and in-distribution minerals. Compared with traditional methods that rely on labor-intensive and time-consuming experimental determination of mineral species, the approach is more practical, faster, and more cost-effective. Compared with conventional deep learning methods, the model can additionally separate out-of-distribution minerals from known ones, which addresses a critical limitation in mineral identification. Further expanding the in-distribution datasets would improve the model's performance and broaden its applicability in the field of mineral identification.
Author Contributions
Conceptualization, X.J. and Y.Y.; methodology, Y.Y.; software, Y.Y. and K.L.; validation, M.Y., M.H. and Z.Z.; formal analysis, X.J. and Y.Y.; investigation, Y.Y.; resources, Y.Y. and K.L.; data curation, Y.Y.; writing – original draft preparation, X.J. and K.L.; writing – review and editing, Z.Z., S.Z. and Y.W.; visualization, K.L.; supervision, X.J. and Z.Z.; project administration, X.J.; funding acquisition, X.J., M.Y. and M.H. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the Program of National Mineral Rock and Fossil Specimens Resource Center from MOST, grant number NCSTI-RMF20240109.
Data Availability Statement
The data that support the findings of this study are available on request from the corresponding author, X.J.
Conflicts of Interest
The authors declare no conflicts of interest.
References
1. Nesteruk, S.; Agafonova, J.; Pavlov, I.; et al. MineralImage5k: A benchmark for zero-shot raw mineral visual recognition and description. Computers and Geosciences, 2023, 178, 105414.
2. Lou, W.; Zhang, D.; Bayless, R.C. Review of mineral recognition and its future. Applied Geochemistry, 2020, 122, 104727.
3. Wu, B.; Ji, X.; He, M.; Yang, M.; Zhang, Z.; Chen, Y.; Wang, Y.; Zheng, X. Mineral Identification Based on Multi-Label Image Classification. Minerals, 2022, 12(11), 1338.
4. Singh, T.; Jhariya, D.C.; Sahu, M.; Dewangan, P.; Dhekne, P.Y. Classifying Minerals using Deep Learning Algorithms. Earth and Environmental Science, 2022, 1032(1), 012046.
5. Jia, L.; Yang, M.; Meng, F.; He, M.; Liu, H. Mineral Photos Recognition Based on Feature Fusion and Online Hard Sample Mining. Minerals, 2021, 11(12), 1354.
6. Zeng, X.; Xiao, Y.; Ji, X.; Wang, G. Mineral Identification Based on Deep Learning That Combines Image and Mohs Hardness. Minerals, 2021, 11(5), 506.
7. Liu, C.; Li, M.; Zhang, Y.; Han, Y.; Zhu, Y. An enhanced rock mineral recognition method integrating a deep learning model and clustering algorithm. Minerals, 2019, 9(9), 516.
8. Yang, J.; Zhou, K.; Li, Y.; Liu, Z. Generalized Out-of-Distribution Detection: A Survey. arXiv, 2022, arXiv:2110.11334.
9. Loquercio, A.; Segu, M.; Scaramuzza, D. A General Framework for Uncertainty Estimation in Deep Learning. IEEE Robotics and Automation Letters, 2020, 5(2), 3153-3160.
10. Van Engelen, J.E.; Hoos, H.H. A survey on semi-supervised learning. Machine Learning, 2020, 109(2), 373-440.
11. Cai, M.; Li, Y. Out-of-Distribution Detection via Frequency-Regularized Generative Models. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2023, 5521-5530.
12. Hsu, Y.C.; Shen, Y.; Jin, H.; Kira, Z. Generalized ODIN: Detecting out-of-distribution image without learning from out-of-distribution data. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, 10951-10960.
13. Liang, S.; Li, Y.; Srikant, R. Enhancing The Reliability of Out-of-distribution Image Detection in Neural Networks. arXiv, 2017, arXiv:1706.02690.
14. DeVries, T.; Taylor, G.W. Learning Confidence for Out-of-Distribution Detection in Neural Networks. arXiv, 2018, arXiv:1802.04865.
15. Bendale, A.; Boult, T.E. Towards open set deep networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, 1563-1572.
16. Hendrycks, D.; Gimpel, K. A Baseline for Detecting Misclassified and Out-of-Distribution Examples. arXiv, 2016, arXiv:1610.02136.
17. Jiang, K.; You, J.; Dorj, U.; Kim, H. Detection of unknown strawberry diseases based on OpenMatch and two-head network for continual learning. Frontiers in Plant Science, 2022, 13, 989086.
18. Saadati, M.; Chiranjeevi, S.; Balu, A.; Jubery, T.Z.; Asheesh, K.S.; Soumik, S.; Arti, S.; Baskar, G. Out-of-distribution algorithms for robust insect classification. 2nd AAAI Workshop on AI for Agriculture and Food Systems, 2023.
19. Zhang, O.; Delbrouck, J.B.; Rubin, D.L. Out of Distribution Detection for Medical Images. In: Sudre, C.H., et al. Uncertainty for Safe Utilization of Machine Learning in Medical Imaging, and Perinatal Imaging, Placental and Preterm Image Analysis. UNSURE PIPPI 2021. Lecture Notes in Computer Science, vol 12959. Springer, Cham.
20. Mattei, E.; Dalton, C.; Draganov, A. Feature Learning for Enhanced Security in the Internet of Things. 2019 IEEE Global Conference on Signal and Information Processing (GlobalSIP). IEEE, 2019, 1-5.
21. Lindgren, E.; Zach, C. Autoencoder-Based Anomaly Detection in Industrial X-ray Images. Quantitative Nondestructive Evaluation. American Society of Mechanical Engineers, 2021, 85529, V001T07A001.
22. A Mineral Database. Available online: https://www.mindat.org/ (accessed on 5 May 2024).
23. Erfani, S.M.; Rajasegarar, S.; Karunasekera, S.; Leckie, C. High-dimensional and large-scale anomaly detection using a linear one-class SVM with deep learning. Pattern Recognition, 2016, 58, 121–134.
24. Widodo, A.; Yang, B.S.; Han, T. Combination of independent component analysis and support vector machines for intelligent faults diagnosis of induction motors. Expert Systems with Applications, 2007, 32(2), 299-312.
25. Shen, K.; Ong, C.; Li, X.; et al. Feature selection via sensitivity analysis of SVM probabilistic outputs. Machine Learning, 2008, 70, 1–20.
26. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, 770-778.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).