Preprint Article, Version 1. This version is not peer-reviewed.

Evaluating Feature Impact Prior to Phylogenetic Analysis Using Machine Learning Techniques

Version 1 : Received: 16 August 2024 / Approved: 19 August 2024 / Online: 20 August 2024 (10:42:40 CEST)

How to cite: Salman, O. A.; Hosszú, G. Evaluating Feature Impact Prior to Phylogenetic Analysis Using Machine Learning Techniques. Preprints 2024, 2024081320. https://doi.org/10.20944/preprints202408.1320.v1

Abstract

This paper describes a feature selection algorithm and its application to enhancing the accuracy of phylogenetic tree reconstruction by improving the efficiency of tree construction. Three machine learning models, Deep Neural Networks (DNN), Support Vector Machines (SVM), and Random Forests (RF), were applied to Arabic and Aramaic scripts, and each model was used to compare the resulting phylogenies. The methodology was applied to a dataset containing Arabic and Aramaic scripts, demonstrating its relevance to a range of phylogenetic analyses. The results emphasize the essential role of feature selection, with DNNs outperforming the other models in terms of Area Under the Curve (AUC) and Equal Error Rate (EER) across various datasets and fold sizes. The SVM and RF models additionally offer insight into the advantages and limits of each approach within the context of phylogenetic analysis. The method not only simplifies the tree structures but also enhances their Consistency Index, thereby offering a robust framework for evolutionary studies. The findings highlight the applications of machine learning in phylogenetics, suggesting a path toward accurate and efficient evolutionary analyses and a deeper understanding of evolutionary relationships.
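For readers unfamiliar with the two evaluation metrics named in the abstract, the following is a minimal sketch of how AUC and EER can be computed from classifier scores. It is not the authors' implementation. The function names, the convention that scores at or above the threshold are accepted, and the toy score lists are all assumptions made for illustration only.

```python
def far_frr(genuine, impostor, threshold):
    """Return (FAR, FRR), assuming scores >= threshold are accepted."""
    # False Acceptance Rate: share of impostor samples that are accepted.
    far = sum(s >= threshold for s in impostor) / len(impostor)
    # False Rejection Rate: share of genuine samples that are rejected.
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def equal_error_rate(genuine, impostor):
    """Approximate the EER by scanning every observed score as a threshold."""
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, t)
        gap = abs(far - frr)
        # Keep the threshold where FAR and FRR are closest to each other.
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2)
    return best[1]


def auc(genuine, impostor):
    """AUC as the probability that a genuine score outranks an impostor score.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    wins = sum((g > i) + 0.5 * (g == i) for g in genuine for i in impostor)
    return wins / (len(genuine) * len(impostor))


# Hypothetical toy scores, not taken from the paper.
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.6, 0.3, 0.2, 0.1]

eer_value = equal_error_rate(genuine_scores, impostor_scores)
auc_value = auc(genuine_scores, impostor_scores)
print(f"EER = {eer_value:.4f}, AUC = {auc_value:.4f}")
```

A lower EER and a higher AUC both indicate better separation between the two classes, which is the sense in which the abstract compares the DNN, SVM, and RF models.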

Keywords

Feature Selection; Hyperparameters; Machine Learning; Phylogenetics; Scriptinformatics; Consistency Index (CI); False Rejection Rate (FRR); False Acceptance Rate (FAR); Classification

Subject

Computer Science and Mathematics, Artificial Intelligence and Machine Learning
