Discriminating Ion Channels from Non-Ion Channel Membrane Proteins Using Machine Learning and Deep Learning Classifiers with Protein Language Model Representations
Ion channels are integral membrane proteins that facilitate the movement of ions across cell membranes and play a key role in a range of biological processes. The high cost and time required for wet-lab experiments to characterize ion channels have spurred the development of computational methods for this purpose. In our previous work, we demonstrated the effectiveness of protein language models for this kind of prediction, using a logistic regression classifier to distinguish ion channels from non-ion channels (TooT-BERT-C) and transporters from non-transporters (TooT-BERT-T). In this study, we build upon that approach by combining classical machine learning classifiers and a Convolutional Neural Network (CNN) with fine-tuned representations from ProtBERT, ProtBERT-BFD, and MembraneBERT to discriminate ion channels from non-ion channel membrane proteins. Our experiments show that TooT-BERT-CNN-C, which pairs ProtBERT-BFD representations with a CNN, outperforms existing state-of-the-art methods for predicting ion channels, achieving a Matthews Correlation Coefficient (MCC) of 0.86 and an accuracy of 98.35% on an independent test set.
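The abstract reports both MCC and accuracy. The minimal sketch below shows how the two metrics are computed from a binary confusion matrix, and why MCC can sit well below accuracy when one class is rare. The function names `mcc` and `accuracy` and the confusion-matrix counts are hypothetical, chosen only for illustration; they are not the counts from this study.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews Correlation Coefficient from binary confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention),
    which happens if any row or column of the confusion matrix is empty.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical, imbalanced test set (NOT the paper's data):
# 25 positives (ion channels) and 975 negatives (other membrane proteins).
tp, tn, fp, fn = 20, 970, 5, 5

print(f"accuracy = {accuracy(tp, tn, fp, fn):.4f}")
print(f"MCC      = {mcc(tp, tn, fp, fn):.4f}")
```

With these made-up counts, nearly all predictions are correct because the negative class dominates, yet the MCC is noticeably lower since a fifth of the positives are missed. This is the usual reason to report MCC alongside accuracy for an imbalanced classification task.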
Subject: Computer Science and Mathematics - Artificial Intelligence and Machine Learning
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.