Preprint Article Version 1 Preserved in Portico This version is not peer-reviewed

Credit Card Fraud: Analysis of Feature Extraction Techniques for Ensemble Hidden Markov Model Prediction Approach

Version 1 : Received: 25 July 2024 / Approved: 26 July 2024 / Online: 26 July 2024 (12:42:13 CEST)

A peer-reviewed article of this Preprint also exists.

Ogundile, O.; Babalola, O.; Ogunbanwo, A.; Ogundile, O.; Balyan, V. Credit Card Fraud: Analysis of Feature Extraction Techniques for Ensemble Hidden Markov Model Prediction Approach. Appl. Sci. 2024, 14, 7389.

Abstract

Credit card payment platforms are increasingly used for e-commerce. A credit card is a payment card that lets its holder purchase products and services on credit, with the balance accumulating as debt. As credit cards have become more popular and widely used, the number of related fraud cases has also increased. Because very large numbers of credit card transactions take place daily, it is difficult to distinguish fraudulent from non-fraudulent transactions. Accordingly, different machine learning (ML) tools have been deployed in the literature to filter credit card transactions and protect cardholders and financial institutions from losing money. In this article, the ensemble hidden Markov model (EHMM) approach is deployed to predict credit card fraud. The EHMM is a popular and flexible ML tool that can readily model randomly changing datasets. However, the performance of the EHMM depends heavily on the adopted feature extraction technique: the more reliable the feature extraction technique, the better the performance of the EHMM. In this vein, this article analyses two feature extraction techniques that can be combined with the EHMM for the prediction of credit card fraud. First, the principal component analysis (PCA) feature extraction technique is combined with the EHMM to predict credit card fraud. However, PCA increases the computational time complexity of the EHMM. Therefore, this article adapts some statistical features to reduce the computational time complexity imposed on the EHMM. These robust but simple statistical features, termed MRE (mean, relative amplitude, and entropy), are merged to form a feature vector that can be combined with the EHMM to predict credit card fraud. The performance of the PCA-EHMM and the MRE-EHMM is evaluated using the credit card transactions dataset of European cardholders gathered over two days in September 2013. The results are reported using performance metrics such as recall/sensitivity, specificity, precision, and F1-score. The findings of this article offer solutions to the losses experienced by cardholders and financial institutions as a result of online banking, which requires credit card details.
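For readers unfamiliar with the performance metrics named in the abstract, the following is a minimal sketch of how recall/sensitivity, specificity, precision, and F1-score are conventionally computed from confusion-matrix counts, with fraud treated as the positive class. It uses the standard textbook definitions, not code from the article. The names `ConfusionCounts` and `fraud_metrics` and the counts in the usage lines are hypothetical and are not results reported by the authors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Confusion-matrix counts, with 'fraud' as the positive class (hypothetical helper)."""
    tp: int  # fraudulent transactions flagged as fraud
    fp: int  # legitimate transactions flagged as fraud
    tn: int  # legitimate transactions passed as legitimate
    fn: int  # fraudulent transactions passed as legitimate


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def fraud_metrics(c: ConfusionCounts) -> dict:
    """Compute the four metrics from their standard definitions."""
    recall = _ratio(c.tp, c.tp + c.fn)        # sensitivity: share of fraud caught
    specificity = _ratio(c.tn, c.tn + c.fp)   # share of legitimate traffic passed
    precision = _ratio(c.tp, c.tp + c.fp)     # share of fraud alerts that are correct
    f1 = _ratio(2 * precision * recall, precision + recall)  # harmonic mean
    return {
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Illustrative counts only (assumed, not taken from the article's dataset).
example = fraud_metrics(ConfusionCounts(tp=80, fp=20, tn=880, fn=20))
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

On a heavily imbalanced dataset such as credit card transactions, specificity can be very high even for a weak detector, which is one reason recall, precision, and F1-score are usually reported alongside it.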

Keywords

Credit card; entropy; EHMM; fraud prediction; MRE; mean; PCA; relative amplitude

Subject

Computer Science and Mathematics, Artificial Intelligence and Machine Learning
