
Soft Computing for Intelligent Systems pp 157–166

Credit Card Fraud Detection Techniques: A Review

  • Ankit Mohari 8 ,
  • Joyeeta Dowerah 8 ,
  • Kashyavee Das 8 ,
  • Faiyaz Koucher 8 &
  • Dibya Jyoti Bora 8 , 9  
  • Conference paper
  • First Online: 23 June 2021


Part of the book series: Algorithms for Intelligent Systems (AIS)

Ongoing fraud is the main problem in the credit card industry. Credit cards have made life easier, since we can pay conveniently and move about without carrying cash. With the rapid advancement of electronic commerce technology, credit cards have gained popularity and their use in day-to-day life has increased dramatically. Credit cards offer huge advantages when used carefully and responsibly. However, fraudulent activities are also increasing, and criminals keep developing new techniques. These fraudulent activities cause damage to credit cards and monetary losses. Such issues are tackled with Data Science, Machine Learning, and Deep Learning techniques, whose importance cannot be overstated. These techniques help banks and financial organizations detect fraud at an early stage, so that they can reduce ongoing fraud by declining suspected transactions. The credit card company faces a huge loss if the cardholder does not detect the loss. An attacker needs only a very small amount of data to carry out a fraudulent online transaction. In this review, numerous methods and their outcomes are examined in terms of definite parameters.

  • Credit card fraud detection
  • Machine learning algorithm


Chaudhary K, Yadav J, Mallick B (2012) A review of fraud detection techniques: credit card. Int J Comput Appl 45(1)


Ratna Sree Valli K, Jyothi P, Varun Sai G, Rohith Sai Subash R (2020) Credit card fraud detection using machine learning algorithms. Quest J Res Humanities Social Sci 8(2):04–11 ISSN (Online): 2321–9467

Mehndiratta S, Gupta K (2018) Credit card fraud detection techniques: a review. IJCSMC 8(8)

Kazemi ZH (2017) Using deep networks for fraud detection in the credit card transactions. In: IEEE 4th International conference in knowledge-based engineering and innovation (KBEI). pp 0630–0633

Al-Khatib AM (2012) Electronic payment fraud detection techniques. World Comput Sci Info Technol J (WCSIT) 2(4):137–141 ISSN: 2221–0741

Patidar R, Sharma L (2011) Credit card fraud detection using neural network. Int J Soft Comput Eng (IJSCE)

Sisodia DS, Reddy NK, Bhandari S (2017) Performance evaluation of class balancing techniques for credit card fraud detection. In: 2017 IEEE International conference on power, control, signals

Liu G, Li Z, Zheng L, Wand S, Xuan CJ (2011) Random forest for credit card fraud detection. In: IEEE 15th International conference on networking, sensing and control (ICNSC)

Roy A, Sun J, Mohoney R, Alonzi, Adams S, Beling P (2006) Deep learning detecting fraud in credit card transactions. Syst Appl 31(2):337–344

Pojee D, Zulphekari S, Rarh F, Shah V (2017) Secure and quick NFC payment with data mining and intelligent fraud detection. In: 2017 2nd International conference on communication and electronics systems (ICCES)

Estevez PA, Held CM, Perez CA (2006) Subscription fraud prevention in telecommunications using fuzzy rules and neural networks. Expert Syst Appl 31(2):337–344

Choi D, Lee K (2018) An artificial intelligence approach to financial fraud detection under IoT environment: a survey and implementation

Feedzai IC, Foumier F, Skarbovsky I (2015) The uncertain case of credit card fraud detection. In: The 9th ACM international conference on distributed event based systems (DEBS15)

Phua C, Lee V, Smith, Gayler KR (2010) A comprehensive survey of data mining-based fraud detection research. arXiv preprint arXiv:1009.6119

Saini A, Sarkar SD, Ahmed S, Maniraj SP (2019) Credit card fraud detection using machine learning and data science. Int J Eng Res Technol 8(09) ISSN: 2278–0181

Sorournejad S, Zojaji Z, Atani R, Monadjemi AH (2016) A survey of credit card fraud detection techniques: data and technique oriented perspective. 22:46:13 UTC


Acknowledgements

We would like to take this opportunity to thank Dr. Dibya Jyoti Bora, Assistant Professor, Kaziranga University, for providing the necessary support and suggestions for our research work.

Author information

Authors and affiliations.

Department of Information Technology, School of Computing Science, Kaziranga University, Assam, India

Ankit Mohari, Joyeeta Dowerah, Kashyavee Das, Faiyaz Koucher & Dibya Jyoti Bora

Assistant Professor, Department Of Information Technology, School of Computing Sciences, Kaziranga University, Assam, India

Dibya Jyoti Bora


Corresponding author

Correspondence to Ankit Mohari .

Editor information

Editors and affiliations.

Department of Electronics and Communication Engineering, Kurukshetra University, Kurukshetra, Haryana, India

Nikhil Marriwala

University Institute of Engineering and Technology (UIET), Kurukshetra University, Kurukshetra, Haryana, India

C. C Tripathi

Department of Electronics and Communication Engineering, Jaypee University of Information Technology, Waknaghat, Himachal Pradesh, India

Shruti Jain

Co-Founder of Dew Mobility, Santa Clara University, Santa Clara, CA, USA

Shivakumar Mathapathi


Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper.

Mohari, A., Dowerah, J., Das, K., Koucher, F., Bora, D.J. (2021). Credit Card Fraud Detection Techniques: A Review. In: Marriwala, N., Tripathi, C.C., Jain, S., Mathapathi, S. (eds) Soft Computing for Intelligent Systems. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-16-1048-6_12


DOI : https://doi.org/10.1007/978-981-16-1048-6_12

Published : 23 June 2021

Publisher Name : Springer, Singapore

Print ISBN : 978-981-16-1047-9

Online ISBN : 978-981-16-1048-6

eBook Packages : Intelligent Technologies and Robotics (R0)


Open Access

Peer-reviewed

Research Article

Credit card fraud detection using a hierarchical behavior-knowledge space model

Roles Funding acquisition, Writing – review & editing

* E-mail: [email protected]

Affiliations Department of Electronic and Electrical Engineering, Brunel University London, Uxbridge, UB8 3PH, United Kingdom, Visiting Professor, School of Electronic and Information Engineering, Tongji University, Shanghai, China


Roles Data curation, Formal analysis, Methodology, Writing – original draft

Affiliation Faculty of Engineering, Computing and Science, Swinburne University of Technology (Sarawak Campus), Malaysia

Roles Investigation, Methodology, Resources

Roles Resources, Validation, Visualization, Writing – review & editing

Affiliation Econometrics and Business Statistics, School of Business, Monash University Malaysia, Selangor, Malaysia

Roles Supervision, Validation, Writing – review & editing

Affiliation Institute for Intelligent Systems Research and Innovation, Deakin University, Geelong, Victoria, Australia

  • Asoke K. Nandi, 
  • Kuldeep Kaur Randhawa, 
  • Hong Siang Chua, 
  • Manjeevan Seera, 
  • Chee Peng Lim

PLOS

  • Published: January 20, 2022
  • https://doi.org/10.1371/journal.pone.0260579

With the advancement in machine learning, researchers continue to devise and implement effective intelligent methods for fraud detection in the financial sector. Indeed, credit card fraud leads to billions of dollars in losses for merchants every year. In this paper, a multi-classifier framework is designed to address the challenges of credit card fraud detection. An ensemble model with multiple machine learning classification algorithms is designed, in which the Behavior-Knowledge Space (BKS) is leveraged to combine the predictions from multiple classifiers. To ascertain the effectiveness of the developed ensemble model, publicly available data sets as well as real financial records are employed for performance evaluation. Through statistical tests, the results indicate the effectiveness of the developed model as compared with the commonly used majority voting method for combining predictions from multiple classifiers, both in noisy data classification and in credit card fraud detection problems.

Citation: Nandi AK, Randhawa KK, Chua HS, Seera M, Lim CP (2022) Credit card fraud detection using a hierarchical behavior-knowledge space model. PLoS ONE 17(1): e0260579. https://doi.org/10.1371/journal.pone.0260579

Editor: Alfredo Vellido, Universitat Politecnica de Catalunya, SPAIN

Received: May 8, 2021; Accepted: November 12, 2021; Published: January 20, 2022

Copyright: © 2022 Nandi et al. This is an open access article distributed under the terms of the Creative Commons Attribution License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant benchmark data are within the manuscript, given in references [ 24 ], [ 25 ], and [ 26 ]. Relevant real data records are available from a public repository: https://doi.org/10.6084/m9.figshare.17030138 .

Funding: The author(s) received no specific funding for this work.

Competing interests: The authors have declared that no competing interests exist.

1. Introduction

Classification has been a key application area of machine learning. A classifier learns a mathematical model from training data samples that maps input features to the target classes or labels [ 1 ]. Given a new unseen data sample, the trained classifier is used to provide a prediction of the target class [ 2 ]. It is, however, not easy to differentiate multiple classes fully using only one or a few input variables [ 1 ]. In many classifiers such as neural networks, k -nearest neighbors ( k NN), Support Vector Machine (SVM), and Naïve Bayes (NB), the underlying assumption is that the training data samples are a valid representation of the population of interest, which normally requires a balanced sample distribution [ 3 ]. It has been empirically observed that building an accurate classifier based on a single paradigm is often ineffective, if not impossible [ 2 ].

Establishing an accurate classifier is not an easy task, as each classification method has its own advantages and disadvantages. As a result, the concept of classifier fusion using multiple classifiers has become one of the most significant methodologies to improve the classification performance. All classifiers provide their predictions of the class of an incoming data sample, and these predictions are analyzed and combined using some fusion strategy [ 4 ]. In this regard, selections of appropriate classifiers for constructing an ensemble classification model remain a difficult task [ 2 ].

It is a well-established notion in the literature that a classifier combination offers a viable alternative to yield better results than those from a single classifier. This is however dependent on how independent and diverse the classifiers are. Diversity among the chosen classifiers is an important factor for building a successful multi-classifier system (MCS). Various MCS methods have been proposed in modelling and handling different types of data [ 5 ]. Research in this area has led to the development of MCS models that combine the strengths of various individual classifiers, which are built using different training paradigms, to provide improved and robust classification performance [ 2 ].

With the rapid growth in e-commerce, the number of credit card transactions has been on the rise [ 6 ]. Alongside this growth, the issue of credit card fraud has become serious and complicated [ 7 ]. Generally, fraud detection solutions can be divided into supervised and unsupervised classification methods [ 8 ]. In supervised methods, the classification models are based on different samples of genuine and fraudulent transactions, while in unsupervised methods, outliers are detected from the data samples [ 9 ]. Merchants are responsible for paying the bill when a fraud occurs through an online or in-store transaction [ 10 ]. In this paper, we focus on the design and application of an ensemble classification model for credit card fraud detection, which is regarded as a significant problem in the financial sector. Indeed, billions of dollars are lost annually due to credit card fraud, and both merchants and consumers are significantly affected by the consequences of fraud [ 11 ]. With the advancement in fraud detection methods, fraudsters are finding new methods to avoid detection. Capturing irregular transaction patterns is a vital step in fraud detection [ 12 ], and efficient and effective classification methods are required for accurate detection of credit card frauds.

In this paper, two main methods are compared, namely majority voting and Behavior-Knowledge Space (BKS) [ 13 ]. Majority voting is a simple but effective method, where an odd number of constituent classifiers is used for a decision in an ensemble. On the other hand, BKS considers the predictive accuracy of each classifier and uses this extra information to aggregate predictions from individual classifiers and derive better results. The main contribution of this paper is the formulation of an ensemble MCS model with the BKS for detection of real-world credit card fraud. The proposed model allows the MCS to accumulate its knowledge and yield better results over time.

The organization of this paper is as follows. A literature review on different types of MCS is presented in Section 2. The design of the MCS model with BKS is explained in Section 3. A series of empirical evaluations on credit card fraud using publicly available data as well as real-world data from our collection is presented in Section 4. A summary of the findings is given in Section 5.

2. Literature review

An MCS model commonly includes a decision combination method for combining predictions from an ensemble of classifiers. A number of applications using MCS models have been developed over the years. In this section, we present a literature review on different classifier configurations, starting from two classifiers to four or more classifiers.

2.1 Two classifiers

An ensemble classification model using k NN and SVM was presented in [ 14 ] to classify electrocardiogram (ECG) signals. The proposed model achieved an accuracy score of 0.752 as compared with 0.561 to 0.737 from other classifiers [ 14 ]. In financial market trading, an automated framework was presented in [ 15 ], in which an MCS model used a weighted multi-category generalized eigenvalue SVM and Random Forest (RF) to generate the buy or sell signals. Evaluated with five index returns, including those from NASDAQ and DOW JONES, the MCS model achieved notable improvements over the buy/hold strategy as compared with the outcomes from other algorithms [ 15 ].

Predictions of severity pertaining to abnormal aviation events with risk levels were conducted in [ 16 ] using an MCS framework consisting of SVM and deep learning models. The SVM was used for discovering the relationships between event synopsis and consequences, while deep learning was deployed in training. Using cross-validation, the proposed MCS model achieved 81% accuracy, which is 3% and 6% higher than the standalone SVM and deep learning models, respectively [ 16 ].

In [ 17 ], an MCS model based on dynamic weights was developed. The MCS model comprised a backpropagation neural network and the nearest neighbour algorithm, which dynamically assigned a fusion weight to a classifier. Using several public face databases, the proposed method obtained better classification accuracy rates as compared with those from individual classifiers [ 17 ]. An MCS model was proposed for face image segmentation in [ 18 ]. A total of three Bayes and one SVM were used in the MCS model. An error rate of 13.9% was achieved, as compared with 50% from standard classifiers, for hair across eyes requirements [ 18 ].

2.2 Three classifiers

In [ 2 ], an MCS was designed using stacked generalization based on DT (Decision Tree), k NN, and NB. A total of 20 different UCI data sets were used in the experiments. Based on a breast cancer data set, an accuracy rate of 74.8% was achieved by the MCS model, as compared with 71.2% from other classifiers [ 2 ]. An adaptive MCS model for gene expression was examined in [ 4 ]. Particle swarm optimization, bat-inspired algorithm, and SVM were used in the ensemble model, which showed significant improvements in classification performance with respect to breast cancer and embryonal tumors, where the training error reduced by up to 50% [ 4 ].

In [ 19 ], an MCS model was developed to maximize the diagnostic accuracy of thyroid detection. The model utilized SVM, NB, k NN, and closest matching rule classifiers to yield the best diagnostic accuracy. The proposed system achieved an accuracy of 99.5% as compared with 99.1% from the best individual classifier in automatically discriminating thyroid histopathology images as either normal thyroid or papillary thyroid carcinoma [ 19 ]. An MCS framework to exploit unlabelled data was detailed in [ 20 ]. The MCS model was built using NB, SVM, and k NN. A total of five text classification data sets were used in the experiments. The highest accuracy rate of 83.3% was achieved by the MCS model, as compared with those from other algorithms [ 20 ].

2.3 Four or more classifiers

An adaptive MCS model for oil-bearing reservoir recognition was presented in [ 5 ]. A total of five classifiers were used, namely C4.5, SVM, radial basis function, data gravitation-based, and k NN algorithms. A number of rules were included in the adaptive MCS model as well. The proposed solution achieved perfect accuracy in recognizing the properties of different layers in the oil logging data [ 5 ]. An advanced warning system was designed in [ 21 ] using an MCS approach for outward foreign direct investment. Logistic regression, SVM, NN, and decision trees were used in the MCS model, which was applied to resource-based enterprises in China. The experimental results indicated the MCS model was able to yield an accuracy score of 85.1%, as compared with 82.5% from a standard neural network model [ 21 ].

In [ 22 ], estimations of precipitation from satellite images were carried out with an MCS model, which combined RF, NN, SVM, NB, weighted k NN, and k -means. A total of six classes of precipitation intensities were obtained, from no rain to very high precipitation. A correlation coefficient of 0.93 was yielded by the proposed method, as compared with only 0.46 from other methods [ 22 ]. In [ 23 ], a one-against-one method was explored using an MCS that consisted of NN, DT, k NN, SVM, linear discriminant analysis, and logistic regression. An error rate of 0.99% was produced by the MCS model, as compared with 14.9% from other methods on the zoo data set [ 23 ]. In [ 24 ], sentiments of tweets were automatically classified as either positive or negative using an ensemble. Public tweet sentiment data sets were used in the experiment. The ensemble was formed using multinomial NB, SVM, RF, and logistic regression. An accuracy rate of 81.06% was achieved on a data set trained with only 0.03% of the obtained data [ 24 ].

2.4 Remarks

Based on the above review of various classifier configurations (two or more classifiers), it is clear that MCS has been used in various applications, including finance, medical, engineering and other sectors. The MCS configuration offers the advantage that the output is not constrained by one classifier, since a pool of classifiers provides the possibility of improved results. In the event that one classifier produces an incorrect prediction while its counterparts yield a correct one, the combined output can still be correct, e.g. in accordance with the majority voting principle. The combined output is, therefore, able to reduce the number of incorrect predictions from a single classification method. The results from various MCS configurations reported in the literature are promising, with typically higher accuracy rates. However, MCS-based methods tend to run slower, since a higher computation load is required for execution of multiple classifiers, although this is not regularly reported in the literature. While better results often outweigh longer computational durations, it is useful to ensure that MCS configurations are feasible in terms of computational requirements for practical applications in real-world environments.

3. Classification methods

In this study, several standard machine learning models from H2O.ai were employed to establish an MCS model. Python running in the Google Colab environment was used. In the following sub-sections, majority voting and the BKS model by Huang and Suen [ 25 ] for decision combination are explained.

3.1 Majority voting

Consider $M$ target classes, where each class is represented by $C_i$, $\forall i \in \Lambda = \{1, 2, \ldots, M\}$. The classifier task is to categorize an input sample, $x$, to one of the $(M + 1)$ classes, with the $(M + 1)$th class denoting that the classifier rejects $x$.
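As an illustration of majority voting with a rejection class, the following minimal sketch returns the class chosen by more than half of the classifiers and rejects the sample otherwise. The function name `majority_vote`, the `reject_label` parameter, and the strict-majority threshold are assumptions made for this sketch; the paper's own voting equations may use a different threshold.

```python
from collections import Counter


def majority_vote(predictions, reject_label=None):
    """Return the class predicted by more than half of the classifiers.

    `predictions` holds one predicted class label per classifier. If no class
    wins a strict majority, `reject_label` is returned; it plays the role of
    the (M + 1)th "reject" class (an assumption of this sketch).
    """
    if not predictions:
        return reject_label
    label, count = Counter(predictions).most_common(1)[0]
    return label if count > len(predictions) / 2 else reject_label


print(majority_vote([1, 0, 1]))  # two of three classifiers agree
print(majority_vote([0, 1, 2]))  # no strict majority, so the sample is rejected
```

Using an odd number of classifiers in a binary problem guarantees that a strict majority always exists, so the rejection branch is then never taken.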


A BKS is a $K$-dimensional space, where every dimension indicates the decision (i.e., predicted class) from one classifier. The intersection of the decisions from $K$ different classifiers occupies one unit in the BKS, e.g., $BKS(e_1(x) = j_1, \ldots, e_K(x) = j_K)$ denotes a unit where each $e_k$ produces a prediction $j_k$, $k = 1, \ldots, K$. In each BKS unit, there are $M$ partitions (cells), which accumulate the number of data samples actually belonging to $C_i$.

Consider an example with two classifiers. A two-dimensional (2–D) BKS can be formed, as given in Table 1 .

Table 1: https://doi.org/10.1371/journal.pone.0260579.t001


The BKS is similar to a confusion matrix. With the Bayesian approach, multiplication of evidence from the confusion matrices is required to estimate the joint probability of K events when combining the predictions. This step is eliminated in the BKS method, where a final decision is reached by assigning the input sample directly to the class that has accumulated the greatest number of samples in the activated unit. The BKS thus offers a fast and efficient way of combining decisions, as shown in [ 25 ] for classification of unconstrained handwritten numerals.
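The decision rule described above can be written compactly. The block below is a sketch in notation chosen here, not necessarily the paper's: $n_{j_1 \ldots j_K}(i)$ is the number of training samples of class $C_i$ stored in the activated unit, $T$ is the total count in that unit, $R$ is its most frequent class, and $\lambda$ is an assumed confidence threshold. Setting $\lambda = 0$ gives the plain "most frequent class" rule stated in the text, with rejection only for empty units.

```latex
% Notation assumed for this sketch; see the lead-in paragraph.
\[
T_{j_1 \ldots j_K} = \sum_{i=1}^{M} n_{j_1 \ldots j_K}(i), \qquad
R_{j_1 \ldots j_K} = \arg\max_{i \in \Lambda} n_{j_1 \ldots j_K}(i)
\]
\[
E(x) =
\begin{cases}
R_{j_1 \ldots j_K}, & \text{if } T_{j_1 \ldots j_K} > 0 \text{ and }
  \dfrac{n_{j_1 \ldots j_K}(R_{j_1 \ldots j_K})}{T_{j_1 \ldots j_K}} \ge \lambda, \\[6pt]
M + 1, & \text{otherwise (reject).}
\end{cases}
\]
```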

A hierarchical agent-based framework with the BKS for decision combination is proposed. As shown in Fig 1 , the framework has N agent groups in the base layer, with each group comprising multiple individual agents. The agents can be machine learning models, statistical methods as well as other classification algorithms. A manager agent is assigned to combine the predictions from each agent group using a BKS. Each manager agent sends its prediction to a decision combination module comprising another BKS in the top layer that produces the final combined prediction.
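The two-layer structure can be sketched in a few lines of Python. This is a simplified illustration, not the authors' implementation: the agents are hypothetical threshold rules on a scalar input, all function names are invented for the sketch, and the same training samples are reused for both layers for brevity.

```python
from collections import Counter, defaultdict


def fit_bks(prediction_rows, labels):
    """Build a BKS table: tuple of agent predictions -> Counter of true classes."""
    table = defaultdict(Counter)
    for row, label in zip(prediction_rows, labels):
        table[tuple(row)][label] += 1
    return table


def bks_predict(table, row, fallback):
    """Return the most frequent true class of the activated unit, or `fallback`."""
    unit = table.get(tuple(row))
    return unit.most_common(1)[0][0] if unit else fallback


def manager_outputs(groups, managers, x, fallback):
    """Collect one combined prediction per agent group (base layer)."""
    return [bks_predict(m, [agent(x) for agent in agents], fallback)
            for m, agents in zip(managers, groups)]


def fit_hierarchy(groups, samples, labels, fallback=0):
    """Fit one manager BKS per group, then a top-layer BKS over manager outputs."""
    managers = [fit_bks([[agent(x) for agent in agents] for x in samples], labels)
                for agents in groups]
    top_rows = [manager_outputs(groups, managers, x, fallback) for x in samples]
    return managers, fit_bks(top_rows, labels)


def predict_hierarchy(groups, managers, top, x, fallback=0):
    """Final prediction: activate the top-layer BKS with the manager outputs."""
    return bks_predict(top, manager_outputs(groups, managers, x, fallback), fallback)


# Hypothetical agents: simple threshold rules standing in for trained models.
groups = [
    [lambda x: int(x > 2), lambda x: int(x > 4)],
    [lambda x: int(x > 3), lambda x: int(x > 5)],
]
samples = [1, 2, 3, 4, 5, 6]
labels = [0, 0, 0, 1, 1, 1]

managers, top = fit_hierarchy(groups, samples, labels)
print([predict_hierarchy(groups, managers, top, x) for x in samples])
```

The design point is that the top layer treats each manager's output as just another classifier decision, so the same BKS mechanism is reused at both levels.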

Fig 1: https://doi.org/10.1371/journal.pone.0260579.g001

A numerical example is presented to better illustrate the BKS mechanism. In Table 2 , a simple binary classification problem is shown. There are two agents (classifiers) and six input samples, along with their predicted and actual classes. A BKS can be constructed, as shown in Table 3 . As an example, for input samples 1 and 4 ( Table 2 ), both agents 1 and 2 predict class 1, and the actual class is 1. This information is recorded in the highlighted (grey) BKS unit in Table 3 . Given a new test sample, the predictions from all agents are used to activate a BKS unit, and the combined predicted class (final output) is reached based on the highest number of samples from the majority class, as given in Eq ( 5 ). Whenever the highlighted (grey) BKS unit is activated during the test phase, the combined (final) prediction is Class 1.
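The mechanism in this example can be reproduced with a small sketch. Only samples 1 and 4 (both agents predict class 1, actual class 1) follow the text; the remaining four rows and the 0/1 class coding are hypothetical stand-ins, since the table contents are not reproduced here.

```python
from collections import Counter, defaultdict

# (agent 1 prediction, agent 2 prediction, actual class) for six samples.
# Samples 1 and 4 follow the text; the other four rows are hypothetical.
records = [
    (1, 1, 1),  # sample 1
    (0, 1, 0),  # sample 2 (hypothetical)
    (0, 0, 0),  # sample 3 (hypothetical)
    (1, 1, 1),  # sample 4
    (1, 0, 1),  # sample 5 (hypothetical)
    (0, 1, 1),  # sample 6 (hypothetical)
]

# Each BKS unit is keyed by the pair of agent predictions and counts
# how many training samples of each actual class fell into it.
bks = defaultdict(Counter)
for pred_1, pred_2, actual in records:
    bks[(pred_1, pred_2)][actual] += 1


def combine(pred_1, pred_2, reject=None):
    """Return the most frequent actual class of the activated unit."""
    unit = bks.get((pred_1, pred_2))
    if not unit:
        return reject  # empty unit: no training evidence for this combination
    return unit.most_common(1)[0][0]


print(dict(bks[(1, 1)]))  # counts stored in the unit where both agents predict 1
print(combine(1, 1))      # combined prediction when that unit is activated
```

In this hypothetical table the unit where agent 1 predicts 0 and agent 2 predicts 1 holds one sample of each class, which shows why a tie-breaking or rejection policy is needed in practice.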

Table 2: https://doi.org/10.1371/journal.pone.0260579.t002

Table 3: https://doi.org/10.1371/journal.pone.0260579.t003

4. Experiments

In this empirical evaluation, publicly available data sets from UCI Machine Learning Repository [ 28 ], KEEL Repository [ 29 ], and Kaggle [ 30 ] are used. A real-world data set is also used for evaluation.

Fig 2 shows the configuration of the hierarchical agent-based framework used in the experiments. It consists of three groups, where each group contains three agents. The three agents are Random Forest (RF), Generalized Linear Model (GLM), and Gradient Boosting Machine (GBM), which have been selected based on extensive experiments of individual and group performances. Three agent managers are established, each with a BKS module. The predictions from these three agent managers are sent to the decision combination module, which has another BKS to produce the final predicted class.

Fig 2: https://doi.org/10.1371/journal.pone.0260579.g002

Training is first conducted using randomized orders of the data samples, which is followed by a validation process. This in turn creates three group-based BKS modules (one for each group). The next step is combining the outputs from BKS modules 1 to 3 using training data with another randomized sequence, leading to the establishment of another overall (final) BKS module that combines the outputs from the previous three group-based BKS modules. Given a test sample, the group-based BKS outputs are combined again with the overall BKS module to produce a final predicted class for computation of the performance metrics, namely classification accuracy and F1-score.

Classification accuracy and F1-score of each experiment are recorded using Eqs ( 7 ) and ( 8 ), respectively.
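For reference, the standard definitions of these two metrics can be sketched as follows, assuming binary classification with fraud as the positive class. The function names and the example counts are hypothetical, and the sketch assumes the paper's equations follow the usual definitions.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all samples that are classified correctly."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, written in terms of counts."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Hypothetical confusion-matrix counts for an imbalanced data set.
tp, tn, fp, fn = 20, 960, 10, 10
print(accuracy(tp, tn, fp, fn))
print(f1_score(tp, fp, fn))
```

With these counts the accuracy is 0.98 while the F1-score is about 0.67, which illustrates why the F1-score is reported alongside accuracy for imbalanced fraud data.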


4.2 Benchmark data

A total of 10 data sets are used in the experiments. The details of each data set, i.e., B1 to B10, are shown in Table 4 , including the number of instances and features as well as the imbalanced ratio (IR) information.

Table 4: https://doi.org/10.1371/journal.pone.0260579.t004

The accuracy rates and F1 scores are shown in Tables 5 and 6 , respectively. In general, the BKS results are slightly higher than those from majority voting for both performance indicators.

Table 5: https://doi.org/10.1371/journal.pone.0260579.t005

Table 6: https://doi.org/10.1371/journal.pone.0260579.t006

To evaluate the robustness of BKS, the data samples are corrupted with noise at 10% and 20% levels. A total of 25 runs are conducted for each data set, and the average results are listed in Table 7 . Fig 3 indicates the numbers of wins pertaining to the BKS against majority voting. The three bars for each dataset represent the data with no noise (-0), with 10% noise (-0.1), and with 20% noise (-0.2).

Fig 3: https://doi.org/10.1371/journal.pone.0260579.g003

Table 7: https://doi.org/10.1371/journal.pone.0260579.t007

To evaluate whether BKS performs better than majority voting from the statistical perspective, a two-tailed sign test is used, as detailed in Section 4.1. Fig 3 shows the number of wins of BKS over majority voting from the experimental results (plotted at 16 wins and above). BKS achieves at least 18 wins out of 25 experimental runs in all ten noisy data sets (10% and 20% noise levels), indicating its superior performance over majority voting in handling noisy data samples for α = 0.05 (95% confidence level). When a more stringent statistical significance level of α = 0.01 (i.e., 99% confidence level) is used for evaluation, BKS outperforms majority voting in 9 out of 10 data sets with a noise level of 20%. This outcome indicates the usefulness of BKS over majority voting in mitigating the negative effect of noise on performance.
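The win thresholds quoted above can be checked with an exact binomial calculation. The sketch below assumes the sign test is computed from the exact binomial distribution with ties excluded; the paper may instead rely on a critical-value table, and the function names are chosen for this sketch.

```python
from math import comb


def sign_test_p_value(wins, runs):
    """Two-tailed sign test p-value under H0: each run is a win with probability 0.5."""
    k = max(wins, runs - wins)
    tail = sum(comb(runs, i) for i in range(k, runs + 1)) / 2 ** runs
    return min(1.0, 2 * tail)


def min_wins(runs, alpha):
    """Smallest number of wins out of `runs` that is significant at level `alpha`."""
    for wins in range(runs // 2 + 1, runs + 1):
        if sign_test_p_value(wins, runs) < alpha:
            return wins
    return None


print(min_wins(25, 0.05), sign_test_p_value(18, 25))
print(min_wins(25, 0.01), sign_test_p_value(20, 25))
```

Under these assumptions, 18 wins out of 25 gives a two-tailed p-value of about 0.043, which is the smallest win count below α = 0.05, while α = 0.01 requires 20 wins (p-value about 0.004).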

To ascertain the effectiveness of BKS with other methods in the literature, a comparison of the F1 score with the published results of GEP [ 26 ] and CUSBoost [ 33 ] is shown in Table 8 . CUSBoost [ 33 ] achieves the worst performance, while GEP [ 26 ] achieves close results as compared with those from BKS and majority voting. Overall, BKS achieves the highest F1 scores in four out of six data sets, while the scores of the remaining two are a little lower by 0.01 as compared with those of majority voting.

Table 8: https://doi.org/10.1371/journal.pone.0260579.t008

4.3 Real-world data

This evaluation focuses on real financial transaction records (available in [ 34 ]) from September to November 2017 in a Southeast Asia financial firm. As indicated in [ 35 ], Southeast Asia is one of the fastest growing regions over the years, with a gross domestic product growth rate of over 6%. In this experiment, a total of 60,595 transaction records from 9,685 customers are available for evaluation. The transactions cover activities in 23 countries, with various spending items ranging from online website purchases to grocery shopping. A total of 28 transactions have been identified by the firm and labeled as fraud cases, with the remaining being genuine, or non-fraud cases.

Each transaction record consists of the account number, transaction amount, date, time, device type used, merchant category code (MCC), country, and type of transaction. The account number is anonymized to ensure privacy of customers. In addition to the nine original features, feature aggregation is conducted to generate eight new features. These aggregated features utilise the transaction amount, acquiring country, MCC, and device type over a period of three months. A summary of the features is shown in Table 9 .
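As a rough illustration of feature aggregation, the sketch below computes a few per-account aggregates over a window of transactions. The field names (`account`, `amount`, `country`), the chosen aggregates, and the sample records are hypothetical; the paper's eight aggregated features are not listed here, apart from a count of unique acquiring countries mentioned later in the text.

```python
from collections import defaultdict


def aggregate_by_account(transactions):
    """Compute per-account aggregates over all supplied transactions."""
    amounts = defaultdict(list)
    countries = defaultdict(set)
    for tx in transactions:
        amounts[tx["account"]].append(tx["amount"])
        countries[tx["account"]].add(tx["country"])
    return {
        account: {
            "total_amount": sum(values),
            "mean_amount": sum(values) / len(values),
            "unique_countries": len(countries[account]),
        }
        for account, values in amounts.items()
    }


# Hypothetical, anonymized transaction records.
transactions = [
    {"account": "A1", "amount": 10.0, "country": "MY"},
    {"account": "A1", "amount": 30.0, "country": "SG"},
    {"account": "B2", "amount": 5.0, "country": "MY"},
]
features = aggregate_by_account(transactions)
print(features["A1"])
```

Each aggregate would then be joined back onto the individual transaction rows, so that a classifier sees both the single transaction and the account's recent behaviour.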

Table 9: https://doi.org/10.1371/journal.pone.0260579.t009

Feature importance scores can provide useful information about the data set, since they highlight the relevance of each feature for classification. Based on the 17 features, we carry out a feature importance study using the Decision Tree (DT), Random Forest (RF), and XGBoost classifiers. Fig 4 illustrates the results. It can be observed that the features have different levels of importance, and feature 12 (i.e., the count of unique acquiring country) appears to be the most important feature in all three classifiers. The remaining aggregated features (features 10 to 17) generally have slightly higher importance scores as compared with those of the original features.

Fig 4: https://doi.org/10.1371/journal.pone.0260579.g004

Similar to the benchmark data experiment, noise is added to this real-world data set in increments of 10%, up to 40%. Table 10 summarizes the results. BKS outperforms majority voting when the level of noise increases, indicating its robustness against noisy data. When the noise level increases to 20% and above, BKS outperforms majority voting 18 times (20% and 30% noise) and 19 times (40% noise), respectively. This outcome signifies the statistically superior performance of BKS over majority voting at the 95% confidence level ( α = 0.05) in handling noisy data (20% noise and above) in this real-world experiment.

Table 10: https://doi.org/10.1371/journal.pone.0260579.t010

Table 11 lists the F1 scores of the experiments. When no noise is added, the F1 scores for both BKS and voting are the same. Again, for noisy data sets, BKS consistently achieves higher F1 scores, as compared with those from majority voting.

Table 11: https://doi.org/10.1371/journal.pone.0260579.t011

In addition to the experiments with additive noise, two experiments with under-sampling methods are conducted. Two different ratios of minority (fraud transactions) to majority (genuine transactions) are evaluated, i.e., 1:100 and 1:500, and the overall results are shown in Table 12 . Obviously, under-sampling does not help improve the voting results, while the use of 1:100 ratio enhances the BKS results slightly, as the data set is much more balanced, as compared to the original ratio.
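To make the ratios concrete: with 28 fraud cases among 60,595 records, the original ratio is roughly 1:2,163, so a 1:100 ratio keeps 2,800 genuine transactions and a 1:500 ratio keeps 14,000. The following is a minimal sketch of random under-sampling of the majority class. Random selection is an assumption here, since the exact under-sampling method is not described in this section, and the function name `undersample` is hypothetical.

```python
import random

def undersample(labels, ratio, seed=0):
    """Return sorted indices that keep every minority sample (label 1) and
    at most `ratio` randomly chosen majority samples (label 0) per minority sample."""
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]
    n_keep = min(len(majority), ratio * len(minority))
    kept_majority = rng.sample(majority, n_keep)
    return sorted(minority + kept_majority)

# Toy labels: 2 fraud cases and 1,000 genuine cases (illustrative only).
labels = [1] * 2 + [0] * 1000
idx = undersample(labels, ratio=100)
```

Under-sampling should be applied to the training split only, so that the test data retains the original class distribution.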

Table 12: https://doi.org/10.1371/journal.pone.0260579.t012

5. Conclusions

A multi-classifier system has been designed to address the classification challenge pertaining to credit card fraud. Specifically, the combination of a hierarchical agent-based framework with the BKS as a decision-making method has been constructed for classifying transaction records of credit cards into fraudulent and non-fraudulent cases. This combination allows the accumulation of knowledge and yields better results over time. To evaluate the proposed multi-classifier system, a series of experiments using publicly available data sets and real financial records have been conducted. The results from the ten benchmark data sets indicate the performance of BKS is better than that of the majority voting method for decision combination. In addition to noise-free data, noise up to 20% has been added to the data samples, in order to evaluate the robustness of the proposed method in noisy environments. Based on the statistical sign test, the BKS-based framework offers statistically superior performance over the majority voting method.

For the real transaction records from a financial firm, up to 40% noise has been added to the data samples. When the noise levels reach 20% and above, the BKS-based framework outperforms the majority voting method, with statistical significance at the 95% confidence level, as ascertained by the sign test. Based on the outcomes from both benchmark and real-world data, the proposed BKS-based framework is effective for detecting fraudulent credit card cases.

In future work, we will address several limitations of the current BKS models. Firstly, it is possible for the BKS table to contain empty cells, leading to no prediction for a given data sample. This observation generally occurs when the number of classifiers increases, i.e., a larger knowledge space is formed. In addition, noisy data sets, particularly noise in class labels, result in inaccurate information captured in the BKS cells, leading to erroneous predictions. We intend to exploit probabilistic methods, such as Bayesian inference, to interpret the BKS prediction and enhance its robustness in undertaking noisy data classification problems.

Additionally, we will investigate imbalanced data issues using a combination of over-sampling and under-sampling techniques. The effect of these different techniques toward classification performance will be analyzed and compared systematically using statistical hypothesis tests. We will also develop an online version of the proposed model. The model will be able to learn data samples on-the-fly and keep improving its prediction accuracy incrementally. This online learning model will be applied to various financial problems as well as other classification tasks.

  • 1. Weiss S. M., & Kulikowski C. A. (1991). Computer systems that learn: classification and prediction methods from statistics, neural nets, machine learning, and expert systems. Morgan Kaufmann Publishers Inc.
  • 28. “UCI Machine Learning Repository,” [Online] Available: https://archive.ics.uci.edu/ml/datasets , 2020.
  • 29. “KEEL Data Set Repository,” [Online] Available: https://sci2s.ugr.es/keel/datasets.php , 2020.
  • 30. “Credit Card Fraud Detection,” [Online] Available: https://www.kaggle.com/mlg-ulb/creditcardfraud , 2020.
  • 31. Sheskin D. J. (2020). Handbook of parametric and nonparametric statistical procedures . CRC Press.
  • 34. “Transaction Records,” [Online] Available: https://doi.org/10.6084/m9.figshare.17119091 , 2021.
  • 35. Jiang C., & Yu W. (2018). Risk Control Theory of Online Transactions, Science Press, Beijing, China.

A Critical review of Credit Card Fraud Detection Techniques



Credit card fraud detection using a hierarchical behavior-knowledge space model

Asoke K. Nandi

1 Department of Electronic and Electrical Engineering, Brunel University London, Uxbridge, UB8 3PH, United Kingdom

2 Visiting Professor, School of Electronic and Information Engineering, Tongji University, Shanghai, China

Kuldeep Kaur Randhawa

3 Faculty of Engineering, Computing and Science, Swinburne University of Technology (Sarawak Campus), Malaysia

Hong Siang Chua

Manjeevan Seera

4 Econometrics and Business Statistics, School of Business, Monash University Malaysia, Selangor, Malaysia

Chee Peng Lim

5 Institute for Intelligent Systems Research and Innovation, Deakin University, Geelong, Victoria, Australia

Associated Data

All relevant benchmark data are within the manuscript, given in references [ 24 ], [ 25 ], and [ 26 ]. Relevant real data records are available from a public repository: https://doi.org/10.6084/m9.figshare.17030138 .

With the advancement in machine learning, researchers continue to devise and implement effective intelligent methods for fraud detection in the financial sector. Indeed, credit card fraud leads to billions of dollars in losses for merchants every year. In this paper, a multi-classifier framework is designed to address the challenges of credit card fraud detections. An ensemble model with multiple machine learning classification algorithms is designed, in which the Behavior-Knowledge Space (BKS) is leveraged to combine the predictions from multiple classifiers. To ascertain the effectiveness of the developed ensemble model, publicly available data sets as well as real financial records are employed for performance evaluations. Through statistical tests, the results positively indicate the effectiveness of the developed model as compared with the commonly used majority voting method for combination of predictions from multiple classifiers in tackling noisy data classification as well as credit card fraud detection problems.

1. Introduction

Classification has been a key application area of machine learning. A classifier learns a mathematical model from training data samples that maps input features to the target classes or labels [ 1 ]. Given a new unseen data sample, the trained classifier is used to provide a prediction of the target class [ 2 ]. It is, however, not easy to use only a single or a few input variables to differentiate multiple classes to their fullest [ 1 ]. In many classifiers such as neural networks, k -nearest neighbors ( k NN), Support Vector Machine (SVM), and Naïve Bayes (NB), the underlying assumption is that training data samples contain a valid representation of the population of interest, which normally requires a balanced sample distribution [ 3 ]. It has been empirically observed that building an accurate classifier based on a single paradigm is often ineffective, if not impossible [ 2 ].

Establishing an accurate classifier is not an easy task, as each classification method has its own advantages and disadvantages. As a result, the concept of classifier fusion using multiple classifiers has become one of the most significant methodologies to improve the classification performance. All classifiers provide their predictions of the class of an incoming data sample, and these predictions are analyzed and combined using some fusion strategy [ 4 ]. In this regard, selections of appropriate classifiers for constructing an ensemble classification model remain a difficult task [ 2 ].

It is a well-established notion in the literature that a classifier combination offers a viable alternative to yield better results than those from a single classifier. This is however dependent on how independent and diverse the classifiers are. Diversity among the chosen classifiers is an important factor for building a successful multi-classifier system (MCS). Various MCS methods have been proposed in modelling and handling different types of data [ 5 ]. Research in this area has led to the development of MCS models that combine the strengths of various individual classifiers, which are built using different training paradigms, to provide improved and robust classification performance [ 2 ].

With the rapid growth in e-commerce, the number of credit card transactions has been on the rise [ 6 ]. Alongside this growth, the issue of credit card fraud has become serious and complicated [ 7 ]. Generally, fraud detection solutions can be divided into supervised and unsupervised classification methods [ 8 ]. In supervised methods, the classification models are based on different samples of genuine and fraudulent transactions, while in unsupervised methods, outliers are detected from the data samples [ 9 ]. Merchants are responsible for paying the bill when a fraud occurs through an online or in-store transaction [ 10 ]. In this paper, we focus on the design and application of an ensemble classification model for credit card fraud detection, which is regarded as a significant problem in the financial sector. Indeed, billions of dollars are lost annually due to credit card fraud, and both merchants and consumers are significantly affected by the consequences of fraud [ 11 ]. With the advancement in fraud detection methods, fraudsters are finding new methods to avoid detection. Capturing irregular transaction patterns is a vital step in fraud detection [ 12 ], and efficient and effective classification methods are required for accurate detection of credit card frauds.

Two main methods are compared in this paper, namely majority voting and Behavior-Knowledge Space (BKS) [ 13 ]. Majority voting is a simple but effective method, where an odd number of constituent classifiers is used for a decision in an ensemble. On the other hand, BKS considers the predictive accuracy of each classifier and uses this extra information to aggregate predictions from individual classifiers and derive better results. The main contribution of this paper is the formulation of an ensemble MCS model with the BKS for detection of real-world credit card fraud. The proposed model allows the MCS to accumulate its knowledge and yield better results over time.

The organization of this paper is as follows. A literature review on different types of MCS is presented in Section 2. The design of the MCS model with BKS is explained in Section 3. A series of empirical evaluations on credit card fraud using publicly available data as well as real-world data from our collection is presented in Section 4. A summary of the findings is given in Section 5.

2. Literature review

An MCS model commonly includes a decision combination method for combining predictions from an ensemble of classifiers. A number of applications using MCS models have been developed over the years. In this section, we present a literature review on different classifier configurations, starting from two classifiers to four or more classifiers.

2.1 Two classifiers

An ensemble classification model using k NN and SVM was presented in [ 14 ] to classify electrocardiogram (ECG) signals. The proposed model achieved an accuracy score of 0.752 as compared with 0.561 to 0.737 from other classifiers [ 14 ]. In financial market trading, an automated framework was presented in [ 15 ], in which an MCS model used a weighted multi-category generalized eigenvalue SVM and Random Forest (RF) to generate the buy or sell signals. Evaluated with five index returns, including those from NASDAQ and DOW JONES, the MCS model achieved notable improvements over the buy/hold strategy as compared with the outcomes from other algorithms [ 15 ].

Predictions of severity pertaining to abnormal aviation events with risk levels were conducted in [ 16 ] using an MCS framework consisting of SVM and deep learning models. The SVM was used for discovering the relationships between event synopsis and consequences, while deep learning was deployed in training. Using cross-validation, the proposed MCS model achieved 81% accuracy, which is 3% and 6% higher than the standalone SVM and deep learning models, respectively [ 16 ].

In [ 17 ], an MCS model based on dynamic weights was developed. The MCS model comprised a backpropagation neural network and the nearest neighbour algorithm, and dynamically assigned a fusion weight to each classifier. Using several public face databases, the proposed method obtained better classification accuracy rates as compared with those from individual classifiers [ 17 ]. An MCS model was proposed for face image segmentation in [ 18 ]. A total of three Bayes classifiers and one SVM were used in the MCS model. An error rate of 13.9% was achieved, as compared with 50% from standard classifiers, for hair across eyes requirements [ 18 ].

2.2 Three classifiers

In [ 2 ], an MCS was designed using stacked generalization based on DT (Decision Tree), k NN, and NB. A total of 20 different UCI data sets were used in the experiments. Based on a breast cancer data set, an accuracy rate of 74.8% was achieved by the MCS model, as compared with 71.2% from other classifiers [ 2 ]. An adaptive MCS model for gene expression was examined in [ 4 ]. Particle swarm optimization, bat-inspired algorithm, and SVM were used in the ensemble model, which showed significant improvements in classification performance with respect to breast cancer and embryonal tumors, where the training error reduced by up to 50% [ 4 ].

In [ 19 ], an MCS model was developed to maximize the diagnostic accuracy of thyroid detection. The model utilized SVM, NB, k NN, and closest matching rule classifiers to yield the best diagnostic accuracy. The proposed system achieved an accuracy of 99.5% as compared with 99.1% from the best individual classifier in automatically discriminating thyroid histopathology images as either normal thyroid or papillary thyroid carcinoma [ 19 ]. An MCS framework to exploit unlabelled data was detailed in [ 20 ]. The MCS model was built using NB, SVM, and k NN. A total of five text classification data sets were used in the experiments. The highest accuracy rate of 83.3% was achieved by the MCS model, as compared with those from other algorithms [ 20 ].

2.3 Four or more classifiers

An adaptive MCS model for oil-bearing reservoir recognition was presented in [ 5 ]. A total of five classifiers were used, namely C4.5, SVM, radial basis function, data gravitation-based, and k NN algorithms. A number of rules were included in the adaptive MCS model as well. The proposed solution achieved perfect accuracy in recognizing the properties of different layers in the oil logging data [ 5 ]. An advanced warning system was designed in [ 21 ] using an MCS approach for outward foreign direct investment. Logistic regression, SVM, NN, and decision trees were used in the MCS model, which was applied to resource-based enterprises in China. The experimental results indicated the MCS model was able to yield an accuracy score of 85.1%, as compared with 82.5% from a standard neural network model [ 21 ].

In [ 22 ], estimations of precipitation from satellite images were carried out with an MCS model, which combined RF, NN, SVM, NB, weighted k NN, and k -means together. A total of six classes of precipitation intensities were obtained, from no rain to very high precipitation. A score of 0.93 for the coefficient of correlation was yielded by the proposed method, as compared with only 0.46 from other methods [ 22 ]. In [ 23 ], a one-against-one method was explored using an MCS that consisted of NN, DT, k NN, SVM, linear discriminant analysis, and logistic regression. An error rate of 0.99% was produced by the MCS model, as compared with 14.9% from other methods on the zoo data set [ 23 ]. In [ 24 ], sentiments of tweets were automatically classified as either positive or negative using an ensemble. Public tweet sentiment data sets were used in the experiment. The ensemble was formed using multinomial NB, SVM, RF, and logistic regression. An accuracy rate of 81.06% was achieved on a data set trained with only 0.03% of the obtained data [ 24 ].

2.5 Remarks

Based on the above review of various classifier configurations (two or more classifiers), it is clear that MCS has been used in various applications, including finance, medical, engineering and other sectors. The MCS configuration offers the advantage that the output is not constrained by one classifier, as a pool of classifiers provides the possibility of improved results. In the event that one classifier produces an incorrect prediction while its counterparts yield a correct one, the combined output can be correct, e.g. in accordance with the majority voting principle. The combined output is, therefore, able to reduce the number of incorrect predictions from a single classification method. The results from various MCS configurations reported in the literature are promising, with typically higher accuracy rates. However, MCS-based methods tend to run slower, since a higher computation load is required for execution of multiple classifiers, although this is not regularly reported in the literature. While better results often outweigh longer computational durations, it is useful to ensure that MCS configurations are feasible in terms of computational requirements for practical applications in real-world environments.

3. Classification methods

In this study, several standard machine learning models from H2O.ai were employed to establish an MCS model. Python, running in the Google Colab environment, was used. In the following sub-sections, majority voting and the BKS model by Huang and Suen [ 25 ] for decision combination are explained.

3.1 Majority voting

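Consider $M$ target classes, where each class is represented by $C_i$, $\forall i \in \Lambda = \{1, 2, \ldots, M\}$. The task of a classifier is to assign an input sample, $x$, to one of the $(M+1)$ classes, with the $(M+1)$th class denoting that the classifier rejects $x$.

A commonly used method for combining multiple classifier outputs is majority voting. If there are $K$ classifiers, denoted by $e_1, \ldots, e_K$, the task is to produce a combined result, $E(x) = j$, $j \in \{1, 2, \ldots, M, M+1\}$, from all $K$ predictions, $e_k(x) = j_k$, $k = 1, \ldots, K$. The number of votes can be computed using a binary function [ 26 ], i.e.,

$$T_k(x \in C_i) = \begin{cases} 1, & \text{if } e_k(x) = i \text{ and } i \in \Lambda \\ 0, & \text{otherwise} \end{cases} \tag{1}$$

Then, sum the votes from all $K$ classifiers for each $C_i$

$$T_E(x \in C_i) = \sum_{k=1}^{K} T_k(x \in C_i), \quad i = 1, \ldots, M \tag{2}$$

and the combined result, $E(x)$, can be determined by

$$E(x) = \begin{cases} j, & \text{if } T_E(x \in C_j) = \max_{i \in \Lambda} T_E(x \in C_i) \geq \lambda K \\ M+1, & \text{otherwise} \end{cases} \tag{3}$$

where $0 \leq \lambda \leq 1$ is a user-defined threshold that controls the confidence in the final decision [ 27 ].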
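The voting procedure can be sketched in a few lines. This is a minimal illustration, not the implementation used in the experiments: the function name `majority_vote` is hypothetical, classes are assumed to be encoded as integers 1 to M, and the winning class is assumed to be accepted when its vote count is at least λK, with M + 1 returned as the rejection class otherwise.

```python
from collections import Counter

def majority_vote(predictions, num_classes, lam=0.5):
    """Combine K class predictions (integers 1..num_classes) by majority voting.

    Returns the class with the most votes if its vote count is at least
    lam * K; otherwise returns num_classes + 1, the rejection class.
    Assumption: ties are broken in favour of the class encountered first.
    """
    k = len(predictions)
    # Only valid class labels count as votes; a rejecting classifier casts none.
    votes = Counter(p for p in predictions if 1 <= p <= num_classes)
    if not votes:
        return num_classes + 1
    winner, count = max(votes.items(), key=lambda kv: kv[1])
    return winner if count >= lam * k else num_classes + 1

combined = majority_vote([1, 1, 2], num_classes=2)
```

Raising `lam` makes the combiner more conservative: with `lam=1.0`, a decision is returned only when all K classifiers agree.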

A BKS is a $K$-dimensional space, where every dimension indicates the decision (i.e., predicted class) from one classifier. The intersection of the decisions from $K$ different classifiers occupies one unit in the BKS, e.g., $BKS(e_1(x) = j_1, \ldots, e_K(x) = j_K)$ denotes a unit where each $e_k$ produces a prediction $j_k$, $k = 1, \ldots, K$. In each BKS unit, there are $M$ partitions (cells), which accumulate the number of data samples actually belonging to $C_i$.

Consider an example with two classifiers. A two-dimensional (2–D) BKS can be formed, as given in Table 1 .

Every BKS unit, $U_{ij}$, contains $M$ cells, i.e., $n_1^H, \ldots, n_M^H$, where $H$ represents the overall prediction $e_1(x) = j_1, \ldots, e_K(x) = j_K$. The total number of data samples belonging to each class $C_i$ is recorded in the corresponding cell $n_i^H$, $i = 1, \ldots, M$. When an input sample, $x$, is presented, one of the BKS units is activated (also known as the focal unit) after obtaining the decisions from all $K$ classifiers. As an example, $U_{34}$ becomes active as the focal unit if $e_1(x) = 3$ and $e_2(x) = 4$. The total number of samples in the focal unit can be obtained by using

$$T_H = \sum_{i=1}^{M} n_i^H \tag{4}$$

and the class with the highest number of samples is identified

$$R_H = j, \quad \text{where } n_j^H = \max_{1 \leq i \leq M} n_i^H \tag{5}$$

The decision rule for determining the final outcome is

$$E(x) = \begin{cases} R_H, & \text{if } T_H > 0 \text{ and } \dfrac{n_{R_H}^H}{T_H} \geq \lambda \\ M+1, & \text{otherwise} \end{cases} \tag{6}$$

where $0 \leq \lambda \leq 1$ is a user-defined confidence threshold.

The BKS is similar to the confusion matrix. With the Bayesian approach, multiplication of evidence from the confusion matrices is required to estimate the joint probability of K events when combining the predictions. This step is eliminated in the BKS method, where a final decision is reached by assigning the input sample directly to the class that has gathered the greatest number of samples. This simplicity makes BKS a fast and efficient method for combining various decisions, as shown in [ 25 ] for classification of unconstrained handwritten numerals.

A hierarchical agent-based framework with the BKS for decision combination is proposed. As shown in Fig 1 , the framework has N agent groups in the base layer, with each group comprising multiple individual agents. The agents can be machine learning models, statistical methods as well as other classification algorithms. A manager agent is assigned to combine the predictions from each agent group using a BKS. Each manager agent sends its prediction to a decision combination module comprising another BKS in the top layer that produces the final combined prediction.


A numerical example is presented to better illustrate the BKS mechanism. In Table 2 , a simple binary classification problem is shown. There are two agents (classifiers) and six input samples, along with their predicted and actual classes. A BKS can be constructed, as shown in Table 3 . As an example, for input samples 1 and 4 ( Table 2 ), both agents 1 and 2 predict class 1, and the actual class is 1. This information is recorded in the highlighted (grey) BKS unit in Table 3 . Given a new test sample, the predictions from all agents are used to activate a BKS unit, and the combined predicted class (final output) is reached based on the highest number of samples from the majority class, as given in Eq ( 5 ). Whenever the highlighted (grey) BKS unit is activated during the test phase, the combined (final) prediction is Class 1.
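The mechanism can also be sketched in code. The class below is a minimal illustration under stated assumptions: the name `BKS` and its methods are hypothetical, and the six training samples are invented so that samples 1 and 4 match the description above (both agents predict class 1, actual class 1); the remaining rows are not taken from Table 2. An empty focal unit, or one whose best class holds less than a fraction λ of the unit's samples, yields the rejection class M + 1.

```python
from collections import Counter, defaultdict

class BKS:
    """Minimal behavior-knowledge space for K classifiers and M classes."""

    def __init__(self, num_classes, lam=0.5):
        self.num_classes = num_classes
        self.lam = lam
        # One unit per combination of predictions; each unit counts actual classes.
        self.units = defaultdict(Counter)

    def fit(self, agent_predictions, actual):
        for preds, y in zip(agent_predictions, actual):
            self.units[tuple(preds)][y] += 1
        return self

    def predict(self, preds):
        cell = self.units.get(tuple(preds))
        if not cell:                        # empty focal unit: no knowledge available
            return self.num_classes + 1
        total = sum(cell.values())
        best, n_best = cell.most_common(1)[0]
        return best if n_best / total >= self.lam else self.num_classes + 1

# Hypothetical training data: (agent 1, agent 2) predictions and actual classes.
agent_predictions = [(1, 1), (1, 2), (2, 2), (1, 1), (2, 2), (2, 2)]
actual = [1, 2, 2, 1, 1, 2]

bks = BKS(num_classes=2).fit(agent_predictions, actual)
both_predict_one = bks.predict((1, 1))   # unit holds two samples of class 1
unseen_pattern = bks.predict((2, 1))     # unit is empty, so the sample is rejected
```

In this toy table, the unit for predictions (2, 2) holds two samples of class 2 and one of class 1, so it returns class 2 at the default threshold but rejects when `lam` is raised above 2/3.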

4. Experiments

In this empirical evaluation, publicly available data sets from UCI Machine Learning Repository [ 28 ], KEEL Repository [ 29 ], and Kaggle [ 30 ] are used. A real-world data set is also used for evaluation.

Fig 2 shows the configuration of the hierarchical agent-based framework used in the experiments. It consists of three groups, where each group contains three agents. The three agents are Random Forest (RF), Generalized Linear Model (GLM), and Gradient Boosting Machine (GBM), which have been selected based on extensive experiments of individual and group performances. Three agent managers are established, each with a BKS module. The predictions from these three agent managers are sent to the decision combination module that has another BKS to produce the final predicted class.


Training is first conducted using randomized orders of the data samples, which is followed by a validation process. This in turn creates three group-based BKS modules (one for each group). The next step is combining the outputs from BKS modules 1 to 3 using training data with another randomized sequence, leading to the establishment of another overall (final) BKS module that combines the outputs from the previous three group-based BKS modules. Given a test sample, the group-based BKS outputs are combined again with the overall BKS module to produce a final predicted class for computation of the performance metrics, namely classification accuracy and F1-score.
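The two-level combination can be sketched as follows. This is a simplified illustration, not the authors' implementation: the agent predictions are hard-coded hypothetical values rather than outputs of trained RF, GLM, and GBM models, the same training samples are reused for both levels instead of a second randomized sequence, the confidence threshold is omitted, and `REJECT` is a hypothetical marker for an empty BKS unit.

```python
from collections import Counter, defaultdict

REJECT = -1  # hypothetical marker for "no decision" (empty BKS unit)

def build_bks(pred_rows, actual):
    """Count, for each combination of predictions, how often each actual class occurs."""
    table = defaultdict(Counter)
    for row, y in zip(pred_rows, actual):
        table[tuple(row)][y] += 1
    return table

def bks_predict(table, row):
    """Return the most frequent actual class in the focal unit, or REJECT if it is empty."""
    cell = table.get(tuple(row))
    return cell.most_common(1)[0][0] if cell else REJECT

def fit_hierarchy(group_preds, actual):
    """group_preds[g][s] holds the agent predictions of group g for training sample s."""
    group_tables = [build_bks(rows, actual) for rows in group_preds]
    # Each manager's output on the training samples becomes the input of the top BKS.
    manager_rows = [
        [bks_predict(t, rows[s]) for t, rows in zip(group_tables, group_preds)]
        for s in range(len(actual))
    ]
    return group_tables, build_bks(manager_rows, actual)

def predict_hierarchy(group_tables, top_table, sample_preds):
    """sample_preds[g] holds the agent predictions of group g for one test sample."""
    manager_row = [bks_predict(t, row) for t, row in zip(group_tables, sample_preds)]
    return bks_predict(top_table, manager_row)

# Hypothetical predictions: three groups of three agents, four training samples.
actual = [1, 0, 1, 0]
group_preds = [
    [(1, 1, 0), (0, 0, 0), (1, 1, 1), (0, 1, 0)],
    [(1, 0, 1), (0, 0, 1), (1, 1, 1), (0, 0, 0)],
    [(0, 1, 1), (0, 0, 0), (1, 0, 1), (1, 0, 0)],
]
group_tables, top_table = fit_hierarchy(group_preds, actual)
final = predict_hierarchy(group_tables, top_table, [(1, 1, 0), (1, 1, 1), (1, 0, 1)])
```

Because a manager's `REJECT` output is treated as just another symbol by the top-level table, a rejection at the group level does not necessarily force a rejection of the final decision, provided that pattern was observed during training.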

Classification accuracy and F1-score of each experiment are recorded using Eqs ( 7 ) and ( 8 ), respectively.
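For reference, both metrics follow from the confusion-matrix counts in their standard forms: accuracy = (TP + TN) / (TP + TN + FP + FN) and F1 = 2TP / (2TP + FP + FN). The sketch below computes them in plain Python. Treating the fraud class (label 1) as the positive class is an assumption, and the function name and sample labels are illustrative.

```python
def accuracy_and_f1(actual, predicted, positive=1):
    """Return (accuracy, F1) for binary labels, with `positive` as the positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    accuracy = (tp + tn) / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0  # F1 is defined as 0 when there are no positives at all
    return accuracy, f1

# Illustrative labels: one true positive, one false positive, one false negative.
acc, f1 = accuracy_and_f1([1, 0, 0, 1, 0, 0], [1, 0, 1, 0, 0, 0])
```

On heavily imbalanced data, accuracy alone can be misleading, since predicting every transaction as genuine already yields a very high accuracy; this is why the F1-score is reported alongside it.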

For a statistical performance comparison between majority voting and BKS, the sign test [ 31 ] is adopted. In the sign test, the number of wins follows a binomial distribution. Given a large number of cases, the number of wins under the null hypothesis is approximately distributed according to $N(n/2, \sqrt{n}/2)$, allowing the use of the $z$-test, i.e., should the number of wins be at least $n/2 + 1.96\sqrt{n}/2$, then the outcome is statistically significant with $p < 0.05$. The number of wins required for a comparison of $n = 25$ experimental results is [ 32 ]: 18 wins for (the significance level) α = 0.05 (i.e., 95% confidence level) and 17 wins for a less stringent α = 0.1 (i.e., 90% confidence level). In addition, for a more stringent setting of α = 0.01 (i.e., 99% confidence level), a total of 19 wins is required.
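The win thresholds quoted above can be checked with the normal approximation, i.e., the smallest integer that is at least n/2 + z·√n/2, where z is the two-tailed critical value for the chosen α. The sketch below uses only the Python standard library; the function name `required_wins` is hypothetical.

```python
import math
from statistics import NormalDist

def required_wins(n, alpha):
    """Smallest number of wins out of n that is significant at level alpha
    under the two-tailed normal approximation to the sign test."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # e.g., about 1.96 for alpha = 0.05
    return math.ceil(n / 2 + z * math.sqrt(n) / 2)

required = {alpha: required_wins(25, alpha) for alpha in (0.1, 0.05, 0.01)}
# → {0.1: 17, 0.05: 18, 0.01: 19}
```

For n = 25, the threshold at α = 0.05 is 12.5 + 1.96 × 2.5 = 17.4, which rounds up to 18 wins.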

4.2 Benchmark data

A total of 10 data sets are used in the experiments. The details of each data set, i.e., B1 to B10, are shown in Table 4 , including the number of instances and features as well as the imbalanced ratio (IR) information.

The accuracy rates and F1 scores are shown in Tables 5 and 6, respectively. In general, the BKS results are slightly higher than those from majority voting for both performance indicators.

To evaluate the robustness of BKS, the data samples are corrupted with noise at 10% and 20% levels. A total of 25 runs are conducted for each data set, and the average results are listed in Table 7 . Fig 3 indicates the numbers of wins pertaining to the BKS against majority voting. The three bars for each dataset represent the data with no noise (-0), with 10% noise (-0.1), and with 20% noise (-0.2).
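One simple way to corrupt a data set at a given noise level is to flip a fixed fraction of the class labels. The sketch below does this for binary labels. It is an assumption that the noise is applied to labels in this way: this section does not state whether labels, features, or both are corrupted, and the function name `flip_labels` is hypothetical.

```python
import random

def flip_labels(labels, noise_level, seed=0):
    """Return a copy of binary labels (0/1) with a fraction `noise_level` flipped."""
    rng = random.Random(seed)
    n_flip = round(noise_level * len(labels))
    flip_idx = set(rng.sample(range(len(labels)), n_flip))
    return [1 - y if i in flip_idx else y for i, y in enumerate(labels)]

# Illustrative labels: 90 genuine (0) and 10 fraud (1) samples, corrupted at 20%.
clean = [0] * 90 + [1] * 10
noisy = flip_labels(clean, noise_level=0.2)
```

Only the training labels would normally be corrupted, with the test labels left clean so that the reported metrics reflect true performance; using a different seed per run gives the 25 independent repetitions.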


To evaluate whether BKS performs better than majority voting from the statistical perspective, a two-tailed sign test is used, as detailed in Section 4.1. Fig 3 shows the number of wins of BKS over majority voting from the experimental results (plotted at 16 wins and above). BKS achieves at least 18 wins out of 25 experimental runs in all ten noisy data sets (10% and 20% noise levels), indicating its superior performance over majority voting in handling noisy data samples for α = 0.05 (95% confidence level). When a more stringent statistical significance level of α = 0.01 (i.e., 99% confidence level) is used for evaluation, BKS outperforms majority voting in 9 out of 10 data sets with a noise level of 20%. This outcome indicates the usefulness of BKS over majority voting in mitigating the negative effect of noise on performance.

To compare the effectiveness of BKS with that of other methods in the literature, the F1 scores are compared with the published results of GEP [ 26 ] and CUSBoost [ 33 ] in Table 8 . CUSBoost [ 33 ] achieves the worst performance, while GEP [ 26 ] achieves results close to those from BKS and majority voting. Overall, BKS achieves the highest F1 scores in four out of six data sets, while its scores on the remaining two are lower by 0.01 as compared with those of majority voting.

4.3 Real-world data

This evaluation focuses on real financial transaction records (available in [ 34 ]) from September to November 2017 in a Southeast Asia financial firm. As indicated in [ 35 ], Southeast Asia is one of the fastest growing regions over the years, with a gross domestic product growth rate of over 6%. In this experiment, a total of 60,595 transaction records from 9,685 customers are available for evaluation. The transactions cover activities in 23 countries, with various spending items ranging from online website purchases to grocery shopping. A total of 28 transactions have been identified by the firm and labeled as fraud cases, with the remaining being genuine, or non-fraud cases.

Each transaction record consists of the account number, transaction amount, date, time, device type used, merchant category code (MCC), country, and type of transaction. The account number is anonymized to ensure privacy of customers. In addition to the nine original features, feature aggregation is conducted to generate eight new features. These aggregated features utilise the transaction amount, acquiring country, MCC, and device type over a period of three months. A summary of the features is shown in Table 9 .

Feature importance scores can provide useful information of the data set. The scores can highlight the relevance of each feature for classification. Based on the 17 features, we carry out a feature importance study using the Decision Tree (DT), Random Forest (RF), and XGBoost classifiers. Fig 4 illustrates the results. It can be observed that all the features depict different levels of importance, and feature 12 (i.e., the count of unique acquiring country) appears to be the most important feature in all three classifiers. The remaining aggregated features (features 10 to 17) generally have slightly higher importance scores as compared with those of the original features.

An external file that holds a picture, illustration, etc.
Object name is pone.0260579.g004.jpg

Similar to the benchmark data experiment, noise is added with increment of 10% to 40% to this real-world data set. Table 10 summarizes the results. BKS outperforms majority voting when the level of noise increases, indicating its robustness against noisy data. When the noise level increases to 20% and above, BKS outperforms majority voting 18 times (20% and 30% noise) and 19 times (40% noise), respectively. This outcome positively signifies the statistical superior performance of BKS over majority voting at 95% confidence level ( α = 0.05) for undertaking noisy data (20% noise and above) in this real-world experiment.

Table 11 lists that F1 scores of the experiments. When no noise is added, the F1 scores for both BKS and voting are the same. Again, for noisy data sets, BKS consistently achieves higher F1 scores, as compared with those from majority voting.

In addition to the experiments with additive noise, two experiments with under-sampling methods are conducted. Two different ratios of minority (fraud transactions) to majority (genuine transactions) are evaluated, i.e., 1:100 and 1:500, and the overall results are shown in Table 12 . Obviously, under-sampling does not help improve the voting results, while the use of 1:100 ratio enhances the BKS results slightly, as the data set is much more balanced, as compared to the original ratio.

5. Conclusions

A multi-classifier system has been designed to address the classification challenge pertaining to credit card fraud. Specifically, the combination of a hierarchical agent-based framework with the BKS as a decision-making method has been constructed for classifying transaction records of credit cards into fraudulent and non-fraudulent cases. This combination allows the accumulation of knowledge and yields better results over time. To evaluate the proposed multi-classifier system, a series of experiments using publicly available data sets and real financial records have been conducted. The results from the ten benchmark data sets indicate the performance of BKS is better than that of the majority voting method for decision combination. In addition to noise-free data, noise up to 20% has been added to the data samples, in order to evaluate the robustness of the proposed method in noisy environments. Based on the statistical sign test, the BKS-based framework offers statistically superior performance over the majority voting method.

For the real transaction records from a financial firm, up to 40% noise has been added to the data samples. When the noise levels reach 20% and above, the BKS-based framework outperforms the majority voting method, with statistical significance at the 95% confidence level, as ascertained by the sign test. Based on the outcomes from both benchmark and real-world data, the proposed BKS-based framework is effective for detecting fraudulent credit card cases.

In future work, we will address several limitations of the current BKS models. Firstly, it is possible for the BKS table to contain empty cells, leading to no prediction for a given data sample. This situation generally arises when the number of classifiers increases, i.e., a larger knowledge space is formed. In addition, noisy data sets, particularly those with noise in class labels, result in inaccurate information captured in the BKS cells, leading to erroneous predictions. We intend to exploit probabilistic methods, such as Bayesian inference, to interpret the BKS prediction and enhance its robustness in undertaking noisy data classification problems.
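The empty-cell limitation can be seen in a minimal behavior-knowledge space sketch. The table is keyed by the tuple of base-classifier decisions and stores the class counts observed for that tuple during training. This is an illustration under simplifying assumptions, not the hierarchical agent-based implementation of this work; the class and function names are hypothetical.

```python
from collections import Counter, defaultdict

class BKSTable:
    """Lookup table from a tuple of classifier decisions to observed class counts."""

    def __init__(self):
        self.cells = defaultdict(Counter)

    def fit(self, decisions, labels):
        # Each training sample increments the count of its true label
        # in the cell indexed by the classifiers' joint decision.
        for d, y in zip(decisions, labels):
            self.cells[tuple(d)][y] += 1
        return self

    def predict(self, decision):
        cell = self.cells.get(tuple(decision))
        if not cell:
            return None  # empty cell: no prediction is available
        return cell.most_common(1)[0][0]

def majority_vote(decision):
    """Most frequent label among the base-classifier decisions."""
    return Counter(decision).most_common(1)[0][0]

# Hypothetical training data for three base classifiers (1 = fraud, 0 = genuine).
train_decisions = [(1, 0, 0), (1, 0, 0), (1, 0, 0), (0, 0, 0), (1, 1, 1)]
train_labels = [1, 1, 0, 0, 1]
bks = BKSTable().fit(train_decisions, train_labels)

seen = bks.predict((1, 0, 0))    # the cell holds two fraud labels and one genuine label
voted = majority_vote((1, 0, 0))
unseen = bks.predict((0, 1, 1))  # this combination was never observed
print(seen, voted, unseen)
```

Here the table predicts fraud for the pattern (1, 0, 0) while majority voting predicts genuine, and the unseen pattern (0, 1, 1) yields no prediction. With k binary classifiers there are 2^k possible cells, which is why empty cells become more likely as classifiers are added.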

Additionally, we will investigate imbalanced data issues using a combination of over-sampling and under-sampling techniques. The effect of these different techniques toward classification performance will be analyzed and compared systematically using statistical hypothesis tests. We will also develop an online version of the proposed model. The model will be able to learn data samples on-the-fly and keep improving its prediction accuracy incrementally. This online learning model will be applied to various financial problems as well as other classification tasks.

Funding Statement

The author(s) received no specific funding for this work.

Data Availability



COMMENTS

  1. (PDF) Credit Card Fraud Detection

    This is a systematic literature review to reflect the previous studies that dealt with credit card fraud detection and highlight the different machine learning techniques to deal with this problem ...

  2. A systematic review of literature on credit card cyber fraud detection

    Credit card fraud detection using weighted support vector machine. Journal: 2020: Zhang, Bhandari & Black (2020) A137: Machine learning methods for analysis fraud credit card transaction. Journal: 2019: Saragih et al. (2019) A138: A review on credit card fraud detection using machine learning. Journal: 2019: Shirgave et al. (2019) A139

  3. Review of Machine Learning Approach on Credit Card Fraud Detection

    Massive usage of credit cards has caused an escalation of fraud. Usage of credit cards has resulted in the growth of online business advancement and ease of the e-payment system. The use of machine learning (methods) are adapted on a larger scale to detect and prevent fraud. ML algorithms play an essential role in analysing customer data. In this research article, we have conducted a ...

  4. Credit card fraud detection in the era of disruptive technologies: A

    Though largely tested for credit card fraud detection, both KNN and SVM are computationally expensive and may show reduced performance in detecting credit card fraud for large datasets. One of the unsupervised analogy-based solutions suggested in the literature to detect fraudulent transactions is the use of recommender systems.

  5. Credit Card Fraud Detection: A Systematic Review

    Credit Card Fraud Detection (CCFD) is a challenging research undergone by the research community as the fraudsters change their behavioral pattern now and then which becomes an alarm for the banks to set a solution. ... Shami, A., Essex, A.: Data mining techniques in intrusion detection systems: a systematic literature review. IEEE Access 6 ...

  6. PDF A systematic review of literature on credit card cyber fraud detection

    How to cite this article Marazqah Btoush EAL, Zhou X, Gururajan R, Chan KC, Genrich R, Sankaran P. 2023. A systematic review of literature on credit card cyber fraud detection using machine and deep learning. PeerJ Comput. Sci. 9:e1278 DOI 10.7717/peerj-cs.1278 Submitted 28 December 2022 Accepted 15 February 2023 Published 17 April 2023

  7. Credit Card Fraud Detection using Machine Learning Algorithms

    Abstract. Credit card frauds are easy and friendly targets. E-commerce and many other online sites have increased the online payment modes, increasing the risk for online frauds. Increase in fraud rates, researchers started using different machine learning methods to detect and analyse frauds in online transactions.

  8. Fraud detection and prevention in e-commerce: A systematic literature

    Credit card fraud detection is a common topic some studies address. For instance, Sorournejad et al. (2016) review credit card fraud detection techniques into two categories: supervised or unsupervised. They present a taxonomy detailing the different techniques found in the literature with a focus on the two categories.

  9. Credit Card Fraud Detection using Machine Learning: A Systematic

    A systematic literature review that systematically reviews and synthesizes the existing literature on machine learning (ML)-based fraud detection showed that support vector machine and artificial neural network are popular ML algorithms used for fraud detection, and credit card fraud is the most popular fraud type addressed using ML techniques.

  10. PDF Literature Review On Identification Of Fraudulent Credit Card Fraud

    The purpose of this literature review is to provide an overview of current research and achievements in credit card fraud detection. It discusses the primary approaches, algorithms, and datasets ... They conducted a survey on credit card fraud detection, taking into account the three main types of fraud: insurance, corporate, and bank. The two ...

  11. Credit Card Fraud Detection Techniques: A Review

    Because there's no literature on credit card fraud algorithms, benefits and drawbacks will be present, and their limitations will be hidden. ... Mehndiratta S, Gupta K (2018) Credit card fraud detection techniques: a review. IJCSMC 8(8) Google Scholar Kazemi ZH (2017) Using deep networks for fraud detection in the credit card transactions. In ...

  12. Credit card fraud detection using a hierarchical behavior ...

    With the advancement in machine learning, researchers continue to devise and implement effective intelligent methods for fraud detection in the financial sector. Indeed, credit card fraud leads to billions of dollars in losses for merchants every year. In this paper, a multi-classifier framework is designed to address the challenges of credit card fraud detections. An ensemble model with ...

  13. Review on Credit Card Fraud Detection Techniques

    Online transactions have taken the world by storm in today's society and credit cards stand one of the maximum used expense methods. Because of this popularity, fraud has arisen in this field, which is called credit card fraud. Credit card fraud has become a worldwide concern. This research work examines the identification of credit card fraud techniques. Different machine learning, data ...

  14. Credit Card Fraud Detection Using Machine Learning

    Reports of Credit card fraud in the US rose by 44.7% from 271,927 in 2019 to 393,207 reports in 2020. There are two kinds of credit card fraud, the first one is by having a credit card account opened under your name by an identity thief, reports of this fraudulent behavior increased 48% from 2019 to 2020.

  15. PDF Literature Review of Different Machine Learning Algorithms for Credit

    Literature Review of Different Machine Learning Algorithms for Credit Card Fraud Detection. Nayan Uchhana, Ravi Ranjan, Shashank Sharma, Deepak Agrawal, Anurag Punde. Abstract: Every year fraud cost generated in the economy is more than $4 trillion internationally. This is unsurprising, as the return on investment for fraud can be massive.

  16. A Review of Credit Card Fraud Detection Using Machine Learning

    Therefore, the emergence of the credit card use and the increasing number of fraudsters have generated different issues that concern the banking sector. Unfortunately, these issues obstruct the performance of Fraud Control Systems (Fraud Detection Systems & Fraud Prevention Systems) and abuse the transparency of online payments.

  17. Credit Card Fraud Detection Using Machine Learning: A Review

    DOI: 10.22214/ijraset.2023.55377 Corpus ID: 261020326; Credit Card Fraud Detection Using Machine Learning: A Review @article{2023CreditCF, title={Credit Card Fraud Detection Using Machine Learning: A Review}, author={Mehvish . and Satish Saini and Ravinder Pal Singh}, journal={International Journal for Research in Applied Science and Engineering Technology}, year={2023}, url={https://api ...

  18. Review Financial Fraud: A Review of Anomaly Detection Techniques and

    In 2012, Zareapoor et al. conducted a survey focusing on the specific statistical and machine learning techniques most commonly used for credit card fraud detection (Zareapoor, 2012). In this literature, a historical background is given on each technique and a high-level overview of how they work or operate in credit card fraud detection systems.

  19. A Critical review of Credit Card Fraud Detection Techniques

    Abstract: Credit card fraud is one of the most important threats that affect people as well as companies across the world, particularly with the growing volume of financial transactions using credit cards every day. This puts the security of financial transactions at serious risk and calls for a fundamental solution. In this paper, we discuss various techniques of credit card fraud detection ...

  20. PDF Credit Card Fraud Detection Techniques: A Review

    In the first phase, the feature extraction technique is applied and in the second phase, classification is applied for the fraud transaction detection. In this review paper various techniques of credit card fraud detection are reviewed. In future hybrid approach will be designed for the credit card fraud detection.

  21. An intelligent payment card fraud detection system

    Aggregated features. In this section, we review feature aggregation for fraud detection. Among various feature aggregation methods, feature averaging summarizes the cardholder activities by comparing the spending habits and patterns (Russac et al., 2018). In Bahnsen et al. (), a credit card-related database for fraud detection was examined. By analyzing the periodic behaviors over time, an ...

  22. PDF Announces the Ph.D. Dissertation Defense of Azadeh Abdollah Zadeh

    analysis involves utilizing not only the Credit Card Fraud Detection Dataset but also the Medicare Part D dataset. The findings show the comparative ... N. Seliya, A. Abdollah Zadeh, and T. M. Khoshgoftaar. A literature review on one-class classification and its potential applications in big data. Journal of Big Data, 8(1):1-31, 2021.

  23. Credit card fraud detection using a hierarchical behavior-knowledge

    In this paper, we focus on the design and application of an ensemble classification model for credit card fraud detection, which is regarded as a significant problem in the financial sector. Indeed, ... The organization of this paper is as follows. A literature review on different types of MCS is presented in Section 2. Designs of the MCS model ...

  24. Credit card fraud (docx)

    While credit card fraud is a pervasive problem, there are several steps you can take to protect yourself and minimize the risk of falling victim to fraudulent activity: 1. **Monitor Your Accounts**: Regularly review your credit card statements and transaction history for any unauthorized or suspicious activity. Report any discrepancies or unfamiliar charges to your card issuer immediately.

  25. Federal Register :: Credit Card Penalty Fees (Regulation Z)

    Based on a 2022 review of about 2,500 credit card agreements from over 500 card issuers (as discussed in part II.E), the CFPB also noted that smaller issuers appeared to charge lower late fee amounts, and therefore, any reduction in late fee amounts would have a proportionately smaller impact on their late fee income.