Data Attribute Selection with Information Gain to Improve Credit Approval Classification Performance using K-Nearest Neighbor Algorithm

  • Ivandari Ivandari STMIK Widya Pratama Pekalongan, Indonesia
  • Tria Titiani Chasanah STMIK Widya Pratama Pekalongan
  • Sattriedi Wahyu Binabar STMIK Widya Pratama Pekalongan
  • M. Adib Al Karomi STMIK Widya Pratama Pekalongan

Abstract

Credit is a common feature of modern economic life. In practice, credit means either borrowing a sum of money or purchasing goods and paying for them in installments within an agreed timeframe. Unfavorable economic conditions combined with high consumer needs lead many people to buy goods on credit. Unfortunately, these needs are not always matched by the ability to make payments according to the initial agreement. When this happens, repayment is disrupted, a condition known as “bad credit”. This research uses two datasets: a public credit card dataset from the UCI repository and a private credit approval dataset from a local bank. The information gain algorithm is used to calculate a weight for each attribute. The results show that the attributes receive different weights and that not all attributes influence the classification result. For example, attribute A1 in the UCI dataset and the loan type attribute in the local dataset both have an information gain weight of 0 (zero). Classification with the K-Nearest Neighbors algorithm shows an improvement of 7.53% for the UCI dataset and 3.26% for the local dataset after feature selection is applied to both datasets.
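
The pipeline described in the abstract (weight each attribute by information gain, drop attributes with zero weight, then classify with K-Nearest Neighbors) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy rows, the attribute names in the comments, the Hamming distance, the value of k, and the near-zero threshold are all assumptions chosen only to keep the example small and self-contained.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Entropy of the labels minus the weighted entropy after splitting on attr."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def select_attributes(rows, labels, threshold=1e-12):
    """Return the indices of attributes whose gain exceeds the threshold."""
    return [a for a in range(len(rows[0]))
            if information_gain(rows, labels, a) > threshold]

def knn_predict(train_rows, train_labels, query, attrs, k=3):
    """Majority vote among the k nearest rows (Hamming distance over attrs)."""
    def distance(row):
        return sum(row[a] != query[a] for a in attrs)
    nearest = sorted(range(len(train_rows)),
                     key=lambda i: distance(train_rows[i]))[:k]
    return Counter(train_labels[i] for i in nearest).most_common(1)[0][0]

# Hypothetical toy data, invented for illustration only.
# Columns: (loan_type, income, history)
rows = [
    ("A", "high", "good"),
    ("B", "high", "good"),
    ("A", "low", "bad"),
    ("B", "low", "bad"),
    ("A", "high", "bad"),
    ("B", "high", "bad"),
    ("A", "low", "good"),
    ("B", "low", "good"),
]
labels = ["approve", "approve", "reject", "reject",
          "approve", "approve", "reject", "reject"]

# Attributes with zero information gain are dropped before classification.
selected = select_attributes(rows, labels)
prediction = knn_predict(rows, labels, ("A", "high", "bad"), selected, k=3)
```

In this made-up data only the income column separates the classes, so the other two columns receive a weight of zero and are excluded from the distance computation, mirroring the role that attribute A1 and the loan type attribute play in the abstract.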

Published
2017-10-17
How to Cite
IVANDARI, Ivandari et al. Data Attribute Selection with Information Gain to Improve Credit Approval Classification Performance using K-Nearest Neighbor Algorithm. International Journal of Islamic Business and Economics (IJIBEC), [S.l.], p. 13-22, oct. 2017. ISSN 2615-420X. Available at: <http://e-journal.iainpekalongan.ac.id/index.php/IJIBEC/article/view/882>. Date accessed: 15 aug. 2018. doi: https://doi.org/10.28918/ijibec.v1i1.882.
Section
Articles

Keywords

KNN Accuracy, Feature selection, Credit Approval