Accuracy Analysis of K-Nearest Neighbor and Naïve Bayes Algorithm in the Diagnosis of Breast Cancer

Irma Handayani
Ikrimach Ikrimach


The medical field holds many records of patients with disease, among them data on breast cancer. Data mining is the process of extracting previously unknown information from data. It uses pattern recognition techniques, together with statistics and mathematics, to find patterns in historical data or cases. One of the main tasks of data mining is classification. A classification dataset has one target attribute, also called the label attribute, whose value is predicted for new data on the basis of the other attributes observed in past data. The number of attributes can affect the performance of an algorithm, so when the classification is inaccurate the researcher needs to re-check each preceding stage to look for errors. The algorithm that performs best on one type of data is not necessarily good on another. For this reason, this study applies and compares the K-Nearest Neighbor and Naïve Bayes algorithms on this problem. The research method was to prepare data from the breast cancer dataset, train and test the classifiers, and then perform a comparative analysis. The research target is to identify the better algorithm for classifying breast cancer, so that patients can be predicted as having malignant or benign breast cancer from their existing parameters. The resulting pattern can be used as a diagnostic aid so that the disease is detected earlier, which is expected to reduce the mortality rate from breast cancer. In the comparison, K-Nearest Neighbor achieved an accuracy of 95.79% and Naïve Bayes 93.39%.
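The prepare, train, test, and compare workflow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic two-feature records rather than the breast cancer dataset, the 70/30 split and k = 5 are assumed values, the Naïve Bayes variant is assumed to be Gaussian, and all function names are hypothetical. The accuracies it prints are unrelated to the figures reported above.

```python
import math
import random
from collections import Counter


def make_toy_data(n_per_class=50, seed=0):
    """Synthetic two-feature records (assumption: NOT the breast cancer dataset)."""
    rng = random.Random(seed)
    data = []
    for label, center in (("benign", 2.0), ("malignant", 6.0)):
        for _ in range(n_per_class):
            features = [rng.gauss(center, 0.5), rng.gauss(center, 0.5)]
            data.append((features, label))
    rng.shuffle(data)
    return data


def knn_predict(train, x, k=5):
    """Majority vote among the k training records closest to x (Euclidean distance)."""
    nearest = sorted(train, key=lambda rec: math.dist(rec[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def nb_fit(train):
    """Estimate a prior per class and a mean and variance per feature."""
    by_class = {}
    for features, label in train:
        by_class.setdefault(label, []).append(features)
    model = {}
    for label, rows in by_class.items():
        stats = []
        for column in zip(*rows):
            mean = sum(column) / len(column)
            # Small constant keeps the variance strictly positive.
            var = sum((v - mean) ** 2 for v in column) / len(column) + 1e-9
            stats.append((mean, var))
        model[label] = (len(rows) / len(train), stats)
    return model


def nb_predict(model, x):
    """Pick the class with the highest log posterior, treating features as independent."""
    def log_posterior(label):
        prior, stats = model[label]
        score = math.log(prior)
        for value, (mean, var) in zip(x, stats):
            score += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)


def accuracy(predict, test):
    """Fraction of test records whose predicted label matches the true label."""
    return sum(predict(x) == y for x, y in test) / len(test)


data = make_toy_data()
split = int(0.7 * len(data))  # assumed 70/30 train/test split
train, test = data[:split], data[split:]
model = nb_fit(train)

knn_acc = accuracy(lambda x: knn_predict(train, x, k=5), test)
nb_acc = accuracy(lambda x: nb_predict(model, x), test)
print(f"k-NN accuracy: {knn_acc:.2%}")
print(f"Naive Bayes accuracy: {nb_acc:.2%}")
```

Both classifiers are scored on the same held-out records, which is what makes the two accuracy figures comparable. On real data, k and the feature scaling would normally be tuned before drawing conclusions.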


How to Cite
I. Handayani and I. Ikrimach, “Accuracy Analysis of K-Nearest Neighbor and Naïve Bayes Algorithm in the Diagnosis of Breast Cancer”, INFOTEL, vol. 12, no. 4, Nov. 2020.

