The Development of Stacking Techniques in Machine Learning for Breast Cancer Detection

Lucky Lhaura Van FC, M. Khairul Anam, Saiful Bukhori, Abd Kadir Mahamad, Sharifah Saon, Rebecca La Volla Nyoto

Abstract


This study addresses the challenge of accurately detecting breast cancer with machine learning (ML) models, particularly on imbalanced datasets, which often bias a model toward the majority class. To counter this, the Synthetic Minority Over-sampling Technique (SMOTE) was applied both to balance the class distribution and to improve the model's sensitivity to malignant tumors, which are underrepresented in the dataset. SMOTE generated synthetic samples for the minority class without introducing overfitting, improving the model's generalization to unseen data. In addition, AdaBoost was employed as the meta model in the stacking framework, chosen for its ability to focus on misclassified instances during training and thereby boost the overall performance of the combined base models. The study evaluates several models and combinations. K-Nearest Neighbors (KNN) + SMOTE achieved 97% accuracy, precision, recall, and F1-score. C4.5 + Hyperparameter Tuning + SMOTE reached 95% on all four metrics. The stacking model with Logistic Regression (LR) as the meta model and SMOTE also achieved 97% on all four metrics. The best result was obtained with Stacking AdaBoost + Hyperparameter Tuning + SMOTE, which reached an accuracy of 98%. These findings highlight the effectiveness of combining SMOTE with stacking techniques to develop robust predictive models for medical applications. The novelty of this study lies in integrating SMOTE with advanced stacking methods, particularly those using AdaBoost and Logistic Regression as meta models, to address class imbalance in medical datasets. Future work will explore deploying the model in clinical settings for accurate and timely breast cancer detection.
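To make the oversampling step concrete, the following is a minimal sketch of the interpolation idea behind SMOTE, written in plain Python. It is not the authors' implementation: the function name `smote`, its parameters, and the toy feature vectors are assumptions made for illustration, and the base point is drawn at random rather than by iterating over every minority sample as in the classic formulation. In practice a maintained implementation, such as the `SMOTE` class in the imbalanced-learn library, would normally be used instead.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours.

    Hypothetical helper for illustration; a simplified variant of SMOTE.
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbours than other samples
    if k < 1:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority-class neighbours of the base point (excluding itself)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> mate, in [0, 1)
        synthetic.append([b + gap * (m - b) for b, m in zip(base, mate)])
    return synthetic


# Toy, made-up feature vectors (not taken from the study's dataset).
benign = [[1.0, 1.2], [0.9, 1.0], [1.1, 0.8], [1.3, 1.1], [0.8, 0.9], [1.0, 1.0]]
malignant = [[3.0, 3.2], [3.4, 2.9], [2.8, 3.1]]

# Oversample the minority class until both classes have the same size.
new_points = smote(malignant, n_new=len(benign) - len(malignant), k=2)
balanced_malignant = malignant + new_points
print(len(benign), len(balanced_malignant))  # prints: 6 6
```

Because each synthetic point lies on a line segment between two real minority samples, the new points stay inside the region the minority class already occupies. Oversampling of this kind is usually applied to the training split only, so that the test data contains no synthetic samples.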


Keywords


Machine Learning; Stacking; AdaBoost; Hyperparameter Tuning; SMOTE

Journal of Applied Data Sciences

ISSN : 2723-6471 (Online)
Organized by : Computer Science and Systems Information Technology, King Abdulaziz University, Kingdom of Saudi Arabia.
Website : http://bright-journal.org/JADS
Email : taqwa@amikompurwokerto.ac.id (principal contact)
    support@bright-journal.org (technical issues)

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.