Determining Important Features for Dengue Diagnosis using Feature Selection Methods

Yulianti Paula Bria, Paskalis Andrianus Nani, Yovinia Carmeneja Hoar Siki, Natalia Magdalena Rafu Mamulak, Emiliana Metan Meolbatak, Robertus Dole Guntur

Abstract


This research aims to determine the important features, including symptoms and risk factors, for dengue diagnosis. The dataset was obtained from medical records of patients with dengue and non-dengue diseases, collected from two hospitals in Indonesia. Four feature selection methods were used to identify significant features: feature importance, recursive feature elimination, correlation matrix, and univariate feature selection (KBest). Feature importance employed a tree-based classifier to derive an importance score for each feature. Recursive feature elimination employed a machine learning classifier to choose the most important features from the dataset. The correlation matrix selected features based on the correlation between each feature and the target. KBest chose the best features based on univariate statistical tests. Important features were also gathered from fifteen Indonesian medical doctors to confirm the results. Six machine learning techniques were used for dengue prediction. For the best combination of features, the random forest classifier yielded the highest accuracy, 0.93 (LR: 0.90 (0.04), KNN: 0.89 (0.04), XGBoost: 0.91 (0.03), RF: 0.93 (0.04), NB: 0.88 (0.09), SVM: 0.89 (0.04)), and the highest precision, 0.90 (LR: 0.86 (0.22), KNN: 0.67 (0.14), XGBoost: 0.77 (0.13), RF: 0.90 (0.13), NB: 0.66 (0.20), SVM: 0.66 (0.18)). The significant features for dengue diagnosis identified in this study are fever, fever duration, headache, muscle and joint pain, nausea, vomiting, abdominal pain, shivering, malaise, loss of appetite, shortness of breath, rash, nosebleed, bitter taste in the mouth, temperature, and age. Arthralgia (joint pain) and myalgia (muscle pain) are the most significant features for dengue prediction. This knowledge is pivotal for educating the public to seek medical advice when dengue symptoms appear, so that severe conditions can be avoided, and it is important for medical doctors as a starting point for clinical dengue diagnosis.
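
The four feature selection methods named in the abstract can be illustrated with a minimal scikit-learn sketch. It is not the authors' implementation: the data are synthetic (generated with `make_classification`, not the hospital records), and the choice of `ExtraTreesClassifier` as the tree-based classifier, `LogisticRegression` as the estimator inside recursive feature elimination, `f_classif` as the univariate test, and `k = 4` selected features are all assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import RFE, SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in data (assumption): NOT the study's medical records.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=4,
    n_redundant=0, random_state=0,
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]  # hypothetical names
k = 4  # number of features to keep (assumption)


def top_k(scores, k):
    """Return the names of the k features with the highest scores."""
    order = np.argsort(scores)[::-1][:k]
    return {feature_names[i] for i in order}


# 1. Feature importance: scores derived from a tree-based classifier.
tree = ExtraTreesClassifier(n_estimators=200, random_state=0).fit(X, y)
by_importance = top_k(tree.feature_importances_, k)

# 2. Recursive feature elimination: repeatedly fit a classifier and
#    drop the weakest feature until k remain.
rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=k).fit(X, y)
by_rfe = {n for n, keep in zip(feature_names, rfe.support_) if keep}

# 3. Correlation: absolute Pearson correlation of each feature with the target.
corr = np.array([abs(np.corrcoef(X[:, i], y)[0, 1]) for i in range(X.shape[1])])
by_correlation = top_k(corr, k)

# 4. Univariate selection (SelectKBest) with an ANOVA F-test (assumption).
kbest = SelectKBest(score_func=f_classif, k=k).fit(X, y)
by_kbest = {n for n, keep in zip(feature_names, kbest.get_support()) if keep}

selections = {
    "feature importance": by_importance,
    "recursive feature elimination": by_rfe,
    "correlation": by_correlation,
    "KBest": by_kbest,
}
for method, chosen in selections.items():
    print(f"{method}: {sorted(chosen)}")
```

Comparing the four printed sets shows which features the methods agree on; in a study like this one, such agreement could then be checked against the features nominated by clinicians.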


Keywords


Dengue Fever; Dengue Diagnosis; Feature Selection; Machine Learning

Journal of Applied Data Sciences

ISSN : 2723-6471 (Online)
Organized by : Computer Science and Systems Information Technology, King Abdulaziz University, Kingdom of Saudi Arabia.
Website : http://bright-journal.org/JADS
Email : taqwa@amikompurwokerto.ac.id (principal contact)
    support@bright-journal.org (technical issues)

 This work is licensed under a Creative Commons Attribution-ShareAlike 4.0