Sentiment Analysis on Slang Enriched Texts Using Machine Learning Approaches
Abstract
This study applies machine learning to sentiment analysis of slang-enriched user reviews. Three classifiers, Naive Bayes, Support Vector Machine (SVM), and Random Forest, assign each review to a Positive, Negative, or Neutral category, while slang normalization addresses the challenges posed by informal, conversational language. A lexicon-based scoring method was employed to standardize slang terms such as “gak,” “aja,” and “banget,” ensuring they are treated consistently in the sentiment analysis. The results indicate that Neutral sentiment dominates the dataset (51%), followed by Negative (28%) and Positive (21%), with lexicon-based scores confirming this distribution. Negative sentiment exhibits a broader intensity range, reflecting user dissatisfaction primarily related to network quality, service reliability, and pricing, as evident from recurring terms like “sinyal” (signal), “jaringan” (network), and “mahal” (expensive). Word cloud visualizations reinforce these findings, highlighting the prevalence of these concerns in user feedback. Performance evaluation shows that SVM and Random Forest achieved the highest accuracy (96%), well above Naive Bayes (73%), demonstrating their effectiveness in handling high-dimensional text data and classifying slang-rich content. These findings underscore the importance of slang normalization in preprocessing, as it significantly enhances sentiment classification accuracy. The study also provides actionable insights for service providers, helping them identify and address key sources of user dissatisfaction. Future research could explore deep learning models such as BERT and LSTM to capture contextual relationships within text data, and topic modeling techniques could uncover deeper thematic patterns in user feedback, supporting data-driven strategies to improve customer satisfaction.
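The slang normalization and lexicon-based scoring described in the abstract can be illustrated with a minimal Python sketch. It is not the study's implementation. The three slang keys come from the abstract, but their formal equivalents, the sentiment lexicon, its weights, and the simple negation rule are all assumptions made for illustration.

```python
import re

# Slang keys are taken from the abstract; the formal equivalents are common
# Indonesian usage and are an assumption, not the study's actual lexicon.
SLANG_MAP = {"gak": "tidak", "aja": "saja", "banget": "sangat"}

# Hypothetical sentiment lexicon with illustrative weights.
LEXICON = {"mahal": -1, "lemot": -1, "bagus": 1, "lancar": 1}

# Hypothetical negation rule: a negator flips the polarity of the next token.
NEGATORS = {"tidak"}


def normalize(text):
    """Lowercase, tokenize on letters, and replace slang with standard forms."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [SLANG_MAP.get(tok, tok) for tok in tokens]


def lexicon_score(tokens):
    """Sum lexicon weights, flipping the sign of a token that follows a negator."""
    total = 0
    negate = False
    for tok in tokens:
        if tok in NEGATORS:
            negate = True
            continue
        weight = LEXICON.get(tok, 0)
        total += -weight if negate else weight
        negate = False
    return total


def label(score):
    """Map a numeric score to one of the three sentiment categories."""
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"


review = "Sinyal gak bagus, mahal banget"
tokens = normalize(review)
print(tokens, lexicon_score(tokens), label(lexicon_score(tokens)))
```

In this sketch, normalizing “gak” to “tidak” is what lets the negation rule fire. Without that step, “gak bagus” would be scored as if it were simply “bagus”.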
Keywords
Sentiment Analysis; Slang Normalization; Lexicon-Based Scoring; Machine Learning Models; Support Vector Machine (SVM); Random Forest; User Reviews Analysis
Journal of Applied Data Sciences
ISSN: 2723-6471 (Online)
Organized by: Computer Science and Systems Information Technology, King Abdulaziz University, Kingdom of Saudi Arabia
Website: http://bright-journal.org/JADS
Email: taqwa@amikompurwokerto.ac.id (principal contact), support@bright-journal.org (technical issues)
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0