Sentiment Analysis on Indonesian Stock Market Texts: A Comparative Study of Support Vector Machine (SVM) and IndoBERT
Information
Journal: Proceedings of 2025 IEEE International Conference on Data and Software Engineering, ICoDSE 2025
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 455-460
Publication Year: 2025
ISBN: 979-833157578-6
Source Type: Scopus
Abstract
Social media platforms have become key sources of real-time investor sentiment, influencing market movements in Indonesia's stock market. However, accurate sentiment extraction in Bahasa Indonesia remains challenging due to informal language and evolving financial terminology. This paper presents a comparative study of sentiment analysis models applied to Indonesian stock market tweets. We evaluate a traditional SVM model using TF-IDF features against two variants of the transformer-based IndoBERT model (base and large) on the ID-SMSA dataset. IndoBERT-base achieves the highest accuracy (97.72%), significantly outperforming SVM (90.22%). A novel Out-of-Vocabulary (OOV) evaluation systematically quantifies the impact of vocabulary coverage on model performance, revealing that IndoBERT's subword tokenization offers superior robustness to domain-specific vocabulary compared to TF-IDF-based approaches. These results demonstrate the effectiveness of contextual language models in financial sentiment classification tasks. © 2025 IEEE.
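
The abstract does not describe how the OOV evaluation is computed. As a rough illustration only, one common way to quantify vocabulary coverage for a word-level model such as TF-IDF is the share of test tokens that never appear in the training vocabulary. The sketch below assumes simple lowercased whitespace tokenization; the function names and the example sentences are hypothetical and are not taken from the paper or from the ID-SMSA dataset.

```python
def tokenize(text):
    # Assumption: lowercase + whitespace split stands in for a word-level
    # tokenizer like the one a TF-IDF vectorizer might use.
    return text.lower().split()


def oov_rate(train_texts, test_texts):
    """Fraction of test tokens that are absent from the training vocabulary."""
    vocab = {tok for text in train_texts for tok in tokenize(text)}
    test_tokens = [tok for text in test_texts for tok in tokenize(text)]
    if not test_tokens:
        return 0.0  # no tokens to evaluate, so nothing can be out of vocabulary
    unseen = sum(1 for tok in test_tokens if tok not in vocab)
    return unseen / len(test_tokens)


# Hypothetical tweets, invented for illustration.
train = ["saham bbca naik", "ihsg turun hari ini"]
test = ["saham goto naik", "ihsg cuan"]
rate = oov_rate(train, test)
print(f"OOV rate: {rate:.2f}")
```

A word-level model has no representation for the unseen tokens counted here, whereas a subword tokenizer can split them into known pieces, which is the kind of robustness difference the abstract reports.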
