Classification of Honorific Levels in Javanese: Comparison Between Rule-Based, Classical Machine Learning and Transformer-Based Methods

Authors: Amin, Iqbal Pahlevi; Yuliawati, Arlisa; Alfina, Ika

Information

Journal: Proceedings of 2025 International Conference on Asian Language Processing, IALP 2025

Publisher: Institute of Electrical and Electronics Engineers Inc., IEEE

Pages: 170-175

Publication Year: 2025

DOI: 10.1109/IALP68296.2024.11156590

ISBN: 979-833158979-0

Source Type: Scopus

Abstract

We study the classification of honorific levels in Javanese, a regional language of Indonesia. The honorific levels are generally divided into Ngoko (the casual register) and Krama (the formal or polite register). We compare the performance of rule-based, classical machine learning (Logistic Regression, Gaussian Naive Bayes, SVM, Random Forest, CatBoost), and Transformer-based methods (BERT with Word2Vec/fastText word embeddings). We also built a new dataset of 979 manually annotated sentences. This dataset exhibits severe class imbalance (a Ngoko:Krama ratio of 1:10.125), which we addressed with two oversampling techniques, SMOTE and Polynom-fit-SMOTE. The experimental results show that BERT achieved the highest performance, with 98.40% precision, 93.36% recall, and 95.30% F1-score, significantly outperforming the other methods. This work demonstrates the effectiveness of language-specific pretraining for capturing Javanese honorific-level nuances. © 2025 IEEE.
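The oversampling step mentioned in the abstract can be illustrated with a minimal sketch of the basic SMOTE idea: a synthetic minority sample is created by interpolating between a minority sample and one of its nearest minority neighbours. The function name `smote_sketch`, the parameter `k`, and the toy 2-D feature vectors are hypothetical and chosen only for illustration; the abstract does not state the paper's features or implementation, and Polynom-fit-SMOTE is not shown.

```python
import math
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (basic SMOTE idea)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Made-up 2-D vectors standing in for minority-class (Ngoko) sentence features.
ngoko_features = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_sketch(ngoko_features, n_new=5)
```

Because each synthetic point lies on a line segment between two existing minority samples, the new points stay inside the region the minority class already occupies rather than being exact duplicates of existing samples.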
