Visual question answering for Monas tourism object using deep learning

Authors: Siregar, Ahmad Hasan; Chahyati, Dina
Information
Journal: 2020 International Conference on Advanced Computer Science and Information Systems, ICACSIS 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 381-386
Publication Year: 2020
ISBN: 978-172819279-6
Source Type: Scopus
Citations
Scopus: 2
Abstract
Visual Question Answering (VQA) is a machine learning task: given an image and a natural-language question about that image, the system must answer the question. No public VQA dataset is currently known to be available in Bahasa Indonesia. To address this gap, this research compiles a Monas VQA dataset whose questions are written in Bahasa Indonesia and whose images all depict Monas, a memorial monument for Indonesians, as the specific image context. This research also proposes methods to solve VQA that combine a CNN for image embedding; NLP techniques for sentence embedding, e.g. Bag-of-Words, fastText, BERT, and BiLSTM; and multimodal machine learning that lets the two embeddings interact. The best-performing model achieves 68.39% accuracy, and an analysis of the architecture's impact is presented. © 2020 IEEE.
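The pipeline the abstract outlines (image embedding, question embedding, multimodal interaction, answer prediction) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' model: a precomputed feature vector stands in for the CNN output, the question is embedded with a plain bag-of-words count, and the two are fused with an element-wise product, a fusion choice the abstract does not specify. The vocabulary, answer set, weight matrices, and all function names are hypothetical.

```python
import math

# Hypothetical toy vocabulary and answer set; not taken from the Monas VQA dataset.
VOCAB = ["apa", "warna", "monas", "berapa", "orang"]
ANSWERS = ["putih", "dua", "ya"]


def bag_of_words(question, vocab=VOCAB):
    """Count how often each vocabulary word occurs in the question."""
    tokens = question.lower().replace("?", " ").split()
    return [float(tokens.count(word)) for word in vocab]


def linear(weights, vector):
    """Multiply a row-major weight matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def fuse(image_vec, question_vec):
    """Element-wise product: one simple way to let the two modalities interact.

    Assumption: the abstract does not say which fusion the authors used.
    """
    return [i * q for i, q in zip(image_vec, question_vec)]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def answer(image_features, question, w_img, w_txt, w_out):
    """Project both inputs to a shared space, fuse them, and classify."""
    img = linear(w_img, image_features)           # stand-in for a CNN embedding
    txt = linear(w_txt, bag_of_words(question))   # bag-of-words sentence embedding
    probs = softmax(linear(w_out, fuse(img, txt)))
    best = max(range(len(ANSWERS)), key=probs.__getitem__)
    return ANSWERS[best], probs
```

In a trained system the three weight matrices would be learned from question-image-answer triples, and the bag-of-words step could be swapped for any of the other sentence embeddings the abstract names.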