Qualitative data analysis of disaster risk reduction suggestions assisted by topic modeling and word2vec

Cited by: 0

Authors:

Gorro, Ken [1]

Ancheta, Jeffrey Rosario [2]

Capao, Kris [1]

Oco, Nathaniel [2]

Roxas, Rachel Edita [2]

Sabellano, Mary Jane [1]

Nonnecke, Brandie [3]

Mohanty, Shrestha [3]

Crittenden, Camille [3]

Goldberg, Ken [3]

Affiliations:

[1] Univ San Carlos, Cebu, Philippines

[2] Natl Univ, Manila, Philippines

[3] Univ Calif Berkeley, Berkeley, CA 94720 USA

Source:

2017 International Conference on Asian Language Processing (IALP) | 2017

Keywords:

word embedding; biterm topic modeling; gensim; scikit learn; Malasakit toolkit; disaster risk reduction

DOI:

Not available

Chinese Library Classification (CLC):

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

In this study, we examine suggestions for disaster risk reduction strategies provided by residents of selected disaster-prone areas in the Philippines. The study uses 976 suggestions on how the residents' barangay (village) can help them better prepare for a disaster. These were collected through Malasakit, an e-participation platform designed by the University of California, Berkeley and National University (Philippines) to engage community participation in gathering qualitative and quantitative data. Analyses were conducted using biterm topic modeling (BTM) and word embedding with gensim. For better accuracy, the data were preprocessed to remove irrelevant or noisy entries. Based on the BTM result, we identified the following important codes: preparedness, disaster, awareness, community, help, seminars, kanal (canal), linisin (clean), drainage, garbage, basura (garbage). Analyses of the topic models show that disaster preparedness is an integral part of disaster risk reduction, through improving solid waste management and providing seminars for public awareness and evacuation preparation. A word intrusion test was conducted in which BTM scored 55.71%, which implies strong cohesion of the words with their topics. For word embedding, we drilled down on the following words: community, preparedness, emergency, barangay (village), help, kanal (drainage), basura (garbage), awareness, seminars, information. The word2vec results have a cosine similarity score of 0.902, which implies strong relatedness among the words. The results show that the participants give importance to community preparedness for emergencies, helping the barangay in clean-up drives, and awareness through seminars and information dissemination.
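The two quantities the abstract relies on can be illustrated with a minimal Python sketch: biterms (unordered word pairs co-occurring in one short text, the unit BTM models) and the cosine similarity used to compare word vectors. This is not the authors' pipeline, which the abstract says used gensim. The function names, the sample tokens, and the toy vectors below are all assumptions made for illustration, and only the standard library is used.

```python
from collections import Counter
from itertools import combinations
import math


def extract_biterms(tokens):
    """Return every unordered word pair (biterm) from one short text."""
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


# Hypothetical, already-preprocessed suggestions (not the study's data).
suggestions = [
    ["linisin", "kanal", "basura"],
    ["seminars", "awareness", "community"],
    ["linisin", "basura"],
]

# Corpus-wide biterm counts, the input a biterm topic model is fitted on.
biterm_counts = Counter(
    biterm for doc in suggestions for biterm in extract_biterms(doc)
)

# Toy 2-dimensional vectors standing in for learned word embeddings.
similarity = cosine_similarity([1.0, 2.0], [2.0, 4.0])
```

In a real run, the vectors would come from a trained embedding model rather than being written by hand, and the biterm counts would feed a topic model's inference step instead of being inspected directly.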


Pages: 293-297

Page count: 5
