Center Research Output: An Improved Deep Learning Framework for Online Public Opinion Analysis, with Vaccination as a Case Study

2022-10-20 Administrator

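Research output: Revealing Public Opinion towards the COVID-19 Vaccine with Weibo Data in China: BertFDA-Based Model

Authors: Zhu Jianping (朱建平), Weng Futian (翁福添), Zhuang Muni (庄穆妮), Lü Xin (吕欣), Tan Xu (谭旭), Lin Songjie (林松杰), Zhang Ruoyu (张若煜)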

Journal: International Journal of Environmental Research and Public Health, 2022, 19(20), 13248, JCR Q1.

Original article: https://www.mdpi.com/1660-4601/19/20/13248

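Recently, a team from the National Institute for Data Science in Health and Medicine, the School of Medicine, the School of Management, and the Data Mining Research Center at Xiamen University, together with teams from the College of Systems Engineering at the National University of Defense Technology and the University of Washington (美国华盛顿大学), published a paper online in the International Journal of Environmental Research and Public Health titled "Revealing Public Opinion towards the COVID-19 Vaccine with Weibo Data in China: BertFDA-Based Model".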



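The COVID-19 pandemic has placed an unprecedented burden on people's health and subjective well-being. Although researchers in China and abroad have built various models to track and predict the affective states associated with COVID-19, few studies focus on the topics of public discussion and the evolution of sentiment toward the vaccine, and in particular on how the topics of concern differ between vaccine supporters and the vaccine-hesitant. Using two years of social media data following the COVID-19 outbreak, combined with state-of-the-art natural language processing techniques, the paper develops a new public opinion analysis framework (BertFDA). First, dynamic topic clustering with the latent Dirichlet allocation (LDA) model is applied to Weibo, generating 118 topics from 2,211,806 microblog posts over 24 months. Second, by building an improved Bert pre-training model for sentiment classification, the authors provide evidence that negative public sentiment declined continuously during the early stage of COVID-19 vaccination. Third, by modeling and analyzing the microblog posts of vaccine supporters and the vaccine-hesitant, they find that supporters pay more attention to vaccine effectiveness and objective news reporting, reflecting greater group cohesion, whereas the vaccine-hesitant are particularly concerned about the spread of coronavirus variants and vaccine side effects; the analysis also captures the features that distinguish the two groups. Finally, several machine learning models are developed to predict how public opinion evolves. In addition, the paper proposes, for the first time, using functional data analysis (FDA) to construct a functional sentiment curve, which effectively captures the dynamics of sentiment as an explicit function. The study can help governments design effective interventions and education campaigns to raise vaccination rates. The work was supported by a Major Project of the National Social Science Fund of China (20&ZD137). The full paper is available via the original article link above.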
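As a rough illustration of how a claim such as "negative sentiment declined" can be read off classified posts, the sketch below aggregates per-post sentiment labels into a monthly share of negative posts. It is not the paper's code: the function name `monthly_negative_share`, the label strings, and the toy posts are all hypothetical, and the labels stand in for the output of whatever sentiment classifier is used.

```python
from collections import defaultdict


def monthly_negative_share(posts):
    """Return {month: share of posts labelled 'negative'}, ordered by month.

    `posts` is an iterable of (iso_date, label) pairs. The label names are
    hypothetical stand-ins for the output of a sentiment classifier.
    """
    totals = defaultdict(int)
    negatives = defaultdict(int)
    for iso_date, label in posts:
        month = iso_date[:7]  # keep the "YYYY-MM" prefix of the ISO date
        totals[month] += 1
        if label == "negative":
            negatives[month] += 1
    return {m: negatives[m] / totals[m] for m in sorted(totals)}


# Toy, made-up labels purely for illustration (not data from the paper).
toy_posts = [
    ("2021-01-03", "negative"), ("2021-01-15", "positive"),
    ("2021-01-20", "negative"), ("2021-01-28", "neutral"),
    ("2021-02-02", "negative"), ("2021-02-11", "positive"),
    ("2021-02-19", "positive"), ("2021-02-25", "neutral"),
]
shares = monthly_negative_share(toy_posts)
print(shares)
```

Running the same aggregation separately on posts from each group would give one series per group, which is the kind of input a group comparison needs.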

Original Abstract

The COVID-19 pandemic has created unprecedented burdens on people’s health and subjective well-being. While countries around the world have established models to track and predict the affective states of COVID-19, identifying the topics of public discussion and sentiment evolution of the vaccine, particularly the differences in topics of concern between vaccine-support and vaccine-hesitant groups, remains scarce. Using social media data from the two years following the outbreak of COVID-19 (23 January 2020 to 23 January 2022), coupled with state-of-the-art natural language processing (NLP) techniques, we developed a public opinion analysis framework (BertFDA). First, using dynamic topic clustering on Weibo through the latent Dirichlet allocation (LDA) model, a total of 118 topics were generated in 24 months using 2,211,806 microblog posts. Second, by building an improved Bert pre-training model for sentiment classification, we provide evidence that public negative sentiment continued to decline in the early stages of COVID-19 vaccination. Third, by modeling and analyzing the microblog posts from the vaccine-support group and the vaccine-hesitant group, we discover that the vaccine-support group was more concerned about vaccine effectiveness and the reporting of news, reflecting greater group cohesion, whereas the vaccine-hesitant group was particularly concerned about the spread of coronavirus variants and vaccine side effects. Finally, we deployed different machine learning models to predict public opinion. Moreover, functional data analysis (FDA) is developed to build the functional sentiment curve, which can effectively capture the dynamic changes with the explicit function. This study can aid governments in developing effective interventions and education campaigns to boost vaccination rates.

Keywords: COVID-19 vaccine; sentiment analysis; topic model; Bert; functional data analysis
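The idea of a functional sentiment curve with an explicit function can be sketched as a least-squares fit of a basis expansion to a discrete sentiment series. The example below is a simplified stand-in, not the paper's method: it assumes a low-degree polynomial basis (FDA work more commonly uses B-spline or Fourier bases), and the helper names and the monthly values are made up for illustration.

```python
def fit_polynomial_curve(times, values, degree=2):
    """Least-squares fit of values ~ sum_k c_k * t**k; returns [c_0, ..., c_degree]."""
    n_basis = degree + 1
    # Normal equations (B^T B) c = B^T y for the monomial basis B[i][k] = t_i**k.
    gram = [[sum(t ** (j + k) for t in times) for k in range(n_basis)]
            for j in range(n_basis)]
    rhs = [sum(y * t ** j for t, y in zip(times, values)) for j in range(n_basis)]
    # Forward elimination with partial pivoting.
    for col in range(n_basis):
        pivot = max(range(col, n_basis), key=lambda r: abs(gram[r][col]))
        gram[col], gram[pivot] = gram[pivot], gram[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for row in range(col + 1, n_basis):
            factor = gram[row][col] / gram[col][col]
            for k in range(col, n_basis):
                gram[row][k] -= factor * gram[col][k]
            rhs[row] -= factor * rhs[col]
    # Back substitution.
    coeffs = [0.0] * n_basis
    for row in reversed(range(n_basis)):
        tail = sum(gram[row][k] * coeffs[k] for k in range(row + 1, n_basis))
        coeffs[row] = (rhs[row] - tail) / gram[row][row]
    return coeffs


def evaluate_curve(coeffs, t):
    """Value of the fitted curve at time t."""
    return sum(c * t ** k for k, c in enumerate(coeffs))


def curve_derivative(coeffs, t):
    """Rate of change of the fitted curve at time t."""
    return sum(k * c * t ** (k - 1) for k, c in enumerate(coeffs) if k > 0)


# Made-up monthly negative-sentiment shares, for illustration only.
months = [0, 1, 2, 3, 4, 5]
neg_share = [0.50, 0.44, 0.40, 0.37, 0.36, 0.35]
coeffs = fit_polynomial_curve(months, neg_share, degree=2)
print(coeffs, curve_derivative(coeffs, 1.0))
```

Because the fitted curve is an explicit function, it can be evaluated or differentiated at any time point, so the rate of change of sentiment is available between observations as well as at them.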



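Managing editor | Zheng Chenlu (郑陈璐)
Text and image editors | Weng Futian (翁福添), Zhuang Muni (庄穆妮)
Layout editors | Ma Maoqi (马茂淇), Wu Xiaolong (吴小龙)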

Data Mining Research Center, Xiamen University
October 20, 2022


Official website: http://xdmrc.org/
Sina Weibo: 厦门大学数据挖掘研究中心