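TY - JOUR
AU - Cole-Lewis, Heather
AU - Varghese, Arun
AU - Sanders, Amy
AU - Schwarz, Mary
AU - Pugatch, Jillian
AU - Augustson, Erik
PY - 2015
DA - 2015/08/25
TI - Assessing Electronic Cigarette-Related Tweets for Sentiment and Content Using Supervised Machine Learning
JO - J Med Internet Res
SP - e208
VL - 17
IS - 8
KW - social media
KW - Twitter
KW - e-cigarette
KW - machine learning
AB - Background: Electronic cigarettes (e-cigarettes) continue to be a growing topic among social media users, especially on Twitter. The ability to analyze conversations about e-cigarettes in real time can provide important insight into trends in the public's knowledge, attitudes, and beliefs surrounding e-cigarettes, and subsequently guide public health interventions. Objective: Our aim was to establish a supervised machine learning algorithm to build predictive classification models that assess Twitter data for a range of factors related to e-cigarettes. Methods: Manual content analysis was conducted for 17,098 tweets. These tweets were coded for five categories: e-cigarette relevance, sentiment, user description, genre, and theme. Machine learning classification models were then built for each of these five categories, and word groupings (n-grams) were used to define the feature space for each classifier. Results: Predictive performance scores for the classification models indicated that the models correctly labeled the tweets with the appropriate variables between 68.40% and 99.34% of the time, and the percentage of maximum possible improvement over a random baseline achieved by the classification models ranged from 41.59% to 80.62%. Classifiers with the highest performance scores that also achieved the highest percentage of the maximum possible improvement over a random baseline were Policy/Government (performance: 0.94; % improvement: 80.62%), Relevance (performance: 0.94; % improvement: 75.26%), Ad or Promotion (performance: 0.89; % improvement: 72.69%), and Marketing (performance: 0.91; % improvement: 72.56%). The most appropriate word-grouping unit (n-gram) was 1 for the majority of classifiers. Performance continued to marginally increase with the size of the training dataset of manually annotated data, but eventually leveled off. Even at low dataset sizes of 4000 observations, performance characteristics were fairly sound. Conclusions: Social media outlets like Twitter can uncover real-time snapshots of personal sentiment, knowledge, attitudes, and behavior that are not as accessible, at this scale, through any other offline platform. Using the vast data available through social media presents an opportunity for social science and public health methodologies to utilize computational methodologies to enhance and extend research and practice. This study was successful in automating a complex five-category manual content analysis of e-cigarette-related content on Twitter using machine learning techniques. The study details machine learning model specifications that provided the best accuracy for data related to e-cigarettes, as well as a replicable methodology to allow extension of these methods to additional topics.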
SN - 1438-8871
UR - //www.mybigtv.com/2015/8/e208/
UR - https://doi.org/10.2196/jmir.4392
UR - http://www.ncbi.nlm.nih.gov/pubmed/26307512
DO - 10.2196/jmir.4392
ID - info:doi/10.2196/jmir.4392
ER -