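%I JMIR Publications %V 10 %N 8 %P e37862 %T Search Term Identification Methods for Computational Health Communication: Word Embedding and Network Approach for Health Content on YouTube %A Tong,Chau %A Margolin,Drew %A Chunara,Rumi %A Niederdeppe,Jeff %A Taylor,Teairah %A Dunbar,Natalie %A King,Andy J %+ Department of Communication, Cornell University, 494 Mann Library, Ithaca, NY, 14850, United States, 1 608 334 9909, ctt39@cornell.edu %K health information retrieval %K search term identification %K social media %K health communication %K public health %K computational textual analysis %K natural language processing %K NLP %K word2vec %K word embeddings %K network analysis %D 2022 %7 30.8.2022 %9 Original Paper %J JMIR Med Inform %G English %X Background: Common methods for extracting content in health communication research typically involve using a set of well-established queries, often names of medical procedures or diseases, that are often technical or rarely used in public discussion of health topics. Although these methods produce high recall (ie, retrieve highly relevant content), they tend to overlook health messages on social media that feature colloquial language and layperson vocabularies. Given that such messages could contain misinformation or obscure content that circumvents official medical concepts, correctly identifying (and analyzing) them is crucial to the study of user-generated health content on social media platforms. Objective: Health communication scholars would benefit from a retrieval process that goes beyond the use of standard terminologies as search queries. Motivated by this, this study aims to put forward a search term identification method to improve the retrieval of user-generated health content on social media. We focused on cancer screening tests as a subject and YouTube as a platform case study. Methods: We retrieved YouTube videos using cancer screening procedures (colonoscopy, fecal occult blood test, mammogram, and Pap test) as seed queries. We then trained word embedding models using text features from these videos to identify the nearest neighbor terms that are semantically similar to cancer screening tests in colloquial language. Retrieving more YouTube videos from the top neighbor terms, we coded a sample of 150 random videos from each term for relevance. We then used text mining to examine the new content retrieved from these videos and network analysis to inspect the relations between the newly retrieved videos and videos from the seed queries. Results: The top terms with semantic similarities to cancer screening tests were identified via word embedding models. Text mining analysis showed that the 5 nearest neighbor terms retrieved content that was novel and contextually diverse, beyond the content retrieved from cancer screening concepts alone. Results from network analysis showed that the newly retrieved videos had at least one total degree of connection (sum of indegree and outdegree) with seed videos according to YouTube relatedness measures.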
Conclusions: We demonstrated a retrieval technique to improve recall and minimize precision loss, which can be extended to various health topics on YouTube, a popular video-sharing social media platform. We discussed how health communication scholars can apply the technique to inspect the performance of the retrieval strategy before investing human coding resources and outlined suggestions on how such a technique can be extended to other health contexts. %M 36040760 %R 10.2196/37862 %U https://medinform.jmir.org/2022/8/e37862 %U https://doi.org/10.2196/37862 %U http://www.ncbi.nlm.nih.gov/pubmed/36040760