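TY  - JOUR
AU  - Pal, Ridam
AU  - Chopra, Harshita
AU  - Awasthi, Raghav
AU  - Bandhey, Harsh
AU  - Nagori, Aditya
AU  - Sethi, Tavpritesh
PY  - 2022
DA  - 2022/11/2
TI  - Predicting Emerging Themes in Rapidly Expanding COVID-19 Literature With Unsupervised Word Embeddings and Machine Learning: Evidence-Based Study
JO  - J Med Internet Res
SP  - e34067
VL  - 24
IS  - 11
KW  - COVID-19
KW  - named entity recognition
KW  - unsupervised word embeddings
KW  - machine learning
KW  - natural language preprocessing
AB  - Background: Evidence from peer-reviewed literature is the cornerstone for designing responses to global threats such as COVID-19. In massive and rapidly growing corpora, such as COVID-19 publications, assimilating and synthesizing information is challenging. Leveraging a robust computational pipeline that evaluates multiple aspects, such as network topological features, communities, and their temporal trends, can make this process more efficient. Objective: We aimed to show that new knowledge can be captured and tracked using temporal changes in the underlying unsupervised word embeddings of the literature. Machine learning on the evolving associations between words can further predict imminent themes. Methods: Frequently occurring medical entities were extracted from the abstracts of more than 150,000 COVID-19 articles collected monthly from the World Health Organization database, starting in February 2020. Word embeddings trained on each month's literature were used to construct networks of entities with cosine similarities as edge weights. Topological features of the subsequent month's network were forecast based on prior patterns, and new links were predicted using supervised machine learning. Community detection and alluvial diagrams were used to track biomedical themes that evolved over the months. Results: We found that thromboembolic complications were detected as an emerging theme as early as August 2020. A shift toward the symptoms of long COVID complications was observed during March 2021, and neurological complications gained significance in June 2021. A prospective validation of the link prediction models achieved an area under the receiver operating characteristic curve of 0.87. Predictive modeling revealed predisposing conditions, symptoms, cross-infection, and neurological complications as dominant research themes in COVID-19 publications based on the patterns observed in previous months. Conclusions: Machine learning–based prediction of emerging links can contribute toward steering research by capturing themes represented by groups of medical entities, based on patterns of semantic relationships over time.
SN  - 1438-8871
UR  - //www.mybigtv.com/2022/11/e34067
UR  - https://doi.org/10.2196/34067
UR  - http://www.ncbi.nlm.nih.gov/pubmed/36040993
DO  - 10.2196/34067
ID  - info:doi/10.2196/34067
ER  - 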