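TY  - JOUR
AU  - Caskey, John
AU  - McConnell, Iain L
AU  - Oguss, Madeline
AU  - Dligach, Dmitriy
AU  - Kulikoff, Rachel
AU  - Grogan, Brittany
AU  - Gibson, Crystal
AU  - Wimmer, Elizabeth
AU  - DeSalvo, Traci E
AU  - Nyakoe-Nyasani, Edwin E
AU  - Churpek, Matthew M
AU  - Afshar, Majid
PY  - 2022
DA  - 2022/3/8
TI  - Identifying COVID-19 Outbreaks From Contact-Tracing Interview Forms for Public Health Departments: Development of a Natural Language Processing Pipeline
JO  - JMIR Public Health Surveill
SP  - e36119
VL  - 8
IS  - 3
KW  - natural language processing
KW  - public health informatics
KW  - named entity recognition
KW  - contact tracing
KW  - COVID-19
KW  - outbreaks
KW  - neural language model
KW  - disease surveillance
KW  - digital health
KW  - electronic surveillance
KW  - public health
KW  - digital surveillance tool
AB  - Background: In Wisconsin, COVID-19 case interview forms contain free-text fields that need to be mined to identify potential outbreaks for targeted policy making. We developed an automated pipeline that ingests the free text into a pretrained neural language model to identify businesses and facilities. Objective: We aimed to examine the precision and recall of our natural language processing pipeline against existing outbreaks and potential new clusters. Methods: Data on COVID-19 cases were extracted from the Wisconsin Electronic Disease Surveillance System (WEDSS) for Dane County between July 1, 2020, and June 30, 2021. Features from the case interview forms were fed into a Bidirectional Encoder Representations from Transformers (BERT) model that was fine-tuned for named entity recognition (NER). We also developed a novel location-mapping tool to provide addresses for relevant NER. Precision and recall were measured against manually verified outbreaks and valid addresses in WEDSS. Results: There were 46,798 cases of COVID-19, with 4,183,273 total BERT tokens and 15,051 unique tokens. The recall and precision of the NER tool were 0.67 (95% CI 0.66-0.68) and 0.55 (95% CI 0.54-0.57), respectively. For the location-mapping tool, the recall and precision were 0.93 (95% CI 0.92-0.95) and 0.93 (95% CI 0.92-0.95), respectively. Across monthly intervals, the NER tool identified more potential clusters than were verified in WEDSS. Conclusions: We developed a novel pipeline of tools that identified existing outbreaks and novel clusters with associated addresses. Our pipeline ingests data from a statewide database and may be deployed to assist local health departments for targeted interventions.
SN  - 2369-2960
UR  - https://publichealth.www.mybigtv.com/2022/3/e36119
UR  - https://doi.org/10.2196/36119
UR  - http://www.ncbi.nlm.nih.gov/pubmed/35144241
DO  - 10.2196/36119
ID  - info:doi/10.2196/36119
ER  -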