@Article{info:doi/10.2196/40102,
author="Meaney, Christopher and Escobar, Michael and Stukel, Therese A and Austin, Peter C and Jaakkimainen, Liisa",
title="Comparison of Methods for Estimating Temporal Topic Models From Primary Care Clinical Text Data: Retrospective Closed Cohort Study",
journal="JMIR Med Inform",
year="2022",
month="12",
day="19",
volume="10",
number="12",
pages="e40102",
keywords="clinical text data; temporal topic model; nonnegative matrix factorization; latent Dirichlet allocation; structural topic model; BERTopic",
abstract="Background: Health care organizations are collecting increasing volumes of clinical text data. Topic models are a class of unsupervised machine learning algorithms for discovering latent thematic patterns in these large unstructured document collections. Objective: We aimed to comparatively evaluate several methods for estimating temporal topic models using clinical notes obtained from primary care electronic medical records from Ontario, Canada. Methods: We used a retrospective closed cohort design. The study spanned from January 01, 2011, through December 31, 2015, discretized into 20 quarterly periods. Patients were included in the study if they generated at least 1 primary care clinical note in each of the 20 quarterly periods. These patients represented a unique cohort of individuals engaging in high-frequency use of the primary care system. The following temporal topic modeling algorithms were fitted to the clinical note corpus: nonnegative matrix factorization, latent Dirichlet allocation, the structural topic model, and the BERTopic model. Results: Temporal topic models consistently identified latent topical patterns in the clinical note corpus. The learned topical bases identified meaningful activities conducted by the primary health care system. Latent topics displaying near-constant temporal dynamics were consistently estimated across models (eg, pain, hypertension, diabetes, sleep, mood, anxiety, and depression). Several topics displayed predictable seasonal patterns over the study period (eg, respiratory disease and influenza immunization programs). Conclusions: Nonnegative matrix factorization, latent Dirichlet allocation, structural topic model, and BERTopic are based on different underlying statistical frameworks (eg, linear algebra and optimization, Bayesian graphical models, and neural embeddings), require tuning unique hyperparameters (optimizers, priors, etc), and have distinct computational requirements (data structures, computational hardware, etc). Despite the heterogeneity in statistical methodology, the learned latent topical summarizations and their temporal evolution over the study period were consistently estimated. Temporal topic models represent an interesting class of models for characterizing and monitoring the primary health care system.",
issn="2291-9694",
doi="10.2196/40102",
url="https://medinform.www.mybigtv.com/2022/12/e40102",
url="https://doi.org/10.2196/40102",
url="http://www.ncbi.nlm.nih.gov/pubmed/36534443"
}