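%T Patient Representation From Structured Electronic Medical Records Based on Embedding Technique: Development and Validation Study %A Huang,Yanqun %A Wang,Ni %A Zhang,Zhiqiang %A Liu,Honglei %A Fei,Xiaolu %A Wei,Lan %A Chen,Hui %+ School of Biomedical Engineering, Capital Medical University, No 10, Xitoutiao, You'anmenwai, Fengtai District, Beijing, 100069, 86 1083911545, chenhui@ccmu.edu.cn %K electronic medical records %K Skip-gram %K feature representation %K patient representation %K stroke %D 2021 %7 23.7.2021 %9 Original Paper %J JMIR Med Inform %G English %X Background: The secondary use of structured electronic medical record (sEMR) data has become a challenge because of the diversity, sparsity, and high dimensionality of the data representation. Constructing an effective representation of sEMR data is becoming increasingly important for subsequent data applications. Objective: We aimed to apply the embedding technique used in the natural language processing domain to sEMR data representation and to explore the feasibility and superiority of embedding-based feature and patient representations in clinical applications. Methods: The entire training corpus consisted of the records of 104,752 hospitalized patients, with 13,757 medical concepts covering disease diagnoses, physical examinations and procedures, laboratory tests, medications, and so on. Each medical concept was embedded into a 200-dimensional real-valued vector using the Skip-gram algorithm, with some adaptive changes obtained by shuffling the medical concepts in a record 20 times. The average of the vectors of all medical concepts in a patient record represented the patient. To evaluate the embedding-based feature representation, we used the cosine similarities among the medical concept vectors to capture the latent clinical associations among the medical concepts. We further conducted a clustering analysis on stroke patients to evaluate and compare the embedding-based patient representations. The Hopkins statistic, Silhouette index (SI), and Davies-Bouldin index were used for the unsupervised evaluation, and the precision, recall, and F1 score were used for the supervised evaluation. Results: The dimension of patient representation was reduced from 13,757 to 200 using the embedding-based representation. The average cosine similarity of the selected disease (subarachnoid hemorrhage) and its 15 clinically relevant medical concepts was 0.973. Stroke patients were clustered into two clusters with the highest SI (0.852). Clustering analyses conducted on patients with the embedding representations showed higher applicability (Hopkins statistic 0.931), higher aggregation (SI 0.862), and lower dispersion (Davies-Bouldin index 0.551) than those conducted on patients with reference representation methods. The clustering solutions for patients with the embedding-based representation achieved the highest F1 scores of 0.944 and 0.717 for two clusters. Conclusions: The feature-level embedding-based representations can reflect the potential clinical associations among medical concepts effectively. The patient-level embedding-based representation is easy to use as continuous input to standard machine learning algorithms and can bring performance improvements. It is expected that the embedding-based representation will be helpful in a wide range of secondary uses of sEMR data.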
%M 34297000 %R 10.2196/19905 %U https://medinform.www.mybigtv.com/2021/7/e19905 %U https://doi.org/10.2196/19905 %U http://www.ncbi.nlm.nih.gov/pubmed/34297000