Disease Concept-Embedding Based on the Self-Supervised Method for Medical Information Extraction from Electronic Health Records and Disease Retrieval: Algorithm Development and Validation Study [陈a,罗艳萍,李元勋,黄飞培,建华]
%+ Department of Emergency Medicine, National Taiwan University Hospital, No. 7, Zhongshan South Road, Zhongzheng District, Taipei, Taiwan, 886 2 2312 3456, chhuang5940@ntu.edu.tw
%K electronic health record
%K EHR
%K disease embedding
%K disease retrieval
%K emergency department
%K concept
%K extraction
%K deep learning
%K machine learning
%K natural language processing
%K NLP
%D 2021
%7 27.1.2021
%9 Original Paper
%J J Med Internet Res
%G English
%X Background: The electronic health record (EHR) contains a wealth of medical information. A well-organized EHR can greatly help doctors treat patients. In some cases, only limited patient information is collected to help doctors make treatment decisions. Because EHRs can serve as a reference for this limited information, they can enhance doctors' treatment capabilities. Natural language processing and deep learning methods can help organize EHR information and translate it into medical knowledge and experience. Objective: In this study, we aimed to create a model to extract concept embeddings from EHRs for disease pattern retrieval and further classification tasks. Methods: We collected 1,040,989 emergency department visits from the National Taiwan University Hospital Integrated Medical Database and 305,897 samples from the National Hospital and Ambulatory Medical Care Survey Emergency Department data. After data cleaning and preprocessing, the data sets were divided into training, validation, and test sets. We proposed a Transformer-based model to embed EHRs and used Bidirectional Encoder Representations from Transformers (BERT) to extract features from free text, concatenating these features with structural data as input to our proposed model. Then, Deep InfoMax (DIM) and Simple Contrastive Learning of Visual Representations (SimCLR) were used for the unsupervised embedding of the disease concept. The pretrained disease concept-embedding model, named EDisease, was further fine-tuned to adapt to the critical care outcome prediction task. We evaluated the embedding using t-distributed stochastic neighbor embedding (t-SNE) to perform dimension reduction for visualization. The performance of the fine-tuned predictive model was evaluated against published models using the area under the receiver operating characteristic curve (AUROC). Results: Our model achieved the highest AUROC on outcome prediction, 0.876. In the ablation study, using a smaller data set or fewer unsupervised methods for pretraining degraded prediction performance. The AUROCs were 0.857, 0.870, and 0.868 for the model without pretraining, the model pretrained by only SimCLR, and the model pretrained by only DIM, respectively. On the smaller fine-tuning set, the AUROC was 0.815 for the proposed model. Conclusions: Through contrastive learning methods, disease concepts can be embedded meaningfully. Moreover, these methods can be used for disease retrieval tasks to enhance clinical practice capabilities. The disease concept model is also suitable as a pretrained model for subsequent prediction tasks.
%M 33502324
%R 10.2196/25113
%U //www.mybigtv.com/2021/1/e25113/
%U https://doi.org/10.2196/25113
%U http://www.ncbi.nlm.nih.gov/pubmed/33502324