TY  - JOUR
AU  - Horne, Elsie
AU  - Tibble, Holly
AU  - Sheikh, Aziz
AU  - Tsanas, Athanasios
PY  - 2020
DA  - 2020/05/28
TI  - Challenges of Clustering Multimodal Clinical Data: Review of Applications in Asthma Subtyping
JO  - JMIR Med Inform
SP  - e16452
VL  - 8
IS  - 5
KW  - asthma
KW  - cluster analysis
KW  - data mining
KW  - machine learning
KW  - unsupervised machine learning
AB  - Background: In the current era of personalized medicine, there is increasing interest in understanding the heterogeneity in disease populations. Cluster analysis is a method commonly used to identify subtypes in heterogeneous disease populations. The clinical data used in such applications are typically multimodal, which can make the application of traditional cluster analysis methods challenging. Objective: This study aimed to review the research literature on the application of clustering multimodal clinical data to identify asthma subtypes. We assessed common problems and shortcomings in the application of cluster analysis methods in determining asthma subtypes, so that they can be brought to the attention of the research community and avoided in future studies. Methods: We searched the PubMed and Scopus bibliographic databases with terms related to cluster analysis and asthma to identify studies that applied dissimilarity-based cluster analysis methods. We recorded the analytic methods used in each study at each step of the cluster analysis process. Results: Our literature search identified 63 studies that applied cluster analysis to multimodal clinical data to identify asthma subtypes. The features fed into the clustering algorithms were of mixed type in 47 (75%) studies and continuous in 12 (19%) studies, and the feature type was unclear in the remaining 4 (6%) studies. A total of 23 (37%) studies used hierarchical clustering with Ward linkage, and 22 (35%) studies used k-means clustering. Of these 45 studies, 39 had mixed-type features, but only 5 specified dissimilarity measures that could handle mixed-type features. A further 9 (14%) studies used a preclustering step to create small clusters to feed into a hierarchical method. The original sample sizes in these 9 studies ranged from 84 to 349. The remaining studies used hierarchical clustering with other linkages (n=3), medoid-based methods (n=3), spectral clustering (n=1), and multiple kernel k-means clustering (n=1), and in 1 study, the methods were unclear. Of 63 studies, 54 (86%) explained the methods used to determine the number of clusters, 24 (38%) studies tested the quality of their cluster solution, and 11 (17%) studies tested the stability of their solution. Reporting of the cluster analysis was generally poor in terms of the methods employed and their justification. Conclusions: This review highlights common issues in the application of cluster analysis to multimodal clinical data to identify asthma subtypes. Some of these issues were related to the multimodal nature of the data, but many were more general issues in the application of cluster analysis. Although cluster analysis may be a useful tool for investigating disease subtypes, we recommend that future studies carefully consider the implications of clustering multimodal data, the cluster analysis process itself, and the reporting of methods to facilitate replication and interpretation of findings.
SN  - 2291-9694
UR  - http://medinform.www.mybigtv.com/2020/5/e16452/
UR  - https://doi.org/10.2196/16452
UR  - http://www.ncbi.nlm.nih.gov/pubmed/32463370
DO  - 10.2196/16452
ID  - info:doi/10.2196/16452
ER  - 