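TY  - JOUR
AU  - Oloruntoba, Ayooluwatomiwa I
AU  - Vestergaard, Tine
AU  - Nguyen, Toan D
AU  - Yu, Zhen
AU  - Sashindranath, Maithili
AU  - Betz-Stablein, Brigid
AU  - Soyer, H Peter
AU  - Ge, Zongyuan
AU  - Mar, Victoria
PY  - 2022
DA  - 2022/9/12
TI  - Assessing the Generalizability of Deep Learning Models Trained on Standardized and Nonstandardized Images and Their Performance Against Teledermatologists: Retrospective Comparative Study
JO  - JMIR Dermatol
SP  - e35150
VL  - 5
IS  - 3
KW  - artificial intelligence
KW  - AI
KW  - convolutional neural network
KW  - CNN
KW  - teledermatology
KW  - standardized image
KW  - nonstandardized image
KW  - machine learning
KW  - skin cancer
KW  - cancer
AB  - Background: Convolutional neural networks (CNNs) are a type of artificial intelligence that shows promise as a diagnostic aid for skin cancer. However, the majority are trained using retrospective image data sets with varying image capture standardization. Objective: The aim of our study was to use CNN models with the same architecture, trained on image sets acquired with either the same image capture device and technique (standardized) or with varied devices and capture techniques (nonstandardized), and to test variability in performance when classifying skin cancer images in different populations. Methods: In all, 3 CNNs with the same architecture were trained. CNN nonstandardized (CNN-NS) was trained on 25,331 images taken from the International Skin Imaging Collaboration (ISIC) using different image capture devices. CNN standardized (CNN-S) was trained on 177,475 MoleMap images taken with the same capture device, and CNN standardized number 2 (CNN-S2) was trained on a subset of 25,331 standardized MoleMap images (matched for number and classes of training images to CNN-NS). These 3 models were then tested on 3 external test sets: 569 Danish images, the publicly available ISIC 2020 data set consisting of 33,126 images, and The University of Queensland (UQ) data set of 422 images. Primary outcome measures were sensitivity, specificity, and area under the receiver operating characteristic curve (AUROC). Teledermatology assessments available for the Danish data set were used to determine model performance compared to teledermatologists. Results: When tested on the 569 Danish images, CNN-S achieved an AUROC of 0.861 (95% CI 0.830-0.889) and CNN-S2 achieved an AUROC of 0.831 (95% CI 0.798-0.861; standardized models), with both outperforming CNN-NS (nonstandardized model; P=.001 and P=.009, respectively), which achieved an AUROC of 0.759 (95% CI 0.722-0.794). When tested on 2 additional data sets (ISIC 2020 and UQ), CNN-S (P<.001 and P<.001, respectively) and CNN-S2 (P=.08 and P=.35, respectively) still outperformed CNN-NS. When the CNNs were matched to the mean sensitivity and specificity of the teledermatologists on the Danish data set, the models’ resultant sensitivities and specificities were surpassed by the teledermatologists. However, when compared to CNN-S, the differences were not statistically significant (sensitivity: P=.10; specificity: P=.053). Performance across all CNN models as well as teledermatologists was influenced by image quality. Conclusions: CNNs trained on standardized images had improved performance and, therefore, greater generalizability in skin cancer classification when applied to unseen data sets. This finding is an important consideration for future algorithm development, regulation, and approval.
SN  - 2562-0959
UR  - https://derma.www.mybigtv.com/2022/3/e35150
UR  - https://doi.org/10.2196/35150
DO  - 10.2196/35150
ID  - info:doi/10.2196/35150
ER  - 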