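Technical Metrics Used to Evaluate Health Care Chatbots: Scoping Review
%A Abd-Alrazaq,Alaa
%A Safi,Zeineb
%A Alajlani,Mohannad
%A Warren,Jim
%A Househ,Mowafa
%A Denecke,Kerstin
%+ Institute for Medical Informatics, Bern University of Applied Sciences, Quellgasse 21, 2502 Biel, Bern, Switzerland, 41 76 409 97 61, kerstin.denecke@bfh.ch
%K chatbots
%K conversational agents
%K health care
%K evaluation
%K metrics
%D 2020
%7 5.6.2020
%9 Review
%J J Med Internet Res
%G English
%X Background: Dialog agents (chatbots) have a long history of application in health care, where they have been used for tasks such as supporting patient self-management and providing counseling. Their use is expected to grow as demands on health systems increase and artificial intelligence (AI) capabilities improve. However, approaches to evaluating health care chatbots appear to be diverse and haphazard, which creates a potential barrier to the advancement of the field. Objective: This study aims to identify the technical (nonclinical) metrics used in previous studies to evaluate health care chatbots. Methods: Studies were identified by searching 7 bibliographic databases (eg, MEDLINE and PsycINFO) and by backward and forward reference list checking of the included studies and relevant reviews. Two reviewers independently selected the studies and then extracted data from the included studies. The extracted data were synthesized narratively by grouping the identified metrics into categories based on the aspect of the chatbot that each metric evaluated. Results: Of the 1498 citations retrieved, 65 studies were included in this review. Chatbots were evaluated using 27 technical metrics, which were related to chatbots as a whole (eg, usability, classifier performance, speed), response generation (eg, comprehensibility, realism, repetitiveness), response understanding (eg, chatbot understanding as assessed by users, word error rate, concept error rate), and esthetics (eg, appearance of the virtual agent, background color, and content). Conclusions: The technical metrics of health chatbot studies were diverse, with survey designs and global usability metrics dominating. The lack of standardization and paucity of objective measures make it difficult to compare the performance of health chatbots and could inhibit advancement of the field. We suggest that researchers more frequently include metrics computed from conversation logs. In addition, we recommend the development of a framework of technical metrics with recommendations for specific circumstances for their inclusion in chatbot studies.
%M 32442157
%R 10.2196/18301
%U //www.mybigtv.com/2020/6/e18301/
%U https://doi.org/10.2196/18301
%U http://www.ncbi.nlm.nih.gov/pubmed/32442157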