@Article{info:doi/10.2196/18301,
author="Abd-Alrazaq, Alaa and Safi, Zeineb and Alajlani, Mohannad and Warren, Jim and Househ, Mowafa and Denecke, Kerstin",
title="Technical Metrics Used to Evaluate Health Care Chatbots: Scoping Review",
journal="J Med Internet Res",
year="2020",
month="Jun",
day="5",
volume="22",
number="6",
pages="e18301",
keywords="chatbots; conversational agents; health care; evaluation",
abstract="Background: Dialog agents (chatbots) have a long history of application in health care, where they have been used for tasks such as supporting patient self-management and providing counseling. Their use is expected to grow with increasing demands on health systems and improving artificial intelligence (AI) capability. Approaches to the evaluation of health care chatbots, however, appear to be diverse and haphazard, resulting in a potential barrier to the advancement of the field. Objective: This study aims to identify the technical (nonclinical) metrics used by previous studies to evaluate health care chatbots. Methods: Studies were identified by searching 7 bibliographic databases (eg, MEDLINE and PsycINFO) in addition to conducting backward and forward reference list checking of the included studies and relevant reviews. The studies were independently selected by two reviewers who then extracted data from the included studies. Extracted data were synthesized narratively by grouping the identified metrics into categories based on the aspect of chatbots that the metrics evaluated. Results: Of the 1498 citations retrieved, 65 studies were included in this review. Chatbots were evaluated using 27 technical metrics, which were related to chatbots as a whole (eg, usability, classifier performance, speed), response generation (eg, comprehensibility, realism, repetitiveness), response understanding (eg, chatbot understanding as assessed by users, word error rate, concept error rate), and esthetics (eg, appearance of the virtual agent, background color, and content). Conclusions: The technical metrics of health chatbot studies were diverse, with survey designs and global usability metrics dominating. The lack of standardization and paucity of objective measures make it difficult to compare the performance of health chatbots and could inhibit advancement of the field. We suggest that researchers more frequently include metrics computed from conversation logs. In addition, we recommend the development of a framework of technical metrics with recommendations for specific circumstances for their inclusion in chatbot studies.",
issn="1438-8871",
doi="10.2196/18301",
url="//www.mybigtv.com/2020/6/e18301/",
url="https://doi.org/10.2196/18301",
url="http://www.ncbi.nlm.nih.gov/pubmed/32442157"
}