GenComUI: Exploring Generative Visual Aids as Medium to Support Task-Oriented Human-Robot Communication

Yate Ge, Meiying Li, Xipeng Huang, Yuanda Hu, Qi Wang, Xiaohua Sun, Weiwei Guo

CHI’25

Abstract

This work investigates the integration of generative visual aids in human-robot task communication. We developed GenComUI, a system powered by large language models (LLMs) that dynamically generates contextual visual aids (such as map annotations, path indicators, and animations) to support verbal task communication and to facilitate the generation of customized task programs for the robot. The system's design was informed by a formative study that examined how humans use external visual tools to assist verbal communication in spatial tasks. To evaluate its effectiveness, we conducted a user experiment (n = 20) comparing GenComUI with a voice-only baseline. Qualitative and quantitative analyses show that generative visual aids enhance verbal task communication by providing continuous visual feedback, which promotes natural and effective human-robot communication. The study also offers a set of design implications describing how dynamically generated visual aids can serve as an effective communication medium in human-robot interaction. These findings underscore the potential of generative visual aids to inform the design of more intuitive and effective human-robot communication, particularly in complex communication scenarios in human-robot interaction and LLM-based end-user development.

Link

https://dl.acm.org/doi/10.1145/3706598.3714238

Center for Digital Innovation
