Embodied Web Agents: Bridging Physical-Digital Realms for Integrated Agent Intelligence
Abstract
AI agents today are mostly siloed: they either retrieve and reason over vast amounts of digital information and knowledge obtained online, or interact with the physical world through embodied perception, planning, and action, but rarely both. This separation limits their ability to solve tasks that require integrated physical and digital intelligence, such as cooking from online recipes, navigating with dynamic map data, or interpreting real-world landmarks using web knowledge. We introduce Embodied Web Agents, a novel paradigm for AI agents that fluidly bridge embodiment and web-scale reasoning. To operationalize this concept, we first develop the Embodied Web Agents task environments, a unified simulation platform that integrates realistic 3D indoor and outdoor environments with functional web interfaces. Building upon this platform, we construct and release the Embodied Web Agents Benchmark, which encompasses a diverse suite of tasks including cooking, navigation, shopping, tourism, and geolocation, all requiring coordinated reasoning across physical and digital realms, for systematic assessment of cross-domain intelligence. Experimental results reveal significant performance gaps between state-of-the-art AI systems and human capabilities, establishing both challenges and opportunities at the intersection of embodied cognition and web-scale knowledge access.
Cite
Text
Hong et al. "Embodied Web Agents: Bridging Physical-Digital Realms for Integrated Agent Intelligence." Advances in Neural Information Processing Systems, 2025.
Markdown
[Hong et al. "Embodied Web Agents: Bridging Physical-Digital Realms for Integrated Agent Intelligence." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/hong2025neurips-embodied/)
BibTeX
@inproceedings{hong2025neurips-embodied,
title = {{Embodied Web Agents: Bridging Physical-Digital Realms for Integrated Agent Intelligence}},
author = {Hong, Yining and Sun, Rui and Li, Bingxuan and Yao, Xingcheng and Wu, Maxine and Chien, Alexander and Yin, Da and Wu, Ying Nian and Wang, Zhecan and Chang, Kai-Wei},
booktitle = {Advances in Neural Information Processing Systems},
year = {2025},
url = {https://mlanthology.org/neurips/2025/hong2025neurips-embodied/}
}