LLM-Powered GUI Agents in Phone Automation: Surveying Progress and Prospects

Abstract

With the rapid rise of large language models (LLMs), phone automation has undergone transformative changes. This paper systematically reviews LLM-driven phone GUI agents, highlighting their evolution from script-based automation to intelligent, adaptive systems. We first contextualize three key challenges, namely (i) limited generality, (ii) high maintenance overhead, and (iii) weak intent comprehension, and show how LLMs address these issues through advanced language understanding, multimodal perception, and robust decision-making. We then propose a taxonomy covering fundamental agent frameworks (single-agent, multi-agent, plan-then-act), modeling approaches (prompt engineering, training-based), and essential datasets and benchmarks. Furthermore, we detail task-specific architectures, supervised fine-tuning, and reinforcement learning strategies that bridge user intent and GUI operations. Finally, we discuss open challenges such as dataset diversity, on-device deployment efficiency, user-centric adaptation, and security concerns, offering forward-looking insights into this rapidly evolving field. By providing a structured overview and identifying pressing research gaps, this paper serves as a definitive reference for researchers and practitioners seeking to harness LLMs in designing scalable, user-friendly phone GUI agents. The collection of papers reviewed in this survey will be hosted and regularly updated on the GitHub repository: https://github.com/PhoneLLM/Awesome-LLM-Powered-Phone-GUI-Agents

Cite

Text

Liu et al. "LLM-Powered GUI Agents in Phone Automation: Surveying Progress and Prospects." Transactions on Machine Learning Research, 2025.

Markdown

[Liu et al. "LLM-Powered GUI Agents in Phone Automation: Surveying Progress and Prospects." Transactions on Machine Learning Research, 2025.](https://mlanthology.org/tmlr/2025/liu2025tmlr-llmpowered/)

BibTeX

@article{liu2025tmlr-llmpowered,
  title     = {{LLM-Powered GUI Agents in Phone Automation: Surveying Progress and Prospects}},
  author    = {Liu, Guangyi and Zhao, Pengxiang and Liang, Yaozhen and Liu, Liang and Guo, Yaxuan and Xiao, Han and Lin, Weifeng and Chai, Yuxiang and Han, Yue and Ren, Shuai and Wang, Hao and Liang, Xiaoyu and Wang, WenHao and Wu, Tianze and Lu, Zhengxi and Chen, Siheng and Li, Linghao and Wang, Hao and Xiong, Guanjing and Liu, Yong and Li, Hongsheng},
  journal   = {Transactions on Machine Learning Research},
  year      = {2025},
  url       = {https://mlanthology.org/tmlr/2025/liu2025tmlr-llmpowered/}
}