AgentSynth: Scalable Task Generation for Generalist Computer-Use Agents

Abstract

We introduce AgentSynth, a scalable and cost-efficient pipeline for automatically synthesizing high-quality tasks and trajectory datasets for generalist computer-use agents. Leveraging information asymmetry, AgentSynth constructs subtasks that are simple during generation but significantly more challenging when composed into long-horizon tasks, enabling the creation of over 6,000 diverse and realistic tasks. A key strength of AgentSynth is its ability to precisely modulate task complexity by varying the number of subtasks. Empirical evaluations show that state-of-the-art LLM agents suffer a steep performance drop, from 18% success at difficulty level 1 to just 4% at level 6, highlighting the benchmark's difficulty and discriminative power. Moreover, our pipeline achieves a low average cost of $0.60 per trajectory, orders of magnitude cheaper than human annotations. Our code and data are available at https://github.com/sunblaze-ucb/AgentSynth

Cite

Text

Xie et al. "AgentSynth: Scalable Task Generation for Generalist Computer-Use Agents." International Conference on Learning Representations, 2026.

Markdown

[Xie et al. "AgentSynth: Scalable Task Generation for Generalist Computer-Use Agents." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/xie2026iclr-agentsynth/)

BibTeX

@inproceedings{xie2026iclr-agentsynth,
  title     = {{AgentSynth: Scalable Task Generation for Generalist Computer-Use Agents}},
  author    = {Xie, Jingxu and Xu, Dylan and Zhao, Xuandong and Song, Dawn},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/xie2026iclr-agentsynth/}
}