DOPPLER: Dual-Policy Learning for Device Assignment in Asynchronous Dataflow Graphs
Abstract
We study the problem of assigning operations in a dataflow graph to devices to minimize execution time in a work-conserving system, with emphasis on complex machine learning workloads. Prior learning-based approaches face three limitations: (1) they rely on bulk-synchronous frameworks that under-utilize devices, (2) they learn a single placement policy without modeling the system dynamics, and (3) they depend solely on reinforcement learning during pre-training while ignoring optimization during deployment. We propose Doppler, a three-stage framework with two policies: $\mathsf{SEL}$ for selecting operations and $\mathsf{PLC}$ for placing them on devices. Doppler consistently outperforms baselines by reducing execution time and improving sampling efficiency through faster per-episode training. Our results show that Doppler achieves up to 52.7\% lower execution times than the best baseline. The code is available at https://github.com/xinyuyao/Doppler.
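To make the select-then-place structure concrete, the following is a minimal sketch and not the Doppler implementation. It assumes a toy cost model (a fixed per-operation cost and a fixed transfer delay for cross-device edges) and replaces the learned $\mathsf{SEL}$ and $\mathsf{PLC}$ policies with hand-written heuristics. The names `assign`, `sel_heaviest`, and `plc_earliest` are hypothetical and do not come from the paper or its repository.

```python
from typing import Callable, Dict, List, Tuple

# Toy dataflow graph: each operation maps to the list of its predecessors.
Graph = Dict[str, List[str]]


def sel_heaviest(ready: List[str], cost: Dict[str, float]) -> str:
    """Stand-in for a SEL policy: pick the most expensive ready operation."""
    return max(ready, key=lambda v: cost[v])


def plc_earliest(devices: range, est: Callable[[int], float]) -> int:
    """Stand-in for a PLC policy: pick the device with the earliest start."""
    return min(devices, key=est)


def assign(graph: Graph, cost: Dict[str, float], n_devices: int,
           transfer: float) -> Tuple[Dict[str, int], float]:
    """Alternate SEL and PLC until every operation has a device.

    Returns the assignment and the estimated makespan under the toy model.
    """
    finish: Dict[str, float] = {}   # operation -> estimated finish time
    where: Dict[str, int] = {}      # operation -> assigned device
    free = [0.0] * n_devices        # time at which each device becomes idle

    while len(where) < len(graph):
        # An operation is ready once all of its predecessors are assigned.
        ready = [v for v in graph
                 if v not in where and all(p in where for p in graph[v])]
        op = sel_heaviest(ready, cost)

        def est(d: int) -> float:
            # Inputs from another device arrive after a transfer delay.
            data = max((finish[p] + (0.0 if where[p] == d else transfer)
                        for p in graph[op]), default=0.0)
            return max(free[d], data)

        dev = plc_earliest(range(n_devices), est)
        finish[op] = est(dev) + cost[op]
        where[op] = dev
        free[dev] = finish[op]

    return where, max(finish.values())


# Diamond-shaped graph: a feeds b and c, which both feed d.
graph = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
cost = {"a": 1.0, "b": 2.0, "c": 2.0, "d": 1.0}

placement, makespan = assign(graph, cost, n_devices=2, transfer=0.5)
_, serial = assign(graph, cost, n_devices=1, transfer=0.5)
print(placement, makespan, serial)
```

In this toy setting, two devices let `b` and `c` overlap, so the estimated makespan drops below the single-device total. A learned approach would replace both heuristics with trained policies and the toy cost model with measurements from the real system.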
Cite
Text
Yao et al. "DOPPLER: Dual-Policy Learning for Device Assignment in Asynchronous Dataflow Graphs." International Conference on Learning Representations, 2026.
Markdown
[Yao et al. "DOPPLER: Dual-Policy Learning for Device Assignment in Asynchronous Dataflow Graphs." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/yao2026iclr-doppler/)
BibTeX
@inproceedings{yao2026iclr-doppler,
title = {{DOPPLER: Dual-Policy Learning for Device Assignment in Asynchronous Dataflow Graphs}},
author = {Yao, Xinyu and Bourgeois, Daniel and Jain, Abhinav and Tang, Yuxin and Yao, Jiawen and Ding, Zhimin and Silva, Arlei and Jermaine, Chris},
booktitle = {International Conference on Learning Representations},
year = {2026},
url = {https://mlanthology.org/iclr/2026/yao2026iclr-doppler/}
}