ENOTO: Improving Offline-to-Online Reinforcement Learning with Q-Ensembles
Abstract
Dynamic edge networks are transforming mobile edge computing by enabling real-time applications in intelligent transportation, augmented reality, and the industrial Internet of Things (IoT). Efficient workload offloading in such networks is crucial for meeting the demands of time-varying workloads under limited computational and communication resources. Existing deep reinforcement learning (DRL)-based offloading schemes handle scenarios with multiple workloads and edge servers poorly, particularly under time-varying workload arrivals and fluctuating channel states. To this end, we propose a flexible module-weighted-fusion DRL framework (DRL-MWF) for scalable and robust multi-workload offloading in edge environments. Unlike traditional monolithic networks, DRL-MWF employs a weighted-fusion modular architecture that adapts flexibly to diverse workload distributions. Specifically, DRL-MWF introduces a state representation and normalization strategy that models state and workload characteristics, enabling precise and adaptive decision-making. Furthermore, we design two key mechanisms: a weighted policy correction method that stabilizes learning, and prioritized experience replay with weighted importance sampling that accelerates convergence by emphasizing critical transitions. Extensive evaluations on real-world datasets demonstrate that DRL-MWF consistently outperforms state-of-the-art baselines, underscoring its potential for workload offloading in next-generation edge computing systems that must sustain high performance in dynamic scenarios.
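The abstract names prioritized experience replay with weighted importance sampling as one of the framework's convergence mechanisms. Below is a minimal sketch of that standard technique (Schaul et al., 2016) for readers unfamiliar with it; the class and parameter names are illustrative assumptions, not taken from the paper, and the paper's exact formulation may differ.

```python
import numpy as np

class PrioritizedReplayBuffer:
    """Illustrative sketch of prioritized experience replay with
    weighted importance sampling. Not the paper's implementation."""

    def __init__(self, capacity, alpha=0.6, beta=0.4, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha      # how strongly priorities skew sampling
        self.beta = beta        # strength of the importance-sampling correction
        self.eps = eps          # keeps every priority strictly positive
        self.buffer = []
        self.priorities = np.zeros(capacity, dtype=np.float64)
        self.pos = 0

    def add(self, transition):
        # New transitions get the current maximum priority so they are
        # sampled at least once before their TD error is known.
        max_prio = self.priorities.max() if self.buffer else 1.0
        if len(self.buffer) < self.capacity:
            self.buffer.append(transition)
        else:
            self.buffer[self.pos] = transition
        self.priorities[self.pos] = max_prio
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size):
        prios = self.priorities[:len(self.buffer)]
        probs = prios ** self.alpha
        probs /= probs.sum()
        idx = np.random.choice(len(self.buffer), batch_size, p=probs)
        # Weighted importance sampling corrects the bias introduced by
        # non-uniform sampling; weights are normalized by their maximum.
        weights = (len(self.buffer) * probs[idx]) ** (-self.beta)
        weights /= weights.max()
        batch = [self.buffer[i] for i in idx]
        return batch, idx, weights

    def update_priorities(self, idx, td_errors):
        # Priority is proportional to the magnitude of the TD error,
        # so transitions the critic predicts badly are replayed more often.
        self.priorities[idx] = np.abs(td_errors) + self.eps
```

In training, the sampled `weights` would scale each transition's TD loss, and `update_priorities` would be called with the new TD errors after every gradient step.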
Cite
Text
Zhao et al. "ENOTO: Improving Offline-to-Online Reinforcement Learning with Q-Ensembles." International Joint Conference on Artificial Intelligence, 2024. doi:10.24963/ijcai.2024/615

Markdown

[Zhao et al. "ENOTO: Improving Offline-to-Online Reinforcement Learning with Q-Ensembles." International Joint Conference on Artificial Intelligence, 2024.](https://mlanthology.org/ijcai/2024/zhao2024ijcai-enoto/) doi:10.24963/ijcai.2024/615

BibTeX
@inproceedings{zhao2024ijcai-enoto,
title = {{ENOTO: Improving Offline-to-Online Reinforcement Learning with Q-Ensembles}},
author = {Zhao, Kai and Hao, Jianye and Ma, Yi and Liu, Jinyi and Zheng, Yan and Meng, Zhaopeng},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2024},
pages = {5563--5571},
doi = {10.24963/ijcai.2024/615},
url = {https://mlanthology.org/ijcai/2024/zhao2024ijcai-enoto/}
}