Efficient Learning of Power Grid Voltage Control Strategies via Model-Based Deep Reinforcement Learning

Hossain, Ramij Raja; Yin, Tianzhixi; Du, Yan; Huang, Renke; Tan, Jie; Yu, Wenhao; Liu, Yuan; Huang, Qiuhua

doi:10.1007/S10994-023-06422-W

MLJ 2024 pp. 2675-2700

Abstract

This article proposes a model-based deep reinforcement learning (DRL) method for designing emergency control strategies for short-term voltage stability problems in power systems. Recent work shows promising results for model-free DRL methods in power system control. In power system applications, however, these model-free methods have issues with training time (wall-clock time) and sample efficiency, both of which are critical for making state-of-the-art DRL algorithms practically applicable. A DRL agent learns an optimal policy by trial and error while interacting with the real-world environment, and because the power grid is safety-critical, it is desirable to minimize the agent's direct interaction with it. Additionally, state-of-the-art DRL-based policies are mostly trained using a physics-based grid simulator, where dynamic simulation is computationally intensive, which lowers training efficiency. We propose a novel model-based DRL framework in which a deep neural network (DNN)-based dynamic surrogate model (SM), instead of a real-world power grid or physics-based simulation, is used within the policy learning framework, making the process faster and more sample efficient. However, stable training in model-based DRL is challenging because of the complex system dynamics of large-scale power systems. We address these issues by incorporating imitation learning to warm-start policy learning, reward shaping, and a multi-step loss in surrogate model training. Finally, we achieve a 97.5% reduction in samples and an 87.7% reduction in training time for an application to the IEEE 300-bus test system.
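
The abstract mentions a multi-step loss for surrogate model training but does not give its form. As a rough illustration of the general idea only, the sketch below rolls a surrogate forward on its own predictions for several steps and averages the squared error against a recorded trajectory. The function name `multi_step_loss`, the `step_fn` interface, and the toy linear dynamics are all assumptions made for this example, not the authors' formulation.

```python
import numpy as np

def multi_step_loss(step_fn, states, actions, horizon):
    """Mean squared error of an open-loop surrogate rollout over `horizon` steps.

    states:  recorded trajectory, shape (T + 1, state_dim)
    actions: recorded controls, shape (T, action_dim)
    step_fn: hypothetical surrogate mapping (state, action) -> predicted next state
    """
    states = np.asarray(states, dtype=float)
    actions = np.asarray(actions, dtype=float)
    if horizon < 1 or horizon > len(actions):
        raise ValueError("horizon must be between 1 and the number of actions")
    pred = states[0]
    total = 0.0
    for k in range(horizon):
        # Feed the model its own previous prediction, not the recorded state,
        # so that compounding prediction error is penalized.
        pred = step_fn(pred, actions[k])
        total += np.mean((pred - states[k + 1]) ** 2)
    return total / horizon

# Toy usage with assumed linear dynamics s' = 0.9 * s + a.
def true_step(s, a):
    return 0.9 * s + a

def biased_step(s, a):
    # A deliberately imperfect surrogate with a constant offset.
    return 0.9 * s + a + 0.1

toy_actions = [np.array([0.5]), np.array([-0.2])]
toy_states = [np.array([1.0])]
for a in toy_actions:
    toy_states.append(true_step(toy_states[-1], a))

print(multi_step_loss(true_step, toy_states, toy_actions, horizon=2))
print(multi_step_loss(biased_step, toy_states, toy_actions, horizon=2))
```

With a one-step loss the biased surrogate's offset is counted once per transition. In the multi-step rollout the offset accumulates across steps, which is the usual motivation for training against a longer horizon.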

Cite

Text

Hossain et al. "Efficient Learning of Power Grid Voltage Control Strategies via Model-Based Deep Reinforcement Learning." Machine Learning, 2024. doi:10.1007/S10994-023-06422-W

Markdown

[Hossain et al. "Efficient Learning of Power Grid Voltage Control Strategies via Model-Based Deep Reinforcement Learning." Machine Learning, 2024.](https://mlanthology.org/mlj/2024/hossain2024mlj-efficient/) doi:10.1007/S10994-023-06422-W

BibTeX

@article{hossain2024mlj-efficient,
  title     = {{Efficient Learning of Power Grid Voltage Control Strategies via Model-Based Deep Reinforcement Learning}},
  author    = {Hossain, Ramij Raja and Yin, Tianzhixi and Du, Yan and Huang, Renke and Tan, Jie and Yu, Wenhao and Liu, Yuan and Huang, Qiuhua},
  journal   = {Machine Learning},
  year      = {2024},
  pages     = {2675--2700},
  doi       = {10.1007/S10994-023-06422-W},
  volume    = {113},
  url       = {https://mlanthology.org/mlj/2024/hossain2024mlj-efficient/}
}