A Bias-Free Revenue-Maximizing Bidding Strategy for Data Consumers in Auction-Based Federated Learning

Abstract

Entropy Regularisation is a widely adopted technique that enhances policy optimisation performance and stability. Maximum entropy reinforcement learning (MaxEnt RL) regularises policy evaluation by augmenting the objective with an entropy term, showing theoretical benefits in policy optimisation. However, its practical application in straightforward direct policy gradient settings remains surprisingly underexplored. We hypothesise that this is due to the difficulty of managing the entropy reward in practice. This paper proposes Entropy Advantage Policy Optimisation (EAPO), a simple method that facilitates MaxEnt RL implementation by separately estimating task and entropy objectives. Our empirical evaluations demonstrate that extending Proximal Policy Optimisation (PPO) and Trust Region Policy Optimisation (TRPO) within the MaxEnt framework improves optimisation performance, generalisation, and exploration in various environments. Moreover, our method provides a stable and performant MaxEnt RL algorithm for discrete action spaces.

Cite

Text

Tang et al. "A Bias-Free Revenue-Maximizing Bidding Strategy for Data Consumers in Auction-Based Federated Learning." International Joint Conference on Artificial Intelligence, 2024. doi:10.24963/ijcai.2024/552

Markdown

[Tang et al. "A Bias-Free Revenue-Maximizing Bidding Strategy for Data Consumers in Auction-Based Federated Learning." International Joint Conference on Artificial Intelligence, 2024.](https://mlanthology.org/ijcai/2024/tang2024ijcai-bias/) doi:10.24963/ijcai.2024/552

BibTeX

@inproceedings{tang2024ijcai-bias,
  title     = {{A Bias-Free Revenue-Maximizing Bidding Strategy for Data Consumers in Auction-Based Federated Learning}},
  author    = {Tang, Xiaoli and Yu, Han and Li, Zengxiang and Li, Xiaoxiao},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2024},
  pages     = {4991--4999},
  doi       = {10.24963/ijcai.2024/552},
  url       = {https://mlanthology.org/ijcai/2024/tang2024ijcai-bias/}
}