Pareto Inverse Reinforcement Learning for Diverse Expert Policy Generation

Kim, Woo Kyung; Yoo, Minjong; Woo, Honguk

doi:10.24963/ijcai.2024/475

IJCAI 2024 pp. 4300-4307

Abstract

In the data-driven era, crowdsourcing, that is, collecting high-quality labeled data through human labor, is a common approach for training data-hungry models. Recently, end-to-end learning from crowds has shown its flexibility and practicality. However, existing end-to-end methods focus on learning after the labels have been collected, which results in noisy annotations and incurs additional cost. Inspired by computerized adaptive testing, we argue that the characteristics of workers should be mined as early as possible to make the best use of their talents. To this end, we propose AdaCrowd, an adaptive method for learning from crowds, as a cost-effective solution. Specifically, we propose a probabilistic model to capture the informativeness of candidate instances for each worker, where informativeness is defined as the uncertainty of the annotation prediction model's output in its current state. The adaptive learning procedure is optimized by maximizing the data likelihood and can be used with existing crowdsourcing models. Extensive experiments are conducted on two real-world datasets, LabelMe and CIFAR-10H. The experimental results, e.g., a reduction in the number of annotations without performance degradation, demonstrate the effectiveness of the method.

Cite

Text

Kim et al. "Pareto Inverse Reinforcement Learning for Diverse Expert Policy Generation." International Joint Conference on Artificial Intelligence, 2024. doi:10.24963/ijcai.2024/475

Markdown

[Kim et al. "Pareto Inverse Reinforcement Learning for Diverse Expert Policy Generation." International Joint Conference on Artificial Intelligence, 2024.](https://mlanthology.org/ijcai/2024/kim2024ijcai-pareto/) doi:10.24963/ijcai.2024/475

BibTeX

@inproceedings{kim2024ijcai-pareto,
  title     = {{Pareto Inverse Reinforcement Learning for Diverse Expert Policy Generation}},
  author    = {Kim, Woo Kyung and Yoo, Minjong and Woo, Honguk},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2024},
  pages     = {4300--4307},
  doi       = {10.24963/ijcai.2024/475},
  url       = {https://mlanthology.org/ijcai/2024/kim2024ijcai-pareto/}
}