MICRO: Model-Based Offline Reinforcement Learning with a Conservative Bellman Operator

Abstract

Reasoning about causality and agent causal knowledge is critical for effective decision-making and planning in multi-agent contexts. Previous work in the area generally assumes that the domain is deterministic, but in fact many agents operate in nondeterministic domains where the outcome of their actions depends on unpredictable environment reactions. In this paper, we propose a situation calculus-based framework for reasoning about causal knowledge in nondeterministic domains. In such domains, the agent may not know the environment reactions to her actions and their outcomes, and may be uncertain about which actions caused a condition to come about. But she can perform sensing actions to acquire knowledge about the state and use it to gain knowledge about causes. Our formalization recognizes sensing actions as causes of both physical and epistemic effects. We also examine how regression can be used to reason about causal knowledge.

Cite

Text

Liu et al. "MICRO: Model-Based Offline Reinforcement Learning with a Conservative Bellman Operator." International Joint Conference on Artificial Intelligence, 2024. doi:10.24963/ijcai.2024/507

Markdown

[Liu et al. "MICRO: Model-Based Offline Reinforcement Learning with a Conservative Bellman Operator." International Joint Conference on Artificial Intelligence, 2024.](https://mlanthology.org/ijcai/2024/liu2024ijcai-micro/) doi:10.24963/ijcai.2024/507

BibTeX

@inproceedings{liu2024ijcai-micro,
  title     = {{MICRO: Model-Based Offline Reinforcement Learning with a Conservative Bellman Operator}},
  author    = {Liu, Xiao-Yin and Zhou, Xiao-Hu and Li, Guotao and Li, Hao and Gui, Mei-Jiang and Xiang, Tian-Yu and Huang, De-Xing and Hou, Zeng-Guang},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2024},
  pages     = {4587--4595},
  doi       = {10.24963/ijcai.2024/507},
  url       = {https://mlanthology.org/ijcai/2024/liu2024ijcai-micro/}
}