Recognition of Agents Based on Observation of Their Sequential Behavior

Abstract

We study the use of inverse reinforcement learning (IRL) as a tool for recognizing agents from observations of their sequential decision behavior. We model the problem faced by the agents as a Markov decision process (MDP) and model the observed behavior of an agent in terms of forward planning for the MDP. The agent's actual decision problem and process may not be captured by the MDP and its policy, but we interpret the observed behavior as optimal actions in the MDP. We use IRL to learn reward functions for the MDP and then use these reward functions as the basis for clustering or classification models. Experimental studies with GridWorld, a navigation problem, and the secretary problem, an optimal stopping problem, illustrate the algorithms' performance in different learning scenarios for agent recognition, where the agents' underlying decision strategy may or may not be expressible as an MDP policy. Empirical comparisons of our method with several existing IRL algorithms and with direct methods that use feature statistics observed in state-action space suggest it may be superior for agent recognition problems, particularly when the state space is large but the observed decision trajectory is short.
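To make the "forward planning" assumption concrete, here is a minimal sketch, not the authors' implementation, of value iteration on a small deterministic GridWorld. The grid size, discount factor, reward layout, and all function names (`step`, `value_iteration`, `rollout`) are assumptions chosen for illustration. Two hypothetical agents with different reward functions produce different optimal trajectories from the same start state; IRL works in the reverse direction, inferring a reward function from such trajectories.

```python
from typing import Dict, List, Tuple

State = Tuple[int, int]
# Hypothetical action set: (row delta, column delta).
ACTIONS: Dict[str, Tuple[int, int]] = {
    "up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1),
}


def step(state: State, action: str, size: int) -> State:
    """Deterministic move; bumping into a wall leaves the state unchanged."""
    dr, dc = ACTIONS[action]
    r, c = state[0] + dr, state[1] + dc
    return (r, c) if 0 <= r < size and 0 <= c < size else state


def value_iteration(reward: Dict[State, float], size: int,
                    gamma: float = 0.9, tol: float = 1e-8):
    """Return (values, greedy policy) for a reward earned on entering a state."""
    states = [(r, c) for r in range(size) for c in range(size)]
    values = {s: 0.0 for s in states}

    def q(s: State, a: str) -> float:
        nxt = step(s, a, size)
        return reward[nxt] + gamma * values[nxt]

    while True:
        delta = 0.0
        for s in states:
            best = max(q(s, a) for a in ACTIONS)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    policy = {s: max(ACTIONS, key=lambda a: q(s, a)) for s in states}
    return values, policy


def rollout(policy: Dict[State, str], start: State,
            size: int, steps: int) -> List[State]:
    """Follow the policy for a fixed number of steps and record visited states."""
    traj = [start]
    for _ in range(steps):
        traj.append(step(traj[-1], policy[traj[-1]], size))
    return traj


SIZE = 3
all_states = [(r, c) for r in range(SIZE) for c in range(SIZE)]
# Agent A is rewarded in the top-right corner, agent B in the bottom-left corner.
reward_a = {s: 1.0 if s == (0, 2) else 0.0 for s in all_states}
reward_b = {s: 1.0 if s == (2, 0) else 0.0 for s in all_states}

values_a, policy_a = value_iteration(reward_a, SIZE)
values_b, policy_b = value_iteration(reward_b, SIZE)
traj_a = rollout(policy_a, (0, 0), SIZE, steps=2)
traj_b = rollout(policy_b, (0, 0), SIZE, steps=2)
```

Under these assumptions, `traj_a` heads right along the top row while `traj_b` heads down the left column, so even two-step trajectories separate the two agents. In larger state spaces a short trajectory covers few states, which is the regime in which the abstract suggests reward-based features may help.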

Cite

Text

Qiao and Beling. "Recognition of Agents Based on Observation of Their Sequential Behavior." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2013. doi:10.1007/978-3-642-40988-2_3

Markdown

[Qiao and Beling. "Recognition of Agents Based on Observation of Their Sequential Behavior." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2013.](https://mlanthology.org/ecmlpkdd/2013/qiao2013ecmlpkdd-recognition/) doi:10.1007/978-3-642-40988-2_3

BibTeX

@inproceedings{qiao2013ecmlpkdd-recognition,
  title     = {{Recognition of Agents Based on Observation of Their Sequential Behavior}},
  author    = {Qiao, Qifeng and Beling, Peter A.},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2013},
  pages     = {33--48},
  doi       = {10.1007/978-3-642-40988-2_3},
  url       = {https://mlanthology.org/ecmlpkdd/2013/qiao2013ecmlpkdd-recognition/}
}