A Simulation-Based Evaluation Framework for Interactive AI Systems and Its Application

Hanafi, Maeda F.; Katsis, Yannis; Cooper, Martín Santillán; Li, Yunyao

doi:10.1609/AAAI.V36I11.21541

AAAI 2022 pp. 12658-12664

Abstract

Interactive AI (IAI) systems are increasingly popular as the human-centered AI design paradigm gains strong traction. However, evaluating IAI systems, a key step in building such systems, is particularly challenging, as their output depends heavily on the actions users perform. Developers often have to rely on limited and mostly qualitative data from ad-hoc user testing to assess and improve their systems. In this paper, we present InteractEva, a systematic evaluation framework for IAI systems. We also describe how we have applied InteractEva to evaluate a commercial IAI system, leading to both quality improvements and better data-driven design decisions.

Cite

Text

Hanafi et al. "A Simulation-Based Evaluation Framework for Interactive AI Systems and Its Application." AAAI Conference on Artificial Intelligence, 2022. doi:10.1609/AAAI.V36I11.21541

Markdown

[Hanafi et al. "A Simulation-Based Evaluation Framework for Interactive AI Systems and Its Application." AAAI Conference on Artificial Intelligence, 2022.](https://mlanthology.org/aaai/2022/hanafi2022aaai-simulation/) doi:10.1609/AAAI.V36I11.21541

BibTeX

@inproceedings{hanafi2022aaai-simulation,
  title     = {{A Simulation-Based Evaluation Framework for Interactive AI Systems and Its Application}},
  author    = {Hanafi, Maeda F. and Katsis, Yannis and Cooper, Martín Santillán and Li, Yunyao},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2022},
  pages     = {12658-12664},
  doi       = {10.1609/AAAI.V36I11.21541},
  url       = {https://mlanthology.org/aaai/2022/hanafi2022aaai-simulation/}
}