Automated Assessment of Fidelity and Interpretability: An Evaluation Framework for Large Language Models' Explanations (Student Abstract)

Abstract

As Large Language Models (LLMs) become more prevalent in various fields, it is crucial to rigorously assess the quality of their explanations. Our research introduces a task-agnostic framework for evaluating free-text rationales, drawing on insights from both linguistics and machine learning. We evaluate two dimensions of explainability: fidelity and interpretability. For fidelity, we propose methods suitable for proprietary LLMs, where direct introspection of internal features is infeasible. For interpretability, we use language models instead of human evaluators, addressing the subjectivity and scalability concerns of human evaluation. We apply our framework to GPT-3.5, examining how prompt design affects the quality of its explanations. Our framework streamlines the evaluation of explanations from LLMs, promoting the development of safer models.
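The abstract describes the two dimensions only at a high level, and the paper publishes no code. The sketch below illustrates one plausible shape for each: a black-box fidelity probe (does re-asking the question with the rationale as the only context reproduce the original answer?) and an LLM-as-judge interpretability rating in place of a human annotator. The function names, prompts, and scoring scales are hypothetical, not the authors' actual procedure; both scorers are parameterized over a generic `query_model` callable so no specific API is assumed.

```python
from typing import Callable

def fidelity_score(
    query_model: Callable[[str], str],  # any text-in/text-out LLM client
    question: str,
    answer: str,
    rationale: str,
) -> float:
    """Black-box fidelity probe (hypothetical): if the rationale truly
    supports the answer, answering from the rationale alone should
    reproduce the model's original answer."""
    prompt = (
        f"Question: {question}\n"
        f"Reasoning: {rationale}\n"
        "Based only on the reasoning above, state the final answer."
    )
    return 1.0 if query_model(prompt).strip() == answer.strip() else 0.0

def interpretability_score(
    judge_model: Callable[[str], str],  # LLM standing in for a human rater
    question: str,
    rationale: str,
) -> int:
    """LLM-as-judge interpretability rating on a 1-5 scale; the prompt
    wording and scale are illustrative only."""
    prompt = (
        "Rate how clear and easy to follow this explanation is, from 1 "
        "(opaque) to 5 (fully clear). Reply with a single digit.\n"
        f"Question: {question}\n"
        f"Explanation: {rationale}"
    )
    reply = judge_model(prompt).strip()
    return int(reply[0]) if reply[:1].isdigit() else 1
```

Averaging either score over a dataset of (question, answer, rationale) triples would give an aggregate measure; in practice `query_model` and `judge_model` would wrap an actual API client for the model under evaluation.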

Cite

Text

Kuo et al. "Automated Assessment of Fidelity and Interpretability: An Evaluation Framework for Large Language Models' Explanations (Student Abstract)." AAAI Conference on Artificial Intelligence, 2024. doi:10.1609/aaai.v38i21.30470

Markdown

[Kuo et al. "Automated Assessment of Fidelity and Interpretability: An Evaluation Framework for Large Language Models' Explanations (Student Abstract)." AAAI Conference on Artificial Intelligence, 2024.](https://mlanthology.org/aaai/2024/kuo2024aaai-automated/) doi:10.1609/aaai.v38i21.30470

BibTeX

@inproceedings{kuo2024aaai-automated,
  title     = {{Automated Assessment of Fidelity and Interpretability: An Evaluation Framework for Large Language Models' Explanations (Student Abstract)}},
  author    = {Kuo, Mu-Tien and Hsueh, Chih-Chung and Tsai, Richard Tzong-Han},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2024},
  pages     = {23554--23555},
  doi       = {10.1609/aaai.v38i21.30470},
  url       = {https://mlanthology.org/aaai/2024/kuo2024aaai-automated/}
}