Investigating the Ability of Large Language Models to Explain Causal Relationships in Time Series Data

Abstract

Large language models (LLMs) can enhance how individuals interact with and extract information from large amounts of data. In many settings, the ability to explain the causal reasons behind observations in data is important. In this work, we investigate the ability of LLMs to provide accurate explanations of causal relationships in time series data. We generated synthetic datasets based on three distinct directed acyclic graphs (DAGs) representing causal relationships between multiple time series variables, and we evaluated how state-of-the-art LLMs answer questions about causal effects within the observed data. We first used abstract variable names in the analysis and later assigned real-world meanings to these variables consistent with the DAG structures. We tested how accurately the LLMs identified the variables that caused specific observations in an outcome variable and found shortcomings even in state-of-the-art models. We highlight challenges and opportunities for research in this space.
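To make the setup concrete, the following is a minimal sketch of how synthetic time series might be generated from a simple DAG. The specific graph (X → Y ← Z), the lagged linear effects, and the noise scales are illustrative assumptions; the paper's actual DAG structures and generating processes are not reproduced here.

```python
import numpy as np

# Hypothetical example: simulate time series from a 3-node DAG X -> Y <- Z,
# where Y depends on the previous values of its causal parents X and Z.
rng = np.random.default_rng(0)
T = 200  # number of time steps

x = rng.normal(size=T)           # exogenous cause
z = rng.normal(size=T)           # exogenous cause
y = np.zeros(T)                  # outcome variable
for t in range(1, T):
    # Y at time t is driven by X and Z at time t-1, plus small noise.
    y[t] = 0.8 * x[t - 1] - 0.5 * z[t - 1] + rng.normal(scale=0.1)
```

A dataset like this can then be serialized (e.g., as a table of timestamped values) and given to an LLM, which is asked which variables caused a particular observation in the outcome series.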

Cite

Text

Healey and Kohane. "Investigating the Ability of Large Language Models to Explain Causal Relationships in Time Series Data." NeurIPS 2024 Workshops: CALM, 2024.

Markdown

[Healey and Kohane. "Investigating the Ability of Large Language Models to Explain Causal Relationships in Time Series Data." NeurIPS 2024 Workshops: CALM, 2024.](https://mlanthology.org/neuripsw/2024/healey2024neuripsw-investigating/)

BibTeX

@inproceedings{healey2024neuripsw-investigating,
  title     = {{Investigating the Ability of Large Language Models to Explain Causal Relationships in Time Series Data}},
  author    = {Healey, Elizabeth and Kohane, Isaac S.},
  booktitle = {NeurIPS 2024 Workshops: CALM},
  year      = {2024},
  url       = {https://mlanthology.org/neuripsw/2024/healey2024neuripsw-investigating/}
}