State of the Art: Reproducibility in Artificial Intelligence

Abstract

Background: Research results in artificial intelligence (AI) are criticized for not being reproducible. Objective: To quantify the state of reproducibility of empirical AI research using six reproducibility metrics measuring three different degrees of reproducibility. Hypotheses: 1) AI research is not documented well enough to reproduce the reported results. 2) Documentation practices have improved over time. Method: The literature is reviewed, and a set of variables that should be documented to enable reproducibility is grouped into three factors: Experiment, Data and Method. The metrics describe how well the factors have been documented for a paper. A total of 400 research papers from the conference series IJCAI and AAAI have been surveyed using the metrics. Findings: None of the papers document all of the variables. The metrics show that between 20% and 30% of the variables for each factor are documented. One of the metrics shows a statistically significant increase over time while the others show no change. Interpretation: The reproducibility scores decrease with increased documentation requirements. Improvement over time is found. Conclusion: Both hypotheses are supported.
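As a rough illustration of the Method and Findings above, the sketch below scores one paper by the fraction of variables documented for each factor. It is a minimal sketch under stated assumptions, not the authors' metric definitions: the abstract does not give the formulas or the variable list, so the function name `factor_score`, the variable names, and the survey record are all hypothetical.

```python
from typing import Dict


def factor_score(documented: Dict[str, bool]) -> float:
    """Return the fraction of a factor's variables that a paper documents."""
    if not documented:
        raise ValueError("a factor needs at least one variable")
    return sum(documented.values()) / len(documented)


# Hypothetical survey record for a single paper. The three factor names come
# from the abstract; the variable names and values are illustrative only.
paper = {
    "Experiment": {
        "hyperparameters": False,
        "source_code": False,
        "hardware": True,
        "setup": False,
    },
    "Data": {
        "training_set": True,
        "validation_set": False,
        "test_set": False,
        "results": False,
    },
    "Method": {
        "problem_statement": True,
        "objective": False,
        "research_question": False,
        "pseudocode": False,
    },
}

# One score per factor, each in the range [0, 1].
scores = {factor: factor_score(variables) for factor, variables in paper.items()}

for factor, score in scores.items():
    print(f"{factor}: {score:.0%} of variables documented")
```

In this made-up record, one of four variables is documented per factor, which falls within the 20% to 30% range the abstract reports across the surveyed papers.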

Cite

Text

Gundersen and Kjensmo. "State of the Art: Reproducibility in Artificial Intelligence." AAAI Conference on Artificial Intelligence, 2018. doi:10.1609/AAAI.V32I1.11503

Markdown

[Gundersen and Kjensmo. "State of the Art: Reproducibility in Artificial Intelligence." AAAI Conference on Artificial Intelligence, 2018.](https://mlanthology.org/aaai/2018/gundersen2018aaai-state/) doi:10.1609/AAAI.V32I1.11503

BibTeX

@inproceedings{gundersen2018aaai-state,
  title     = {{State of the Art: Reproducibility in Artificial Intelligence}},
  author    = {Gundersen, Odd Erik and Kjensmo, Sigbjørn},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2018},
  pages     = {1644--1651},
  doi       = {10.1609/AAAI.V32I1.11503},
  url       = {https://mlanthology.org/aaai/2018/gundersen2018aaai-state/}
}