Text Compression as a Test for Artificial Intelligence
Abstract
It is shown that optimal text compression is a harder problem than artificial intelligence as defined by Turing’s (1950) imitation game; thus compression ratio on a standard benchmark corpus could be used as an objective and quantitative alternative test for AI (Mahoney, 1999). Specifically, let L, M, and J be the probability distributions of responses chosen by a human, a machine, and a human judge, respectively, to the judge’s questions in the imitation game. The goal of AI is M = L: the machine is indistinguishable from the human. But the machine wins (the judge guesses that it is human) whenever H_J(M) < H_J(L), where H_Q(P) ≡ −Σ_x P(x) log Q(x) is the cross entropy of Q with respect to P. This happens when J is a poor estimate of L, meaning that the interrogator fails to anticipate the human’s responses; but even in the worst case, when J = L, the machine can still win with a suboptimal solution (M ≠ L) by deterministically favoring the most likely responses over the true distribution. In contrast, optimal compression of a probabilistic language L with unknown distribution (such as English) using an estimated distribution M (an encoding of length −log2 M(x) bits for each string x) requires M = L, by the discrete channel capacity theorem (Shannon, 1949). Answering questions in the Turing test (“What are roses?”) seems to require the same type of real-world knowledge that people use in predicting characters in a stream of natural language text (“Roses are ___?”), or equivalently, in estimating L(x) for compression. Shannon (1951) and Cover and King (1978) established an upper bound of 1.3 bits per character (bpc) for the entropy (information content) of English narrative in a 27-character alphabet (A-Z and space) using human prediction tests. No compression program has achieved this. Eight programs, including
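The asymmetry the abstract describes can be checked numerically. The sketch below (all distributions are made-up illustrative numbers, not taken from the paper) computes the cross entropy H_Q(P) and shows both halves of the argument: a deterministic machine beats the human under the judge's own model even when J = L, while on the compression side any model other than L costs extra bits on average (Gibbs' inequality):

```python
import math

def cross_entropy(p, q):
    # H_Q(P) = -sum_x P(x) * log2 Q(x): the expected code length in bits
    # when outcomes drawn from P are coded with lengths -log2 Q(x).
    # cross_entropy(p, q) corresponds to H_q(p) in the abstract's notation.
    return -sum(px * math.log2(q[x]) for x, px in p.items() if px > 0)

# Hypothetical response distribution of the human (L) to one question;
# the numbers are purely illustrative.
L = {"yes": 0.6, "no": 0.3, "maybe": 0.1}

# Worst case for the machine: the judge's model is perfect, J = L.
J = dict(L)

# A deterministic machine that always gives the single most likely response.
M = {"yes": 1.0, "no": 0.0, "maybe": 0.0}

# Turing-test side: H_J(M) < H_J(L), so the machine "sounds more human"
# than the human, even though M != L.
print(cross_entropy(M, J))   # ~0.74 bits
print(cross_entropy(L, J))   # ~1.30 bits (= the entropy H(L), since J = L)

# Compression side: coding outcomes of L with any model M2 != L costs
# strictly more than H(L) bits per symbol on average.
M2 = {"yes": 0.5, "no": 0.4, "maybe": 0.1}
print(cross_entropy(L, M2))  # ~1.33 bits > H(L)
```

Note that the same quantity plays both roles: minimizing cross entropy against the true distribution is exactly the compression objective, which is why only M = L is optimal there, whereas the imitation game rewards the degenerate M above.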
Mahoney. "Text Compression as a Test for Artificial Intelligence." AAAI Conference on Artificial Intelligence, 1999.
@inproceedings{mahoney1999aaai-text,
title = {{Text Compression as a Test for Artificial Intelligence}},
author = {Mahoney, Matthew V.},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {1999},
pages = {970},
url = {https://mlanthology.org/aaai/1999/mahoney1999aaai-text/}
}