How Latent Is Latent Semantic Analysis?
Abstract
Latent Semantic Analysis (LSA) is a statistical, corpus-based text comparison mechanism that was originally developed for information retrieval but in recent years has shown remarkably human-like abilities in a variety of language tasks. LSA has taken the Test of English as a Foreign Language and performed as well as non-native English speakers who were successful college applicants. It has shown an ability to learn words at a rate similar to that of humans. It has even graded papers as reliably as human graders. We have used LSA as a mechanism for evaluating the quality of student responses in an intelligent tutoring system, and its performance equals that of human raters with intermediate domain knowledge. It has been claimed that LSA’s text-comparison abilities stem primarily from its use of a statistical technique called singular value decomposition (SVD), which compresses a large amount of term and document co-occurrence information into a smaller space. This compression is said to capture the semantic information that is latent in the corpus itself. We test this claim by comparing LSA to a version of LSA without
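As a rough illustration of the compression step the abstract describes, the sketch below applies a truncated SVD to a tiny term-by-document count matrix and compares two terms before and after the reduction. It is a toy example, not the paper's experimental setup: the vocabulary, the counts, the choice of `k = 2`, and the helper names `cosine` and `reduced_term_vectors` are all assumptions made for illustration.

```python
import numpy as np

# Toy term-by-document count matrix (rows: terms, columns: documents).
# The vocabulary and counts are invented for illustration only.
terms = ["car", "automobile", "engine", "flower", "petal"]
counts = np.array([
    [2, 0, 1, 0],  # car
    [0, 2, 1, 0],  # automobile
    [1, 1, 2, 0],  # engine
    [0, 0, 0, 2],  # flower
    [0, 0, 0, 1],  # petal
], dtype=float)


def cosine(u, v):
    """Cosine similarity between two 1-D vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


def reduced_term_vectors(matrix, k):
    """Return term vectors in the k-dimensional space of a truncated SVD."""
    # full_matrices=False gives the compact factorization matrix = U @ diag(s) @ Vt.
    U, s, Vt = np.linalg.svd(matrix, full_matrices=False)
    # Keep the k largest singular values; scale each kept term axis by its singular value.
    return U[:, :k] * s[:k]


# Similarity of "car" and "automobile" on raw co-occurrence counts.
raw_sim = cosine(counts[0], counts[1])

# Similarity of the same two terms after compressing to k = 2 dimensions.
lsa = reduced_term_vectors(counts, k=2)
lsa_sim = cosine(lsa[0], lsa[1])

print(f"raw cosine: {raw_sim:.3f}")
print(f"reduced-space cosine: {lsa_sim:.3f}")
```

In this toy matrix "car" and "automobile" share only one document, so their raw cosine is 0.2, but both co-occur with "engine", and in the two-dimensional space their vectors become nearly parallel. Real LSA pipelines typically also weight the counts and keep a few hundred dimensions; neither is shown here.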
Cite
Text
Wiemer-Hastings. "How Latent Is Latent Semantic Analysis?." International Joint Conference on Artificial Intelligence, 1999.
Markdown
[Wiemer-Hastings. "How Latent Is Latent Semantic Analysis?." International Joint Conference on Artificial Intelligence, 1999.](https://mlanthology.org/ijcai/1999/wiemerhastings1999ijcai-latent/)
BibTeX
@inproceedings{wiemerhastings1999ijcai-latent,
title = {{How Latent Is Latent Semantic Analysis?}},
author = {Wiemer-Hastings, Peter M.},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {1999},
pages = {932-941},
url = {https://mlanthology.org/ijcai/1999/wiemerhastings1999ijcai-latent/}
}