The Impact of Time on the Accuracy of Sentiment Classifiers Created from a Web Log Corpus
Abstract
We investigate the impact of time on the predictive accuracy of sentiment classification models created from web logs. We show that sentiment classifiers are time dependent and, through a series of methodical experiments, quantify the size of the dependence. In particular, we measure the accuracies of 25 different time-specific sentiment classifiers on 24 different testing timeframes. We use the Naive Bayes induction technique and the holdout validation technique with equal-sized but separate training and testing data sets. We conduct over 600 experiments and organize our results by the size of the interval (in months) between the training and testing timeframes. Our findings show a significant decrease in accuracy as this interval grows. Using a paired t-test, we show that classifiers trained on future data and tested on past data significantly outperform classifiers trained on past data and tested on future data. These findings are for a topic-specific corpus created from political web log posts originating from 160 different web logs. We then define concepts that classify months as exemplar, infrequent thread, frequent thread or outlier; this classification reveals knowledge about the topic’s evolution and the utility of the month’s data for the timeframe.
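The way such results can be organized is sketched below in Python. This is only an illustrative sketch, not the authors' code: the function names, the month-index encoding, the pairing rule for the t-test (matching each month pair with its reversed counterpart), and the toy accuracy values are all assumptions made for the example. It groups accuracies by the interval between training and testing months and computes a paired t statistic comparing the two training directions.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def mean_accuracy_by_interval(results):
    """Average accuracy per absolute train/test gap in months.

    `results` holds (train_month, test_month, accuracy) tuples, where months
    are integer indices (a hypothetical encoding chosen for this sketch).
    """
    buckets = defaultdict(list)
    for train_month, test_month, accuracy in results:
        if train_month == test_month:
            continue  # training and testing timeframes are kept separate
        buckets[abs(test_month - train_month)].append(accuracy)
    return {gap: mean(values) for gap, values in sorted(buckets.items())}


def direction_pairs(results):
    """Pair each (train later, test earlier) run with its reversed run."""
    accuracy = {(tr, te): acc for tr, te, acc in results}
    future_to_past, past_to_future = [], []
    for (tr, te), acc in accuracy.items():
        if tr > te and (te, tr) in accuracy:
            future_to_past.append(acc)
            past_to_future.append(accuracy[(te, tr)])
    return future_to_past, past_to_future


def paired_t_statistic(a, b):
    """t statistic of the paired differences a[i] - b[i]."""
    diffs = [x - y for x, y in zip(a, b)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


# Toy values invented for illustration; they are not results from the paper.
toy_results = [
    (0, 1, 0.80), (1, 0, 0.84),
    (0, 2, 0.70), (2, 0, 0.78),
    (1, 2, 0.82), (2, 1, 0.83),
]

by_gap = mean_accuracy_by_interval(toy_results)
f2p, p2f = direction_pairs(toy_results)
t_stat = paired_t_statistic(f2p, p2f)
```

A positive `t_stat` here would indicate that, in the toy data, the classifiers trained on later months scored higher on average than their reversed counterparts; judging significance would additionally require comparing the statistic against a t distribution with `len(f2p) - 1` degrees of freedom.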
Cite
Text
Durant and Smith. "The Impact of Time on the Accuracy of Sentiment Classifiers Created from a Web Log Corpus." AAAI Conference on Artificial Intelligence, 2007.
Markdown
[Durant and Smith. "The Impact of Time on the Accuracy of Sentiment Classifiers Created from a Web Log Corpus." AAAI Conference on Artificial Intelligence, 2007.](https://mlanthology.org/aaai/2007/durant2007aaai-impact/)
BibTeX
@inproceedings{durant2007aaai-impact,
title = {{The Impact of Time on the Accuracy of Sentiment Classifiers Created from a Web Log Corpus}},
author = {Durant, Kathleen T. and Smith, Michael D.},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2007},
pages = {1340-1346},
url = {https://mlanthology.org/aaai/2007/durant2007aaai-impact/}
}