Maximum Entropy Markov Models for Information Extraction and Segmentation
Abstract
Hidden Markov models (HMMs) are a powerful probabilistic tool for modeling sequential data, and have been applied with success to many text-related tasks, such as part-of-speech tagging, text segmentation and information extraction. In these cases, the observations are usually modeled as multinomial distributions over a discrete vocabulary, and the HMM parameters are set to maximize the likelihood of the observations. This paper presents a new Markovian sequence model, closely related to HMMs, that allows observations to be represented as arbitrary overlapping features (such as word, capitalization, formatting, part-of-speech), and defines the conditional probability of state sequences given observation sequences. It does this by using the maximum entropy framework to fit a set of exponential models that represent the probability of a state given an observation and the previous state. We present positive experimental results on the segmentation of FAQ's.
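As a rough illustration of the kind of model the abstract describes, the sketch below computes the probability of the next state given the previous state and the current observation, using one exponential (softmax) model per previous state over overlapping binary features. The state names, feature functions, and weights are invented for illustration and are not taken from the paper; fitting the weights from data is not shown.

```python
import math

# Hypothetical state labels (an assumption for this sketch).
STATES = ["question", "answer"]

def observation_features(line):
    """Overlapping binary features of one observed line of text (illustrative)."""
    return {
        "ends_with_question_mark": line.rstrip().endswith("?"),
        "starts_with_capital": line[:1].isupper(),
        "is_indented": line.startswith(" "),
    }

def transition_distribution(weights, prev_state, line):
    """P(state | prev_state, line) as a normalized exponential model.

    weights[prev_state][(feature_name, state)] holds one weight per
    (active feature, next state) pair; missing entries count as 0.
    """
    active = [name for name, on in observation_features(line).items() if on]
    scores = {
        s: sum(weights[prev_state].get((name, s), 0.0) for name in active)
        for s in STATES
    }
    z = sum(math.exp(v) for v in scores.values())  # normalizer for this (observation, previous state)
    return {s: math.exp(v) / z for s, v in scores.items()}

# Hand-set weights, one table per previous state (not learned, purely illustrative).
WEIGHTS = {
    "question": {("is_indented", "answer"): 1.5},
    "answer": {("ends_with_question_mark", "question"): 2.0},
}

dist = transition_distribution(WEIGHTS, "answer", "How do I install it?")
```

Because each previous state has its own weight table, the same observation can lead to different next-state distributions depending on where the sequence currently is, which is how the sketch conditions on both the observation and the previous state.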
Cite
Text
McCallum et al. "Maximum Entropy Markov Models for Information Extraction and Segmentation." International Conference on Machine Learning, 2000.
Markdown
[McCallum et al. "Maximum Entropy Markov Models for Information Extraction and Segmentation." International Conference on Machine Learning, 2000.](https://mlanthology.org/icml/2000/mccallum2000icml-maximum/)
BibTeX
@inproceedings{mccallum2000icml-maximum,
title = {{Maximum Entropy Markov Models for Information Extraction and Segmentation}},
author = {McCallum, Andrew and Freitag, Dayne and Pereira, Fernando C. N.},
booktitle = {International Conference on Machine Learning},
year = {2000},
pages = {591-598},
url = {https://mlanthology.org/icml/2000/mccallum2000icml-maximum/}
}