Kernelized Sorting for Natural Language Processing

Abstract

Kernelized sorting is an approach for matching objects from two sources (or domains) that does not require any prior notion of similarity between objects across the two sources. Unfortunately, this technique is highly sensitive to initialization and to high-dimensional data. We present variants of kernelized sorting that increase its robustness and performance on several Natural Language Processing (NLP) tasks (document matching from parallel and comparable corpora, and machine transliteration) and even on image processing. Empirically, we show that, on these tasks, a semi-supervised variant of kernelized sorting outperforms matching canonical correlation analysis.
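To make the matching idea concrete, the following is a minimal sketch of the basic objective, not the variants proposed in the paper. It assumes the usual formulation of kernelized sorting, in which each source gets its own kernel matrix and the matching is the permutation that maximizes the dependence between the two centered kernel matrices. The RBF kernel, the `gamma` value, and all function names are illustrative assumptions, and the exhaustive search over permutations is a stand-in that is only feasible for a handful of objects; practical implementations use an iterative optimization instead, which is where the sensitivity to initialization arises.

```python
import itertools
import math


def rbf_kernel(points, gamma=0.1):
    """Within-source RBF kernel matrix (kernel choice is an assumption)."""
    return [
        [math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(p, q))) for q in points]
        for p in points
    ]


def center(K):
    """Double-center a square kernel matrix: H K H with H = I - (1/n) 1 1^T."""
    n = len(K)
    row_mean = [sum(row) / n for row in K]
    col_mean = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    total_mean = sum(row_mean) / n
    return [
        [K[i][j] - row_mean[i] - col_mean[j] + total_mean for j in range(n)]
        for i in range(n)
    ]


def dependence(Kc, Lc, perm):
    """Inner product between Kc and Lc re-indexed by perm (matching score)."""
    n = len(Kc)
    return sum(Kc[i][j] * Lc[perm[i]][perm[j]] for i in range(n) for j in range(n))


def kernelized_sort_bruteforce(xs, ys, gamma=0.1):
    """Return perm such that xs[i] is matched with ys[perm[i]].

    Exhaustive search over all n! permutations: for illustration only.
    """
    Kc = center(rbf_kernel(xs, gamma))
    Lc = center(rbf_kernel(ys, gamma))
    return max(
        itertools.permutations(range(len(xs))),
        key=lambda perm: dependence(Kc, Lc, perm),
    )
```

Note that `xs` and `ys` may live in spaces of different dimensionality: only kernels within each source are computed, so no cross-source similarity is ever needed.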

Cite

Text

Jagarlamudi et al. "Kernelized Sorting for Natural Language Processing." AAAI Conference on Artificial Intelligence, 2010. doi:10.1609/AAAI.V24I1.7718

Markdown

[Jagarlamudi et al. "Kernelized Sorting for Natural Language Processing." AAAI Conference on Artificial Intelligence, 2010.](https://mlanthology.org/aaai/2010/jagarlamudi2010aaai-kernelized/) doi:10.1609/AAAI.V24I1.7718

BibTeX

@inproceedings{jagarlamudi2010aaai-kernelized,
  title     = {{Kernelized Sorting for Natural Language Processing}},
  author    = {Jagarlamudi, Jagadeesh and Juarez, Seth and Daumé III, Hal},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2010},
  pages     = {1020-1025},
  doi       = {10.1609/AAAI.V24I1.7718},
  url       = {https://mlanthology.org/aaai/2010/jagarlamudi2010aaai-kernelized/}
}