Using Text Mining and Link Analysis for Software Mining

Abstract

Many data mining techniques are now used for ontology learning: text mining, Web mining, graph mining, link analysis, relational data mining, and so on. The current state-of-the-art bundle, however, lacks “software mining” techniques, a term that denotes the process of extracting knowledge from source code. In this paper we approach the software mining task with a combination of text mining and link analysis techniques. We discuss how each instance (i.e., a programming construct such as a class or a method) can be converted into a feature vector that combines information about how the instance is interlinked with other instances with information about its textual content. The resulting feature vectors serve as the basis for constructing the domain ontology with OntoGen, an existing system for semi-automatic, data-driven ontology construction.
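The idea of combining link and content information in one feature vector can be illustrated with a minimal sketch. It is not the authors' method: the toy instances, the function names, the plain term-count and adjacency encodings, and the `link_weight` mixing parameter are all assumptions made for illustration only.

```python
import math
import re
from collections import Counter

# Hypothetical toy instances: name -> (textual content, names of linked instances).
INSTANCES = {
    "Parser": ("parse source text into tokens", ["Lexer", "Token"]),
    "Lexer": ("split source text into tokens", ["Token"]),
    "Token": ("token value and position", []),
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def l2_normalize(vec):
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def build_vectors(instances, link_weight=0.5):
    """Concatenate a weighted text part and a weighted link part per instance.

    link_weight is an assumed mixing parameter, not taken from the paper.
    """
    names = sorted(instances)
    vocab = sorted({w for text, _ in instances.values() for w in tokenize(text)})
    vectors = {}
    for name in names:
        text, links = instances[name]
        counts = Counter(tokenize(text))
        # Content part: term counts over the shared vocabulary.
        text_part = l2_normalize([float(counts[w]) for w in vocab])
        # Link part: one slot per instance, 1.0 where a link exists.
        link_part = l2_normalize([1.0 if other in links else 0.0 for other in names])
        vectors[name] = (
            [(1.0 - link_weight) * v for v in text_part]
            + [link_weight * v for v in link_part]
        )
    return vectors


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


vectors = build_vectors(INSTANCES)
print(cosine(vectors["Parser"], vectors["Lexer"]))
print(cosine(vectors["Parser"], vectors["Token"]))
```

In this toy setup, `Parser` and `Lexer` end up more similar to each other than either is to `Token`, because they share both vocabulary and an outgoing link. Vectors of this kind could then be passed to a clustering or ontology-construction tool; how OntoGen consumes them is not specified here.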

Cite

Text

Grcar et al. "Using Text Mining and Link Analysis for Software Mining." European Conference on Machine Learning, 2007. doi:10.1007/978-3-540-68416-9_1

Markdown

[Grcar et al. "Using Text Mining and Link Analysis for Software Mining." European Conference on Machine Learning, 2007.](https://mlanthology.org/ecmlpkdd/2007/grcar2007ecml-using/) doi:10.1007/978-3-540-68416-9_1

BibTeX

@inproceedings{grcar2007ecml-using,
  title     = {{Using Text Mining and Link Analysis for Software Mining}},
  author    = {Grcar, Miha and Grobelnik, Marko and Mladenic, Dunja},
  booktitle = {European Conference on Machine Learning},
  year      = {2007},
  pages     = {1--12},
  doi       = {10.1007/978-3-540-68416-9_1},
  url       = {https://mlanthology.org/ecmlpkdd/2007/grcar2007ecml-using/}
}