Topics in Financial Filings and Bankruptcy Prediction with Distributed Representations of Textual Data
Abstract
We uncover latent topics embedded in the management discussion and analysis (MD&A) sections of financial reports from listed companies in the US, and we examine how these topics evolve using a dynamic topic modelling method, the Dynamic Embedding Topic Model. Using more than 203k reports with 40M sentences, spanning 1997 to 2017, we find 30 interpretable topics. The evolution of topics follows economic cycles and major industrial events. We validate the significance of these latent topics through the state-of-the-art performance of a simple bankruptcy ensemble classifier trained on both novel features (topical distributed representations of the MD&A) and accounting features.
Cite
Text
Nguyen et al. "Topics in Financial Filings and Bankruptcy Prediction with Distributed Representations of Textual Data." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2020. doi:10.1007/978-3-030-67670-4_19
Markdown
[Nguyen et al. "Topics in Financial Filings and Bankruptcy Prediction with Distributed Representations of Textual Data." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2020.](https://mlanthology.org/ecmlpkdd/2020/nguyen2020ecmlpkdd-topics/) doi:10.1007/978-3-030-67670-4_19
BibTeX
@inproceedings{nguyen2020ecmlpkdd-topics,
title = {{Topics in Financial Filings and Bankruptcy Prediction with Distributed Representations of Textual Data}},
author = {Nguyen, Ba-Hung and Shirai, Kiyoaki and Huynh, Van-Nam},
booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
year = {2020},
pages = {306-322},
doi = {10.1007/978-3-030-67670-4_19},
url = {https://mlanthology.org/ecmlpkdd/2020/nguyen2020ecmlpkdd-topics/}
}