Banksformer: A Deep Generative Model for Synthetic Transaction Sequences

Abstract

Synthetic data are generated data that closely model real-world measurements, and can be a valuable substitute for real data in domains where real data are costly to obtain or privacy concerns exist. Synthetic data have traditionally been generated using computational simulations, but deep generative models (DGMs) are increasingly used to create high-quality synthetic data. In this work, we tackle the problem of generating synthetic, multivariate sequences of banking transactions. A key challenge in modeling transactional sequences with DGMs is that transactions occur at irregular intervals and may depend on timestamp-based features, such as the time of day or day of the week. Relationships between date-based features are often poorly represented in data generated using state-of-the-art sequence DGMs, such as DoppelGANger [17] and TimeGAN [31]. To remedy this, we propose a novel DGM, called Banksformer (code available at github.com/BigTuna08/Banksformer_ecml_2022), which is able to emulate date-based patterns found in transactional data significantly better than other DGMs. We demonstrate Banksformer's ability to generate high-quality synthetic sequences of banking transactions by conducting a multi-faceted evaluation that compares synthetic data generated by Banksformer to data from other comparable DGMs, across two datasets of banking transactions.
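
To make the notion of timestamp-based features concrete, the following is a minimal sketch of how day-of-week, day-of-month, hour, and the irregular gap between consecutive transactions might be derived from raw timestamps. It is an illustration only and is not taken from the Banksformer code; the function name `date_features`, the feature names, and the sample timestamps are all hypothetical.

```python
from datetime import datetime


def date_features(timestamps):
    """Derive illustrative date-based features from ISO-8601 timestamp strings.

    Hypothetical helper for illustration; not part of Banksformer.
    """
    parsed = sorted(datetime.fromisoformat(t) for t in timestamps)
    rows = []
    previous = None
    for ts in parsed:
        # Gap in whole calendar days since the previous transaction (0 for the first).
        gap_days = 0 if previous is None else (ts.date() - previous.date()).days
        rows.append({
            "day_of_week": ts.weekday(),   # 0 = Monday, 6 = Sunday
            "day_of_month": ts.day,
            "hour": ts.hour,
            "days_since_previous": gap_days,
        })
        previous = ts
    return rows


# Made-up timestamps, spaced irregularly as transactions typically are.
features = date_features([
    "2022-01-03T09:30:00",
    "2022-01-07T18:05:00",
    "2022-02-01T12:00:00",
])
for row in features:
    print(row)
```

A generative model that treats these fields as independent columns can produce combinations that are inconsistent with each other, for example a day of the week that does not match the implied date, which is the kind of date-based relationship the abstract describes as poorly represented.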

Cite

Text

Nickerson et al. "Banksformer: A Deep Generative Model for Synthetic Transaction Sequences." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2022. doi:10.1007/978-3-031-26422-1_8

Markdown

[Nickerson et al. "Banksformer: A Deep Generative Model for Synthetic Transaction Sequences." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2022.](https://mlanthology.org/ecmlpkdd/2022/nickerson2022ecmlpkdd-banksformer/) doi:10.1007/978-3-031-26422-1_8

BibTeX

@inproceedings{nickerson2022ecmlpkdd-banksformer,
  title     = {{Banksformer: A Deep Generative Model for Synthetic Transaction Sequences}},
  author    = {Nickerson, Kyle L. and Tricco, Terrence S. and Kolokolova, Antonina and Shoeleh, Farzaneh and Robertson, Charles and Hawkin, John and Hu, Ting},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2022},
  pages     = {121-136},
  doi       = {10.1007/978-3-031-26422-1_8},
  url       = {https://mlanthology.org/ecmlpkdd/2022/nickerson2022ecmlpkdd-banksformer/}
}