DeFraudNet: An End-to-End Weak Supervision Framework to Detect Fraud in Online Food Delivery

Mathew, Jose; Negi, Meghana; Vijjali, Rutvik; Sathyanarayana, Jairaj

doi:10.1007/978-3-030-86514-6_6

ECML-PKDD 2021 pp. 85-99

Abstract

Detecting abusive and fraudulent claims is one of the key challenges in online food delivery. The problem is aggravated by the fact that, unlike in e-commerce, reverse logistics is not practical for food. This makes the already hard problem of harvesting fraud labels even harder, because we cannot confirm whether a claim was legitimate by inspecting the item(s). Manually analyzing transactions to generate labels is often expensive and time-consuming. On the other hand, there is typically a wealth of ‘noisy’ information about what constitutes fraud, in the form of customer service interactions, weak and hard rules derived from data analytics, business intuition, and domain understanding. In this paper, we present a novel end-to-end framework for detecting fraudulent transactions based on large-scale label generation using weak supervision. We use Stanford AI Lab’s (SAIL) Snorkel together with tree-based methods for manual and automated discovery of labeling functions, which generate weak labels. We follow this with an auto-encoder method based on reconstruction error to reduce label noise. The final step is a discriminator model that is an ensemble of an MLP and an LSTM. In addition to cross-sectional and longitudinal features of customer history and transactions, we harvest customer embeddings from a Graph Convolution Network (GCN) on a customer-customer relationship graph to capture collusive behavior. The final score is thresholded and used in decision making. The solution is currently deployed for real-time serving and has yielded a 16-percentage-point improvement in recall at a given precision level. These results are measured against a baseline MLP model based on manually labeled data and are highly significant at our scale. Our approach can easily scale to additional fraud scenarios or to use cases where ‘strong’ labels are hard to get but weak labels are prevalent.
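To make the weak-labeling step in the abstract concrete, here is a minimal sketch of how several noisy labeling functions might be combined into one weak label per transaction. It is not the authors' implementation and does not use Snorkel: the label model is replaced with a plain majority vote, and every field name, cutoff, and function name below is a hypothetical chosen only for illustration.

```python
from collections import Counter

# Label conventions (an assumption, following common weak-supervision practice).
ABSTAIN, LEGIT, FRAUD = -1, 0, 1


# Hypothetical labeling functions: each encodes one noisy rule and may abstain.
def lf_many_recent_claims(txn):
    # Illustrative cutoff, not taken from the paper.
    return FRAUD if txn["claims_last_30d"] >= 5 else ABSTAIN


def lf_long_clean_history(txn):
    clean = txn["orders_total"] >= 50 and txn["claims_last_30d"] == 0
    return LEGIT if clean else ABSTAIN


def lf_agent_flagged(txn):
    # Stands in for a signal derived from customer service interactions.
    return FRAUD if txn["agent_flagged"] else ABSTAIN


LABELING_FUNCTIONS = [lf_many_recent_claims, lf_long_clean_history, lf_agent_flagged]


def weak_label(txn, lfs=LABELING_FUNCTIONS):
    """Combine labeling-function votes by majority; abstain on no votes or a tie."""
    votes = [v for v in (lf(txn) for lf in lfs) if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN  # conflicting rules with equal support
    return ranked[0][0]


if __name__ == "__main__":
    txn = {"claims_last_30d": 6, "orders_total": 10, "agent_flagged": True}
    print(weak_label(txn))
```

A majority vote treats every rule as equally reliable, which is the main simplification here; a learned label model would instead weight the rules by their estimated accuracy. Transactions left at `ABSTAIN` would simply be excluded from the weakly labeled training set in this sketch.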

Cite

Text

Mathew et al. "DeFraudNet: An End-to-End Weak Supervision Framework to Detect Fraud in Online Food Delivery." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2021. doi:10.1007/978-3-030-86514-6_6

Markdown

[Mathew et al. "DeFraudNet: An End-to-End Weak Supervision Framework to Detect Fraud in Online Food Delivery." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2021.](https://mlanthology.org/ecmlpkdd/2021/mathew2021ecmlpkdd-defraudnet/) doi:10.1007/978-3-030-86514-6_6

BibTeX

@inproceedings{mathew2021ecmlpkdd-defraudnet,
  title     = {{DeFraudNet: An End-to-End Weak Supervision Framework to Detect Fraud in Online Food Delivery}},
  author    = {Mathew, Jose and Negi, Meghana and Vijjali, Rutvik and Sathyanarayana, Jairaj},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2021},
  pages     = {85--99},
  doi       = {10.1007/978-3-030-86514-6_6},
  url       = {https://mlanthology.org/ecmlpkdd/2021/mathew2021ecmlpkdd-defraudnet/}
}