FairMOE: Counterfactually-Fair Mixture of Experts with Levels of Interpretability
Abstract
With the rise of artificial intelligence in our everyday lives, the need for human interpretation of machine learning models’ predictions has become a critical issue. Interpretability is generally viewed as a binary notion with a performance trade-off: either a model is fully interpretable but unable to capture more complex patterns in the data, or it is a black box. In this paper, we argue that this view is severely limiting and that interpretability should instead be viewed as a continuous, domain-informed concept. We leverage the well-known Mixture of Experts architecture with user-defined limits on non-interpretability. We extend this idea with a counterfactual fairness module to ensure the selection of consistently fair experts: FairMOE. We perform an extensive experimental evaluation on fairness-related data sets and compare our proposal against state-of-the-art methods. Our results demonstrate that FairMOE is competitive with the leading fairness-aware algorithms in both fairness and predictive measures while providing more consistent performance, competitive scalability, and, most importantly, greater interpretability.
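The high-level idea described in the abstract can be sketched in a few lines of Python. The code below is an illustrative toy, not the authors' implementation: the expert names, the `interpretability_cost` field, and the single-flip fairness check are assumptions made purely for the example. Per query, the gate restricts the choice to experts that fit within a user-defined non-interpretability budget and whose prediction is unchanged when the protected attribute is flipped (a simple proxy for counterfactual fairness), then defers to the strongest remaining expert.

```python
# Hypothetical sketch of the FairMOE idea (names and thresholds are
# illustrative, not the authors' implementation): a gate picks, per query,
# the best expert whose non-interpretability fits a user-defined budget and
# whose prediction is stable under a flip of the protected attribute.
from dataclasses import dataclass
from typing import Callable, Sequence
import numpy as np

@dataclass
class Expert:
    name: str
    predict: Callable[[np.ndarray], int]   # binary classifier on one feature row
    interpretability_cost: float           # 0 = glass box, 1 = black box (assumed scale)
    val_accuracy: float                    # estimated on held-out data

def counterfactually_fair(expert: Expert, x: np.ndarray, protected_idx: int) -> bool:
    """Check that flipping the (binary) protected attribute leaves the
    expert's prediction unchanged -- a simple counterfactual-fairness proxy."""
    x_cf = x.copy()
    x_cf[protected_idx] = 1 - x_cf[protected_idx]
    return expert.predict(x) == expert.predict(x_cf)

def fair_moe_predict(experts: Sequence[Expert], x: np.ndarray,
                     protected_idx: int, max_noninterp: float) -> int:
    """Among experts within the non-interpretability budget that pass the
    fairness check, defer to the one with the best validation accuracy."""
    candidates = [e for e in experts
                  if e.interpretability_cost <= max_noninterp
                  and counterfactually_fair(e, x, protected_idx)]
    if not candidates:  # fall back to the most interpretable expert
        candidates = [min(experts, key=lambda e: e.interpretability_cost)]
    best = max(candidates, key=lambda e: e.val_accuracy)
    return best.predict(x)

# Toy usage: two rule-based "experts"; feature 0 is the protected attribute.
experts = [
    Expert("glass_box", lambda x: int(x[1] > 0.5), 0.1, 0.80),
    Expert("black_box", lambda x: int(x[0] + x[1] > 1.0), 0.9, 0.85),  # uses protected attr
]
x = np.array([1.0, 0.7])
print(fair_moe_predict(experts, x, protected_idx=0, max_noninterp=0.5))  # -> 1
```

With `max_noninterp=0.5`, the black-box expert is excluded by the budget and the glass-box expert passes the fairness check, so the gate defers to it; the budget-and-filter structure only mirrors the gating idea stated in the abstract.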
Cite
Text
Germino et al. "FairMOE: Counterfactually-Fair Mixture of Experts with Levels of Interpretability." Machine Learning, 2024. doi:10.1007/S10994-024-06583-2
Markdown
[Germino et al. "FairMOE: Counterfactually-Fair Mixture of Experts with Levels of Interpretability." Machine Learning, 2024.](https://mlanthology.org/mlj/2024/germino2024mlj-fairmoe/) doi:10.1007/S10994-024-06583-2
BibTeX
@article{germino2024mlj-fairmoe,
title = {{FairMOE: Counterfactually-Fair Mixture of Experts with Levels of Interpretability}},
author = {Germino, Joe and Moniz, Nuno and Chawla, Nitesh V.},
journal = {Machine Learning},
year = {2024},
pages = {6539--6559},
doi = {10.1007/S10994-024-06583-2},
volume = {113},
url = {https://mlanthology.org/mlj/2024/germino2024mlj-fairmoe/}
}