Interpretations of Predictive Models for Lifestyle-Related Diseases at Multiple Time Intervals
Abstract
Health screening is practiced in many countries to find asymptomatic patients of diseases. Applying machine learning to health screening datasets may make it possible to predict future medical conditions. We extend this approach by introducing interpretable machine learning and determining which health screening items (attributes) contribute to detecting lifestyle-related diseases in their early stages. Furthermore, we determine how the contributing attributes change as the prediction interval grows from one to four years. We target diabetes and chronic kidney disease (CKD), which are among the most common lifestyle-related diseases. We trained predictive models using XGBoost and estimated each attribute's level of contribution using SHapley Additive exPlanations (SHAP). The results indicated that the contribution levels of numerous attributes change drastically over time. Many of the results matched our medical knowledge, but we also obtained unexpected outcomes. For example, we found that for predicting HbA1c and creatinine, which are indicators of diabetes and CKD, respectively, the contribution of alanine transaminase increases as the time interval lengthens. Such findings can provide insights into the underlying mechanisms by which lifestyle-related diseases worsen.
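To make the idea of an attribute's "contribution level" concrete, the following is a minimal sketch of the Shapley values on which SHAP attributions are based. It is not the authors' pipeline: it uses neither XGBoost nor the SHAP library, and `toy_model`, its coefficients, the feature values, and the all-zero baseline are hypothetical stand-ins chosen only for illustration. The attribute names are borrowed from the abstract, and attributes left out of a coalition are assumed to take their baseline value.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline).

    Attributes outside a coalition are replaced by their baseline value.
    The cost grows exponentially with the number of attributes, so this
    is only practical for a handful of them.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                # Marginal effect of adding attribute i to the coalition.
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


# Hypothetical stand-in for a trained predictor; the coefficients are made up.
def toy_model(v):
    hba1c, creatinine, alt = v
    return 0.8 * hba1c + 0.3 * creatinine + 0.1 * hba1c * alt


names = ["HbA1c", "creatinine", "ALT"]
x = [1.0, 2.0, 3.0]         # made-up (standardized) screening values
baseline = [0.0, 0.0, 0.0]  # made-up reference point

phi = shapley_values(toy_model, x, baseline)
for name, value in zip(names, phi):
    print(f"{name}: {value:.3f}")
```

The attributions sum to the difference between the prediction at `x` and the prediction at the baseline, which is the additivity property that gives SHAP its name. Under these assumptions, comparing contributions across time intervals would amount to repeating the calculation with one model per prediction horizon and averaging the absolute attributions over individuals.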
Cite
Text
Oba et al. "Interpretations of Predictive Models for Lifestyle-Related Diseases at Multiple Time Intervals." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2022. doi:10.1007/978-3-031-26387-3_18
Markdown
[Oba et al. "Interpretations of Predictive Models for Lifestyle-Related Diseases at Multiple Time Intervals." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2022.](https://mlanthology.org/ecmlpkdd/2022/oba2022ecmlpkdd-interpretations/) doi:10.1007/978-3-031-26387-3_18
BibTeX
@inproceedings{oba2022ecmlpkdd-interpretations,
title = {{Interpretations of Predictive Models for Lifestyle-Related Diseases at Multiple Time Intervals}},
author = {Oba, Yuki and Tezuka, Taro and Sanuki, Masaru and Wagatsuma, Yukiko},
booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
year = {2022},
pages = {293--308},
doi = {10.1007/978-3-031-26387-3_18},
url = {https://mlanthology.org/ecmlpkdd/2022/oba2022ecmlpkdd-interpretations/}
}