Longitudinal Surveys Are Texts: LLM-Enhanced Analysis of School Attendance in New Zealand

Abstract

School attendance is an important factor in educational success and plays a key role in shaping students’ academic and social development. Longitudinal surveys provide valuable insights into factors affecting attendance patterns, yet analysing such data presents unique challenges. First, the variation in survey questions across data collection waves complicates the application of standard temporal modelling techniques that assume consistent features over time. Second, conventional methods often one-hot encode survey responses, stripping away contextual meaning within questions and responses. Lastly, open-ended responses are typically omitted, leading to a loss of valuable qualitative insights. To address these challenges, we propose Survey-as-Text Modelling (STM), which represents multi-wave survey questionnaires as coherent textual sequences. By maintaining the textual format, STM allows similar questions across different years to be compared directly rather than existing as independent features. STM also retains the meaning within question-response pairs, preventing loss of information from one-hot encoding and enabling the incorporation of open-ended responses. We apply STM to survey data from Growing Up in New Zealand and link it to official attendance records from the New Zealand Ministry of Education. We leverage large language models (LLMs) to predict future school attendance from text-based surveys, outperforming existing temporal methods. Beyond predictive accuracy, we propose gradient-guided counterfactual analysis to identify key survey questions influencing the model’s decision-making. Our findings highlight the potential of LLMs for survey analysis and provide data-driven insights that can inform policy and intervention strategies.
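The idea of representing multi-wave questionnaires as text can be illustrated with a minimal sketch. This is not the authors' implementation: the `serialize_waves` function, the `Q:`/`A:` template, the wave labels, and the toy questions are all assumptions made for illustration, since the abstract does not specify the exact text format. The sketch only shows how question wording and open-ended answers survive serialization, even when the questions differ between waves.

```python
from typing import Dict, List

# Hypothetical toy structure: each wave maps question text to a response string.
Wave = Dict[str, str]


def serialize_waves(waves: Dict[str, Wave]) -> str:
    """Render multi-wave survey answers as one text sequence, wave by wave.

    The "[wave]" header and "Q: ... A: ..." template are illustrative
    assumptions, not the format used in the paper.
    """
    lines: List[str] = []
    for wave_name, answers in waves.items():
        lines.append(f"[{wave_name}]")
        for question, response in answers.items():
            # Keep the full question wording next to its response so that
            # similar questions from different waves remain comparable.
            lines.append(f"Q: {question} A: {response}")
    return "\n".join(lines)


# Invented example data: the question wording changes between waves and
# the second wave includes an open-ended response.
waves = {
    "Wave 1": {"How often does your child read at home?": "Most days"},
    "Wave 2": {
        "How many days a week does your child read at home?": "5",
        "Anything else about school?": "She enjoys maths but finds mornings hard.",
    },
}

text = serialize_waves(waves)
print(text)
```

A one-hot encoding would treat the two reading questions as unrelated columns and would have no slot for the free-text answer. In the serialized form both remain available to a language model as ordinary text.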

Cite

Text

Qiao et al. "Longitudinal Surveys Are Texts: LLM-Enhanced Analysis of School Attendance in New Zealand." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2025. doi:10.1007/978-3-662-72243-5_18

Markdown

[Qiao et al. "Longitudinal Surveys Are Texts: LLM-Enhanced Analysis of School Attendance in New Zealand." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2025.](https://mlanthology.org/ecmlpkdd/2025/qiao2025ecmlpkdd-longitudinal/) doi:10.1007/978-3-662-72243-5_18

BibTeX

@inproceedings{qiao2025ecmlpkdd-longitudinal,
  title     = {{Longitudinal Surveys Are Texts: LLM-Enhanced Analysis of School Attendance in New Zealand}},
  author    = {Qiao, Tingrui and Walker, Caroline and Cunningham, Chris and Jang-Jones, Adam and Morton, Susan M. B. and Meissel, Kane and Koh, Yun Sing},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2025},
  pages     = {310--327},
  doi       = {10.1007/978-3-662-72243-5_18},
  url       = {https://mlanthology.org/ecmlpkdd/2025/qiao2025ecmlpkdd-longitudinal/}
}