ITFormer: Bridging Time Series and Natural Language for Multi-Modal QA with Large-Scale Multitask Dataset
Abstract
Time-series data are critical in diverse applications, such as industrial monitoring, medical diagnostics, and climate research. However, effectively integrating these high-dimensional temporal signals with natural language for dynamic, interactive tasks remains a significant challenge. To address this, we introduce the Time-Series Question Answering (Time-Series QA) task and release EngineMT-QA, the first large-scale, multi-task, temporal-textual QA dataset designed to capture complex interactions between time-series signals and natural language. Building on this resource, we propose the Instruct Time Transformer (ITFormer), a novel framework that bridges time-series encoders with frozen large language models (LLMs). ITFormer effectively extracts, aligns, and fuses temporal and textual features, achieving a substantial improvement in QA accuracy over strong baselines with fewer than 1% additional trainable parameters. By combining computational efficiency with robust cross-modal modeling, our work establishes an adaptable paradigm for integrating temporal data with natural language, paving the way for new research and applications in multi-modal AI. More details about the project, including datasets and code, are available at: https://pandalin98.github.io/itformer_site/.
Cite
Text
Wang et al. "ITFormer: Bridging Time Series and Natural Language for Multi-Modal QA with Large-Scale Multitask Dataset." Proceedings of the 42nd International Conference on Machine Learning, 2025.
Markdown
[Wang et al. "ITFormer: Bridging Time Series and Natural Language for Multi-Modal QA with Large-Scale Multitask Dataset." Proceedings of the 42nd International Conference on Machine Learning, 2025.](https://mlanthology.org/icml/2025/wang2025icml-itformer/)
BibTeX
@inproceedings{wang2025icml-itformer,
title = {{ITFormer: Bridging Time Series and Natural Language for Multi-Modal QA with Large-Scale Multitask Dataset}},
author = {Wang, Yilin and Lei, Peixuan and Song, Jie and Hao, Yuzhe and Chen, Tao and Zhang, Yuxuan and Jia, Lei and Li, Yuanxiang and Wei, Zhongyu},
booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
year = {2025},
pages = {63324--63344},
volume = {267},
url = {https://mlanthology.org/icml/2025/wang2025icml-itformer/}
}