Transformer Designs for In-Context Learning in Foundation Models for Time Series Forecasting with Covariates
Abstract
Recent foundation models (FMs) for time series forecasting (TSF) have shown promising results in zero-shot generalization to new series but are incapable of modeling series-specific dependence on covariates. We identify that historical values in TSF implicitly provide labeled data, which can be leveraged for in-context learning (ICL). While transformers have demonstrated ICL capabilities for regression tasks, their effectiveness as FMs depends on tokenization, attention type, and loss function placement during pre-training. We study three existing tokenization schemes and propose a modified shifted causal attention for faster convergence and effective ICL. This approach combines covariates and the target, enabling linear regression in a single layer. Our theoretical analysis shows that the popular method of patching the input series is suboptimal for ICL on time series with covariates.
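The abstract's framing can be made concrete with a small sketch: past (covariate, target) pairs of a series act as labeled in-context examples, and the forecast step is a query whose covariates are known but whose target is not. The token layout, dimensions, and masking rule below are illustrative assumptions rather than the paper's implementation; in particular, the proposed modified shifted causal attention is not reproduced here, only the baseline in-context prompt construction.

```python
# Illustrative sketch only (hypothetical names; not the paper's code).
# History pairs (x_1, y_1), ..., (x_T, y_T) serve as in-context examples and
# the next-step covariates x_{T+1} form the query whose target is unknown.
import torch

def build_icl_prompt(x_hist, y_hist, x_query):
    """x_hist: (T, d) covariates, y_hist: (T,) targets, x_query: (d,) next-step covariates.

    Returns interleaved tokens of shape (2T + 1, d) and a standard causal mask.
    """
    T, d = x_hist.shape
    y_tok = torch.zeros(T, d)
    y_tok[:, 0] = y_hist                        # pad scalar targets to token width d
    pairs = torch.stack([x_hist, y_tok], dim=1) # (T, 2, d): [x_t, y_t] per step
    tokens = torch.cat([pairs.reshape(2 * T, d), x_query[None, :]], dim=0)
    n = tokens.shape[0]
    causal = torch.tril(torch.ones(n, n, dtype=torch.bool))  # ordinary causal mask
    return tokens, causal

if __name__ == "__main__":
    T, d = 8, 3
    x_hist, y_hist, x_query = torch.randn(T, d), torch.randn(T), torch.randn(d)
    tokens, mask = build_icl_prompt(x_hist, y_hist, x_query)
    # A transformer would read off the forecast at the final (query) position.
    print(tokens.shape, mask.shape)  # torch.Size([17, 3]) torch.Size([17, 17])
```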
Cite
Text
Dange et al. "Transformer Designs for In-Context Learning in Foundation Models for Time Series Forecasting with Covariates." ICML 2024 Workshops: TF2M, 2024.
Markdown
[Dange et al. "Transformer Designs for In-Context Learning in Foundation Models for Time Series Forecasting with Covariates." ICML 2024 Workshops: TF2M, 2024.](https://mlanthology.org/icmlw/2024/dange2024icmlw-transformer/)
BibTeX
@inproceedings{dange2024icmlw-transformer,
title = {{Transformer Designs for In-Context Learning in Foundation Models for Time Series Forecasting with Covariates}},
author = {Dange, Afrin and Raj, Vaibhav and Netrapalli, Praneeth and Sarawagi, Sunita},
booktitle = {ICML 2024 Workshops: TF2M},
year = {2024},
url = {https://mlanthology.org/icmlw/2024/dange2024icmlw-transformer/}
}