PhysLLM: Harnessing Large Language Models for Cross-Modal Remote Physiological Sensing

Xie, Yiping; Zhao, Bo; Dai, Mingtong; Zhou, Jian-Ping; Sun, Yue; Tan, Tao; Xie, Weicheng; Shen, Linlin; Yu, Zitong

ICLR 2026

Abstract

Remote photoplethysmography (rPPG) enables non-contact physiological measurement but remains highly susceptible to illumination changes and motion artifacts, and is constrained by limited temporal modeling. Large Language Models (LLMs) excel at capturing long-range dependencies, offering a potential solution, but their text-centric design makes them struggle with the continuous, noise-sensitive nature of rPPG signals. To bridge this gap, we introduce PhysLLM, a collaborative optimization framework that synergizes LLMs with domain-specific rPPG components. Specifically, a Text Prototype Guidance (TPG) strategy is proposed to establish cross-modal alignment by projecting hemodynamic features into an LLM-interpretable semantic space, bridging the representational gap between physiological signals and linguistic tokens. In addition, a novel Dual-Domain Stationary (DDS) algorithm is proposed to resolve signal instability through adaptive time-frequency domain feature re-weighting. Finally, rPPG task-specific cues inject physiological priors through physiological statistics, environmental contextual answering, and task descriptions; this leverages cross-modal learning to integrate visual and textual information and enables dynamic adaptation to challenging scenarios such as variable illumination and subject movement. Evaluated on four benchmark datasets, PhysLLM achieves state-of-the-art accuracy and robustness, demonstrating superior generalization across lighting variations and motion scenarios.

Cite

Text

Xie et al. "PhysLLM: Harnessing Large Language Models for Cross-Modal Remote Physiological Sensing." International Conference on Learning Representations, 2026.

Markdown

[Xie et al. "PhysLLM: Harnessing Large Language Models for Cross-Modal Remote Physiological Sensing." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/xie2026iclr-physllm/)

BibTeX

@inproceedings{xie2026iclr-physllm,
  title     = {{PhysLLM: Harnessing Large Language Models for Cross-Modal Remote Physiological Sensing}},
  author    = {Xie, Yiping and Zhao, Bo and Dai, Mingtong and Zhou, Jian-Ping and Sun, Yue and Tan, Tao and Xie, Weicheng and Shen, Linlin and Yu, Zitong},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/xie2026iclr-physllm/}
}