Procedure Knowledge Decoupled Distillation Strategy for Procedure Planning in Instructional Videos

Abstract

Procedure planning in instructional videos, which produces a structured and plannable action sequence that facilitates the transition from the start state to the goal state, has achieved significant progress. The dominant single-branch non-autoregressive planning paradigm guides action sequence generation through action labels, but overlooks the limitation imposed by the absence of intermediate visual information. To address this issue, we introduce the procedure knowledge decoupled distillation strategy. This strategy deliberately lets the teacher model see the real visual information between the start and goal states to enhance its action semantic understanding and relationship modeling ability, producing a potential probability distribution that covers the real action class and other action classes that may occur. Accordingly, we introduce a decoupled intermediate information knowledge distillation loss, which comprises single action knowledge distillation and sequence distribution knowledge distillation for the student model. The former improves the student model's precise inference ability for individual actions by transferring knowledge of a single action target category using a binary classification loss. The latter uses an MSE loss to constrain the student model to learn the action sequence probability distribution from the teacher model, thereby enhancing the student model's global planning capability. Extensive experiments on three datasets demonstrate that our strategy can improve the performance of multiple weakly supervised models, achieving promising procedure knowledge modeling ability and plug-and-play flexibility.
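
The two-part loss described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no formulas, so the function names, the use of softmax probabilities, the choice of binary cross-entropy with the teacher's target-class probability as a soft label, and the weights `alpha` and `beta` are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one list of class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def single_action_kd(student_probs, teacher_probs, labels, eps=1e-8):
    """Assumed form of the single action term: a binary cross-entropy on the
    ground-truth class only, with the teacher's probability as a soft target."""
    losses = []
    for s_row, t_row, y in zip(student_probs, teacher_probs, labels):
        p_s = min(max(s_row[y], eps), 1.0 - eps)  # clip to keep log() finite
        p_t = t_row[y]
        losses.append(-(p_t * math.log(p_s) + (1.0 - p_t) * math.log(1.0 - p_s)))
    return sum(losses) / len(losses)


def sequence_distribution_kd(student_probs, teacher_probs):
    """MSE between student and teacher probabilities over all steps and classes."""
    squared = [
        (s - t) ** 2
        for s_row, t_row in zip(student_probs, teacher_probs)
        for s, t in zip(s_row, t_row)
    ]
    return sum(squared) / len(squared)


def decoupled_kd_loss(student_logits, teacher_logits, labels, alpha=1.0, beta=1.0):
    """Weighted sum of the two terms; alpha and beta are hypothetical weights."""
    s = [softmax(row) for row in student_logits]
    t = [softmax(row) for row in teacher_logits]
    return alpha * single_action_kd(s, t, labels) + beta * sequence_distribution_kd(s, t)


# Toy example: a plan of 2 action steps over 3 action classes.
teacher_logits = [[2.0, 0.5, 0.1], [0.2, 1.5, 0.3]]
student_logits = [[0.1, 0.2, 0.3], [0.3, 0.2, 0.1]]
labels = [0, 1]
loss = decoupled_kd_loss(student_logits, teacher_logits, labels)
```

Under these assumptions, the loss is smallest when the student reproduces the teacher's distribution: the MSE term drops to zero and the binary term reaches its minimum, which is the entropy of the teacher's target-class probability.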

Cite

Text

Pan et al. "Procedure Knowledge Decoupled Distillation Strategy for Procedure Planning in Instructional Videos." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I6.32677

Markdown

[Pan et al. "Procedure Knowledge Decoupled Distillation Strategy for Procedure Planning in Instructional Videos." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/pan2025aaai-procedure/) doi:10.1609/AAAI.V39I6.32677

BibTeX

@inproceedings{pan2025aaai-procedure,
  title     = {{Procedure Knowledge Decoupled Distillation Strategy for Procedure Planning in Instructional Videos}},
  author    = {Pan, Xiaotian and Qi, Zhaobo and Sun, Xin and Xu, Yuanrong and Zhang, Weigang},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {6326-6334},
  doi       = {10.1609/AAAI.V39I6.32677},
  url       = {https://mlanthology.org/aaai/2025/pan2025aaai-procedure/}
}