Han, Xu
49 publications
NeurIPS
2025
A*-Thought: Efficient Reasoning via Bidirectional Compression for Low-Resource Settings
ICLR
2025
Divergence-Enhanced Knowledge-Guided Context Optimization for Visual-Language Prompt Tuning
NeurIPS
2024
Delta-CoMe: Training-Free Delta-Compression with Mixed-Precision for Large Language Models
NeurIPS
2024
InfLLM: Training-Free Long-Context Extrapolation for LLMs with an Efficient Context Memory
ICMLW
2024
InfLLM: Training-Free Long-Context Extrapolation for LLMs with an Efficient Context Memory
NeurIPS
2023
H3T: Efficient Integration of Memory Optimization and Parallelism for Large-Scale Transformer Training