Wang, Tian
11 publications
ICLR
2026
Detecting Data Contamination from Reinforcement Learning Post-Training for Large Language Models
ICLR
2026
SFT Doesn’t Always Hurt General Capabilities: Revisiting Domain-Specific Fine-Tuning in LLMs
ICLR
2026
Shop-R1: Rewarding LLMs to Simulate Human Behavior in Online Shopping via Reinforcement Learning
CVPR
2023
Few-Shot Learning with Visual Distribution Calibration and Cross-Modal Distribution Alignment