Do, Van Dai

1 publication

TMLR 2025. Reasoning Under 1 Billion: Memory-Augmented Reinforcement Learning for Large Language Models. Hung Le, Van Dai Do, Dung Nguyen, Svetha Venkatesh.