ML Anthology
Mu, Tong
6 publications
ICMLW
2024
Rule Based Rewards for Fine-Grained LLM Safety
Tong Mu, Alec Helyar, Johannes Heidecke, Joshua Achiam, Andrea Vallone, Ian D Kivlichan, Molly Lin, Alex Beutel, John Schulman, Lilian Weng
NeurIPS
2024
Rule Based Rewards for Language Model Safety
Tong Mu, Alec Helyar, Johannes Heidecke, Joshua Achiam, Andrea Vallone, Ian Kivlichan, Molly Lin, Alex Beutel, John Schulman, Lilian Weng
ICML
2023
Simple Embodied Language Learning as a Byproduct of Meta-Reinforcement Learning
Evan Zheran Liu, Sahaana Suri, Tong Mu, Allan Zhou, Chelsea Finn
AAAI
2022
Constraint Sampling Reinforcement Learning: Incorporating Expertise for Faster Learning
Tong Mu, Georgios Theocharous, David Arbour, Emma Brunskill
NeurIPS
2022
Factored DRO: Factored Distributionally Robust Policies for Contextual Bandits
Tong Mu, Yash Chandak, Tatsunori B Hashimoto, Emma Brunskill
TMLR
2022
Modeling Bounded Rationality in Multi-Agent Simulations Using Rationally Inattentive Reinforcement Learning
Tong Mu, Stephan Zheng, Alexander R Trott