Mikulik, Vladimir

3 publications

ICLR 2026
Constitutional Classifiers++: Efficient Production-Grade Defenses Against Universal Jailbreaks
Hoagy Cunningham, Jerry Wei, Zihan Wang, Andrew Persic, Alwin Peng, Jordan Abderrachid, Raj Agarwal, Bobby Chen, Andy Dau, Alek Dimitriev, Logan Howard, Yijin Hua, Rob Gilson, Mu Lin, Christopher Liu, Vladimir Mikulik, Rohit Mittapalli, Clare O'Hara, Jin Pan, Nikhil Saxena, Alex Silverstein, Yue Song, Giulio Zhou, Jan Leike, Jared Kaplan, Ethan Perez, Mrinank Sharma

NeurIPS 2023
Tracr: Compiled Transformers as a Laboratory for Interpretability
David Lindner, Janos Kramar, Sebastian Farquhar, Matthew Rahtz, Tom McGrath, Vladimir Mikulik

NeurIPS 2020
Meta-Trained Agents Implement Bayes-Optimal Agents
Vladimir Mikulik, Grégoire Delétang, Tom McGrath, Tim Genewein, Miljan Martic, Shane Legg, Pedro Ortega