Bianchi, Federico

10 publications

ICLR 2025 | H4rm3l: A Language for Composable Jailbreak Attack Synthesis | Moussa Koulako Bala Doumbouya, Ananjan Nandi, Gabriel Poesia, Davide Ghilardi, Anna Goldie, Federico Bianchi, Dan Jurafsky, Christopher D Manning
JAIR 2025 | Scaling Safe Policy Improvement: Monte Carlo Tree Search and Policy Iteration Strategies | Federico Bianchi, Alberto Castellini, Edoardo Zorzi, Thiago D. Simão, Matthijs T. J. Spaan, Alessandro Farinelli
NeurIPSW 2024 | AI-Generated Content and Public Persuasion: The Limited Effect of AI Authorship Labels | Isabel O. Gallegos, Chen Shani, Weiyan Shi, Federico Bianchi, Robb Willer, Dan Jurafsky
ICML 2024 | How Well Can LLMs Negotiate? NegotiationArena Platform and Analysis | Federico Bianchi, Patrick John Chia, Mert Yuksekgonul, Jacopo Tagliabue, Dan Jurafsky, James Zou
ICLR 2024 | Safety-Tuned LLaMAs: Lessons from Improving the Safety of Large Language Models That Follow Instructions | Federico Bianchi, Mirac Suzgun, Giuseppe Attanasio, Paul Rottger, Dan Jurafsky, Tatsunori Hashimoto, James Zou
ICML 2024 | Scalable Safe Policy Improvement for Factored Multi-Agent MDPs | Federico Bianchi, Edoardo Zorzi, Alberto Castellini, Thiago D. Simão, Matthijs T. J. Spaan, Alessandro Farinelli
ICML 2023 | Scalable Safe Policy Improvement via Monte Carlo Tree Search | Alberto Castellini, Federico Bianchi, Edoardo Zorzi, Thiago D. Simão, Alessandro Farinelli, Matthijs T. J. Spaan
JAIR 2023 | Viewpoint: Artificial Intelligence Accidents Waiting to Happen? | Federico Bianchi, Amanda Cercas Curry, Dirk Hovy
ICLR 2023 | When and Why Vision-Language Models Behave like Bags-of-Words, and What to Do About It? | Mert Yuksekgonul, Federico Bianchi, Pratyusha Kalluri, Dan Jurafsky, James Zou
AAAI 2019 | Training Temporal Word Embeddings with a Compass | Valerio Di Carlo, Federico Bianchi, Matteo Palmonari