Parisien, Christopher

2 publications

NeurIPSW 2024. AEGIS2.0: A Diverse AI Safety Dataset and Risks Taxonomy for Alignment of LLM Guardrails. Shaona Ghosh, Prasoon Varshney, Makesh Narsimhan Sreedhar, Aishwarya Padmakumar, Traian Rebedea, Jibin Rajan Varghese, Christopher Parisien
NeurIPSW 2024. Towards Inference-Time Category-Wise Safety Steering for Large Language Models. Amrita Bhattacharjee, Shaona Ghosh, Traian Rebedea, Christopher Parisien