Boreiko, Valentyn

8 publications

ICML 2025 An Interpretable N-Gram Perplexity Threat Model for Large Language Model Jailbreaks Valentyn Boreiko, Alexander Panfilov, Vaclav Voracek, Matthias Hein, Jonas Geiping
ICML 2025 How Much Can We Forget About Data Contamination? Sebastian Bordt, Suraj Srinivas, Valentyn Boreiko, Ulrike von Luxburg
NeurIPSW 2024 A Realistic Threat Model for Large Language Model Jailbreaks Valentyn Boreiko, Alexander Panfilov, Vaclav Voracek, Matthias Hein, Jonas Geiping
ICCV 2023 Identification of Systematic Errors of Image Classifiers on Rare Subgroups Jan Hendrik Metzen, Robin Hutmacher, N. Grace Hua, Valentyn Boreiko, Dan Zhang
ICCVW 2023 Identifying Systematic Errors in Object Detectors with the SCROD Pipeline Valentyn Boreiko, Matthias Hein, Jan Hendrik Metzen
ICCV 2023 Spurious Features Everywhere - Large-Scale Detection of Harmful Spurious Features in ImageNet Yannic Neuhaus, Maximilian Augustin, Valentyn Boreiko, Matthias Hein
ICMLW 2022 Classifiers Should Do Well Even on Their Worst Classes Julian Bitterwolf, Alexander Meinke, Valentyn Boreiko, Matthias Hein
NeurIPS 2022 Diffusion Visual Counterfactual Explanations Maximilian Augustin, Valentyn Boreiko, Francesco Croce, Matthias Hein