Bereska, Leonard

3 publications

TMLR 2025. Superposition as Lossy Compression — Measure with Sparse Autoencoders and Connect to Adversarial Vulnerability. Leonard Bereska, Zoe Tzifa-Kratira, Reza Samavi, Stratis Gavves.
TMLR 2024. Mechanistic Interpretability for AI Safety - A Review. Leonard Bereska, Stratis Gavves.
CoLLAs 2022. Continual Learning of Dynamical Systems with Competitive Federated Reservoir Computing. Leonard Bereska, Efstratios Gavves.