Kawaguchi, Kenji
87 publications
ICLR
2025
Learning Diverse Attacks on Large Language Models for Robust Red-Teaming and Safety Tuning
NeurIPS
2024
Accelerating Greedy Coordinate Gradient and General Prompt Optimization via Probe Sampling
ECCV
2024
Enhancing Semantic Fidelity in Text-to-Image Synthesis: Attention Regulation in Diffusion Models
ICML
2024
Exact Conversion of In-Context Learning to Model Weights in Linearized-Attention Transformers
NeurIPSW
2024
Learning Diverse Attacks on Large Language Models for Robust Red-Teaming and Safety Tuning
NeurIPS
2024
Stochastic Taylor Derivative Estimator: Efficient Amortization for Arbitrary Differential Operators
NeurIPS
2023
Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-Intensive Tasks
NeurIPS
2022
Discrete Compositional Representations as an Abstraction for Goal Conditioned Reinforcement Learning
ICML
2021
Optimization of Graph Neural Networks: Implicit Acceleration by Skip Connections and More Depth
NeurIPS
2021
Tailoring: Encoding Inductive Biases by Optimizing Unsupervised Objectives at Prediction Time