ML Anthology
Wataoka, Koki
3 publications
ICLRW 2024
Initial Response Selection for Prompt Jailbreaking Using Model Steering
Thien Q. Tran, Koki Wataoka, Tsubasa Takahashi
NeurIPSW 2024
Self-Preference Bias in LLM-as-a-Judge
Koki Wataoka, Tsubasa Takahashi, Ryokan Ri
NeurIPSW 2023
Verbosity Bias in Preference Labeling by Large Language Models
Keita Saito, Akifumi Wachi, Koki Wataoka, Youhei Akimoto