Wataoka, Koki

3 publications

ICLRW 2024. Initial Response Selection for Prompt Jailbreaking Using Model Steering. Thien Q. Tran, Koki Wataoka, Tsubasa Takahashi.
NeurIPSW 2024. Self-Preference Bias in LLM-as-a-Judge. Koki Wataoka, Tsubasa Takahashi, Ryokan Ri.
NeurIPSW 2023. Verbosity Bias in Preference Labeling by Large Language Models. Keita Saito, Akifumi Wachi, Koki Wataoka, Youhei Akimoto.