An Ethical Dataset from Real-World Interactions Between Users and Large Language Models

Masahiro Kaneko, Danushka Bollegala, Timothy Baldwin

IJCAI 2025 pp. 9737-9745

doi:10.24963/IJCAI.2025/1082

Abstract

Recent studies have demonstrated that Large Language Models (LLMs) have ethics-related problems such as social biases, a lack of moral reasoning, and the generation of offensive content. Existing evaluation metrics and mitigation methods for these ethical challenges rely on datasets that were created intentionally, by instructing humans to write instances containing ethical problems. As a result, the data does not sufficiently cover the prompts that users actually provide when using LLM services in everyday contexts, nor the outputs that LLMs generate in response. Unethical instances intentionally created by humans may show different tendencies from actual user interactions with LLM services, which could make evaluation less comprehensive. To investigate the difference, we create the Eagle datasets, extracted from actual interactions between ChatGPT and users that exhibit social biases, opinion biases, toxicity, and immorality. Our experiments show that Eagle captures complementary aspects not covered by existing datasets proposed for evaluation and mitigation. We argue that using both the existing and the proposed datasets leads to a more comprehensive assessment of LLM ethics.

Cite

Text

Kaneko et al. "An Ethical Dataset from Real-World Interactions Between Users and Large Language Models." International Joint Conference on Artificial Intelligence, 2025. doi:10.24963/IJCAI.2025/1082

Markdown

[Kaneko et al. "An Ethical Dataset from Real-World Interactions Between Users and Large Language Models." International Joint Conference on Artificial Intelligence, 2025.](https://mlanthology.org/ijcai/2025/kaneko2025ijcai-ethical/) doi:10.24963/IJCAI.2025/1082

BibTeX

@inproceedings{kaneko2025ijcai-ethical,
  title     = {{An Ethical Dataset from Real-World Interactions Between Users and Large Language Models}},
  author    = {Kaneko, Masahiro and Bollegala, Danushka and Baldwin, Timothy},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {9737--9745},
  doi       = {10.24963/IJCAI.2025/1082},
  url       = {https://mlanthology.org/ijcai/2025/kaneko2025ijcai-ethical/}
}