Ten Words Only Still Help: Improving Black-Box AI-Generated Text Detection via Proxy-Guided Efficient Re-Sampling
Abstract
Vision-language models (VLMs) have demonstrated strong zero-shot inference capabilities but may exhibit stereotypical biases toward certain demographic groups. Consequently, downstream tasks leveraging these models may yield unbalanced performance across different target social groups, potentially reinforcing harmful stereotypes. Mitigating such biases is critical for ensuring fairness in practical applications. Existing debiasing approaches typically rely on curated face-centric datasets for fine-tuning or retraining, risking overfitting and limiting generalisability. To address this issue, we propose a novel framework, CABIN (Causal Adjustment Based INtervention). It leverages a causal framework to identify sensitive attributes in images as confounding factors. Employing a learned mapper, which is trained on general large-scale image-text pairs rather than face-centric datasets, CABIN may use text to adjust sensitive attributes in the image embedding, ensuring independence between these sensitive attributes and image embeddings. This independence enables a backdoor adjustment for unbiased inference without the drawbacks of additional fine-tuning or retraining on narrowly tailored datasets. Through comprehensive experiments and analyses, we demonstrate that CABIN effectively mitigates biases and improves fairness metrics while preserving the zero-shot strengths of VLMs. The code is available at: https://github.com/ipangbo/causal-debias
Cite
Text
Shi et al. "Ten Words Only Still Help: Improving Black-Box AI-Generated Text Detection via Proxy-Guided Efficient Re-Sampling." International Joint Conference on Artificial Intelligence, 2024. doi:10.24963/ijcai.2024/55
Markdown
[Shi et al. "Ten Words Only Still Help: Improving Black-Box AI-Generated Text Detection via Proxy-Guided Efficient Re-Sampling." International Joint Conference on Artificial Intelligence, 2024.](https://mlanthology.org/ijcai/2024/shi2024ijcai-ten/) doi:10.24963/ijcai.2024/55
BibTeX
@inproceedings{shi2024ijcai-ten,
title = {{Ten Words Only Still Help: Improving Black-Box AI-Generated Text Detection via Proxy-Guided Efficient Re-Sampling}},
author = {Shi, Yuhui and Sheng, Qiang and Cao, Juan and Mi, Hao and Hu, Beizhe and Wang, Danding},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2024},
pages = {494--502},
doi = {10.24963/ijcai.2024/55},
url = {https://mlanthology.org/ijcai/2024/shi2024ijcai-ten/}
}