TextToucher: Fine-Grained Text-to-Touch Generation

Abstract

Tactile sensation plays a crucial role in the development of multi-modal large models and embodied intelligence. To collect tactile data at as low a cost as possible, a series of studies have attempted to generate tactile images through vision-to-touch image translation. However, compared with the text modality, vision-driven tactile generation cannot accurately depict human tactile sensation. In this work, we analyze the characteristics of tactile images in detail at two granularities: object-level (tactile texture, tactile shape) and sensor-level (gel status). We model both granularities of information through text descriptions and propose a fine-grained Text-to-Touch generation method (TextToucher) to generate high-quality tactile samples. Specifically, we introduce a multimodal large language model to build text sentences describing object-level tactile information and employ a set of learnable text prompts to represent sensor-level tactile information. To better guide the tactile generation process with this text information, we fuse the two grains of text information and explore various dual-grain text conditioning methods within the diffusion transformer architecture. Furthermore, we propose a Contrastive Text-Touch Pre-training (CTTP) metric to precisely evaluate the quality of text-driven generated tactile data. Extensive experiments demonstrate the superiority of our TextToucher method.
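
The abstract does not state how the CTTP metric is computed. As a rough illustration only, the sketch below assumes a CLIP-style score: a text encoder and a touch encoder (both hypothetical and not shown here) map each description and each generated tactile image into a shared embedding space, and the score is the mean cosine similarity over matched pairs. The function names `cosine_similarity` and `text_touch_score` are placeholders for this sketch, not names from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def text_touch_score(text_embeddings, touch_embeddings):
    """Mean cosine similarity over matched (text, touch) embedding pairs.

    Assumption: the i-th text embedding describes the i-th tactile sample,
    and both come from encoders trained contrastively into a shared space.
    """
    if len(text_embeddings) != len(touch_embeddings):
        raise ValueError("need exactly one touch embedding per text embedding")
    if not text_embeddings:
        raise ValueError("need at least one pair")
    sims = [
        cosine_similarity(t, g)
        for t, g in zip(text_embeddings, touch_embeddings)
    ]
    return sum(sims) / len(sims)
```

Under this assumption, a higher score would indicate that generated tactile samples sit closer to their text descriptions in the shared space; the actual metric in the paper may differ in encoders, scaling, and aggregation.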

Cite

Text

Tu et al. "TextToucher: Fine-Grained Text-to-Touch Generation." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I7.32802

Markdown

[Tu et al. "TextToucher: Fine-Grained Text-to-Touch Generation." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/tu2025aaai-texttoucher/) doi:10.1609/AAAI.V39I7.32802

BibTeX

@inproceedings{tu2025aaai-texttoucher,
  title     = {{TextToucher: Fine-Grained Text-to-Touch Generation}},
  author    = {Tu, Jiahang and Fu, Hao and Yang, Fengyu and Zhao, Hanbin and Zhang, Chao and Qian, Hui},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {7455-7463},
  doi       = {10.1609/AAAI.V39I7.32802},
  url       = {https://mlanthology.org/aaai/2025/tu2025aaai-texttoucher/}
}