X-Capture: An Open-Source Portable Device for Multi-Sensory Learning

Abstract

Understanding objects through multiple sensory modalities is fundamental to human perception, enabling cross-sensory integration and richer comprehension. For AI and robotic systems to replicate this ability, access to diverse, high-quality multi-sensory data is critical. Existing datasets are often limited by their focus on controlled environments, simulated objects, or restricted modality pairings. We introduce X-Capture, an open-source, portable, and cost-effective device for real-world multi-sensory data collection, capable of capturing correlated RGBD images, tactile readings, and impact audio. With a build cost under $1,000, X-Capture democratizes the creation of multi-sensory datasets, requiring only consumer-grade tools for assembly. Using X-Capture, we curate a sample dataset of 3,600 total points on 600 everyday objects from diverse, real-world environments, offering both richness and variety. Our experiments demonstrate the value of both the quantity and the sensory breadth of our data for both pretraining and fine-tuning multi-modal representations for object-centric tasks such as cross-sensory retrieval and reconstruction. X-Capture lays the groundwork for advancing human-like sensory representations in AI, emphasizing scalability, accessibility, and real-world applicability.

Cite

Text

Clarke et al. "X-Capture: An Open-Source Portable Device for Multi-Sensory Learning." International Conference on Computer Vision, 2025.

Markdown

[Clarke et al. "X-Capture: An Open-Source Portable Device for Multi-Sensory Learning." International Conference on Computer Vision, 2025.](https://mlanthology.org/iccv/2025/clarke2025iccv-xcapture/)

BibTeX

@inproceedings{clarke2025iccv-xcapture,
  title     = {{X-Capture: An Open-Source Portable Device for Multi-Sensory Learning}},
  author    = {Clarke, Samuel and Wistreich, Suzannah and Ze, Yanjie and Wu, Jiajun},
  booktitle = {International Conference on Computer Vision},
  year      = {2025},
  pages     = {6436-6446},
  url       = {https://mlanthology.org/iccv/2025/clarke2025iccv-xcapture/}
}