Fish-Vista: A Multi-Purpose Dataset for Understanding & Identification of Traits from Images

Abstract

We introduce Fish-Visual Trait Analysis (Fish-Vista), the first organismal image dataset designed for the analysis of visual traits of aquatic species directly from images using machine learning and computer vision methods. Fish-Vista contains 69,269 annotated images spanning 4,316 fish species, curated and organized to serve three downstream tasks: species classification, trait identification, and trait segmentation. Our work makes two key contributions. First, we provide a fully reproducible pipeline for processing fish images sourced from various museum collections, contributing to the advancement of AI in biodiversity science. We annotate the images with carefully curated labels from biological databases and manual annotations to create an AI-ready dataset of visual traits. Second, our work offers fertile ground for researchers to develop novel methods for a variety of problems in computer vision, such as handling long-tailed distributions, out-of-distribution generalization, learning with weak labels, explainable AI, and segmenting small objects. Dataset and code for Fish-Vista are available at https://github.com/Imageomics/Fish-Vista

Cite

Text

Mehrab et al. "Fish-Vista: A Multi-Purpose Dataset for Understanding & Identification of Traits from Images." Conference on Computer Vision and Pattern Recognition, 2025.

Markdown

[Mehrab et al. "Fish-Vista: A Multi-Purpose Dataset for Understanding & Identification of Traits from Images." Conference on Computer Vision and Pattern Recognition, 2025.](https://mlanthology.org/cvpr/2025/mehrab2025cvpr-fishvista/)

BibTeX

@inproceedings{mehrab2025cvpr-fishvista,
  title     = {{Fish-Vista: A Multi-Purpose Dataset for Understanding \& Identification of Traits from Images}},
  author    = {Mehrab, Kazi Sajeed and Maruf, M. and Daw, Arka and Neog, Abhilash and Manogaran, Harish Babu and Khurana, Mridul and Feng, Zhenyang and Altintas, Bahadir and Bakis, Yasin and Campolongo, Elizabeth G and Thompson, Matthew J and Wang, Xiaojun and Lapp, Hilmar and Berger-Wolf, Tanya and Mabee, Paula and Bart, Henry and Chao, Wei-Lun and Dahdul, Wasila M and Karpatne, Anuj},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2025},
  pages     = {24275--24285},
  url       = {https://mlanthology.org/cvpr/2025/mehrab2025cvpr-fishvista/}
}