Towards Interpretable and Robust UAV-Based Foundation Model for Endangered Species Monitoring in Complex Ecosystems

Abstract

Recent advancements in foundation models have significantly enhanced the robustness and scalability of traditional methods in a variety of domains. However, their application to specialized ecological environments, where challenges such as data scarcity, camouflage, and environmental noise persist, remains an area requiring further exploration. This study investigates the application of foundation models in species monitoring within complex ecological systems, with a focus on juvenile Tri-spine horseshoe crabs (Tachypleus tridentatus) in Hong Kong’s intertidal zones. Traditional methods for monitoring these endangered species are labor-intensive, imprecise, and disruptive to fragile ecosystems, particularly in environments where juveniles exhibit excellent camouflage and small-scale behavioral markers. Unmanned aerial vehicles (UAVs) offer a promising solution, yet their use in these settings is hampered by tidal movements, water turbidity, and complex backgrounds. To address these challenges, we apply a foundation model, Segment Anything Model 2 (SAM2), to UAV-based high-resolution imagery. By leveraging expert knowledge to design and extract domain-specific features, we fine-tune SAM2 using a few-shot learning strategy, enhancing its ability to accurately segment foraging trails with limited data. The fine-tuned model incorporates interpretable morphological features, such as trail length, width, and continuity, to distinguish biological trails from environmental noise, thereby improving both model robustness and interpretability. This approach demonstrates the efficacy of adapting foundation models for domain-specific challenges, advancing both the interpretability and reliability of ecological monitoring systems. The resulting species distribution maps provide valuable insights into population patterns, offering a scalable and transferable solution for monitoring endangered species in dynamic, data-scarce environments. This research highlights the potential of foundation models to revolutionize ecological monitoring by improving model trustworthiness and extending their application to complex, real-world problems.
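
The abstract names trail length, width, and continuity as interpretable morphological features but does not define how they are computed. The following is a minimal sketch of one possible reading, assuming a binary segmentation mask (such as one a segmentation model might output) given as a list of rows. The function names, the bounding-box length proxy, and the thresholds are all hypothetical choices made for illustration and are not taken from the paper.

```python
from collections import deque
from math import hypot


def connected_components(mask):
    """Return 8-connected foreground regions as lists of (row, col) pixels."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue, pixels = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def trail_features(mask):
    """Compute crude length, width, and continuity proxies (in pixels)."""
    components = connected_components(mask)
    if not components:
        return {"length": 0.0, "width": 0.0, "continuity": 0.0}
    largest = max(components, key=len)
    ys = [p[0] for p in largest]
    xs = [p[1] for p in largest]
    # Assumption: length is approximated by the bounding-box diagonal.
    length = hypot(max(ys) - min(ys) + 1, max(xs) - min(xs) + 1)
    # Assumption: mean width is area divided by the length proxy.
    width = len(largest) / length
    # Assumption: continuity is the share of foreground in the largest region.
    total = sum(len(comp) for comp in components)
    return {"length": length, "width": width, "continuity": len(largest) / total}


def looks_like_trail(features, min_length=8.0, max_width=3.0, min_continuity=0.8):
    """Hypothetical rule: long, thin, mostly unbroken regions count as trails."""
    return (features["length"] >= min_length
            and features["width"] <= max_width
            and features["continuity"] >= min_continuity)
```

Under these assumptions, a long, thin, unbroken region passes the rule, while an isolated speck of noise fails the length check. Every threshold is visible and adjustable, which is the sense in which such features stay interpretable; a real pipeline would likely measure length along a skeletonized centerline rather than a bounding box.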

Cite

Text

Zhang et al. "Towards Interpretable and Robust UAV-Based Foundation Model for Endangered Species Monitoring in Complex Ecosystems." Machine Learning, 2025. doi:10.1007/S10994-025-06787-0

Markdown

[Zhang et al. "Towards Interpretable and Robust UAV-Based Foundation Model for Endangered Species Monitoring in Complex Ecosystems." Machine Learning, 2025.](https://mlanthology.org/mlj/2025/zhang2025mlj-interpretable/) doi:10.1007/S10994-025-06787-0

BibTeX

@article{zhang2025mlj-interpretable,
  title     = {{Towards Interpretable and Robust UAV-Based Foundation Model for Endangered Species Monitoring in Complex Ecosystems}},
  author    = {Zhang, Jihan and Han, Mingqiao and Laurie, K. H. and Zhao, Benyun and Lei, Lei and Chen, Xi and Wan, Hon Chi Judy and Cheung, Siu Gin and Hong, Wenxing and Chen, Ben M.},
  journal   = {Machine Learning},
  year      = {2025},
  pages     = {158},
  doi       = {10.1007/S10994-025-06787-0},
  volume    = {114},
  url       = {https://mlanthology.org/mlj/2025/zhang2025mlj-interpretable/}
}