LiLMaps: Learnable Implicit Language Maps

Abstract

A current trend in robotics is to employ large language models (LLMs) to execute commands that are not predefined and to enable natural human-robot interaction. For this, it is useful to have an environment map together with a language representation of it that can be further utilized by LLMs. Such a comprehensive scene representation enables numerous ways of interacting with the map for autonomously operating robots. In this work, we present an approach that enhances incremental implicit mapping through the integration of visual-language features. Specifically, we (i) propose a decoder optimization technique for implicit language maps that can be used when new objects appear in the scene, and (ii) address the problem of inconsistent visual-language predictions between different viewing positions. Our experiments demonstrate the effectiveness of LiLMaps and solid improvements in performance.
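To make the idea of an implicit language map more concrete, the sketch below shows one common way such a map can be queried: latent features stored in a spatial grid are interpolated at 3D positions and decoded by a small MLP into language embeddings, which can then be matched against a text query embedding. This is only an illustrative sketch, not the LiLMaps implementation; the grid, the decoder architecture, and the placeholder `text_embedding` are hypothetical stand-ins for the paper's visual-language features (e.g., CLIP-style embeddings).

```python
# Illustrative sketch of querying an implicit language map (NOT the LiLMaps code).
import torch
import torch.nn as nn
import torch.nn.functional as F


class LanguageDecoder(nn.Module):
    """Maps a per-point latent feature to a language embedding."""

    def __init__(self, latent_dim: int = 32, embed_dim: int = 512):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(latent_dim, 128),
            nn.ReLU(),
            nn.Linear(128, embed_dim),
        )

    def forward(self, latent: torch.Tensor) -> torch.Tensor:
        # Normalize so outputs lie on the unit sphere, like CLIP embeddings.
        return F.normalize(self.mlp(latent), dim=-1)


# Hypothetical map state: latent features stored in a dense 3D grid.
grid = torch.randn(1, 32, 16, 16, 16)            # (batch, latent_dim, D, H, W)
decoder = LanguageDecoder()

# Query the map at two 3D positions (coordinates normalized to [-1, 1]).
points = torch.tensor([[[[[0.1, -0.2, 0.3],
                          [0.5, 0.5, -0.4]]]]])   # (1, 1, 1, 2, 3)
latents = F.grid_sample(grid, points, align_corners=True)   # (1, 32, 1, 1, 2)
latents = latents.squeeze(2).squeeze(2).permute(0, 2, 1)    # (1, 2, 32)

# Compare decoded language embeddings with a text query embedding
# (a random placeholder here; in practice it would come from a text encoder).
map_embeddings = decoder(latents)                            # (1, 2, 512)
text_embedding = F.normalize(torch.randn(1, 1, 512), dim=-1)
similarity = (map_embeddings * text_embedding).sum(dim=-1)   # cosine similarity
print(similarity)
```

Under this kind of architecture, adapting the decoder when previously unseen objects (and hence new language embeddings) appear, and keeping the decoded embeddings consistent across viewpoints, are exactly the two problems the abstract refers to.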

Cite

Text

Kruzhkov and Behnke. "LiLMaps: Learnable Implicit Language Maps." Winter Conference on Applications of Computer Vision, 2025.

Markdown

[Kruzhkov and Behnke. "LiLMaps: Learnable Implicit Language Maps." Winter Conference on Applications of Computer Vision, 2025.](https://mlanthology.org/wacv/2025/kruzhkov2025wacv-lilmaps/)

BibTeX

@inproceedings{kruzhkov2025wacv-lilmaps,
  title     = {{LiLMaps: Learnable Implicit Language Maps}},
  author    = {Kruzhkov, Evgenii and Behnke, Sven},
  booktitle = {Winter Conference on Applications of Computer Vision},
  year      = {2025},
  pages     = {7700--7709},
  url       = {https://mlanthology.org/wacv/2025/kruzhkov2025wacv-lilmaps/}
}