Language-Enhanced RNR-mAP: Querying Renderable Neural Radiance Field Maps with Natural Language
Abstract
We present Le-RNR-Map, a Language-enhanced Renderable Neural Radiance map for Visual Navigation with natural language query prompts. The recently proposed RNR-Map employs a grid structure comprising latent codes positioned at each pixel. These latent codes, which are derived from image observations, enable: i) image rendering given a camera pose, since they can be converted to a Neural Radiance Field; ii) image navigation and localization with astonishing accuracy. On top of this, we enhance RNR-Map with CLIP-based embedding latent codes, allowing natural language search without additional label data. We evaluate the effectiveness of this map in single- and multi-object searches. We also investigate its compatibility with a Large Language Model as an "affordance query resolver". Code and videos are available at https://intelligolabs.github.io/Le-RNR-Map/.
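The core idea of the language query step can be illustrated with a minimal sketch: given a grid of CLIP-aligned embeddings (one per map cell), a natural-language prompt is encoded with CLIP's text encoder and compared against every cell by cosine similarity, yielding a heatmap over the map. The snippet below is an assumption-laden illustration, not the authors' implementation: the `cell_embeddings` grid is a hypothetical placeholder filled with random vectors, and the grid size, embedding dimension, and `query_map` helper are invented for demonstration.

```python
import torch
import clip  # OpenAI CLIP: pip install git+https://github.com/openai/CLIP.git

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("ViT-B/32", device=device)

# Hypothetical map of CLIP-aligned cell embeddings with shape (H, W, D),
# assumed to have been built alongside the RNR-Map latent grid.
# Random placeholders stand in for real embeddings here.
H, W, D = 128, 128, 512
cell_embeddings = torch.randn(H, W, D, device=device)
cell_embeddings = cell_embeddings / cell_embeddings.norm(dim=-1, keepdim=True)

@torch.no_grad()
def query_map(prompt: str) -> torch.Tensor:
    """Return an (H, W) similarity heatmap for a natural-language prompt."""
    tokens = clip.tokenize([prompt]).to(device)
    text_emb = model.encode_text(tokens).float()
    text_emb = text_emb / text_emb.norm(dim=-1, keepdim=True)  # (1, D)
    heatmap = (cell_embeddings @ text_emb.T).squeeze(-1)       # (H, W)
    return heatmap

heatmap = query_map("a potted plant next to the sofa")
row, col = divmod(int(heatmap.argmax()), W)  # most similar map cell for the query
```

In this sketch, the cell with the highest similarity would serve as the navigation goal for the queried object; the paper's pipeline additionally supports multi-object queries and LLM-resolved affordance prompts.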
Cite
Text
Taioli et al. "Language-Enhanced RNR-mAP: Querying Renderable Neural Radiance Field Maps with Natural Language." IEEE/CVF International Conference on Computer Vision Workshops, 2023. doi:10.1109/ICCVW60793.2023.00504
Markdown
[Taioli et al. "Language-Enhanced RNR-mAP: Querying Renderable Neural Radiance Field Maps with Natural Language." IEEE/CVF International Conference on Computer Vision Workshops, 2023.](https://mlanthology.org/iccvw/2023/taioli2023iccvw-languageenhanced/) doi:10.1109/ICCVW60793.2023.00504
BibTeX
@inproceedings{taioli2023iccvw-languageenhanced,
title = {{Language-Enhanced RNR-mAP: Querying Renderable Neural Radiance Field Maps with Natural Language}},
author = {Taioli, Francesco and Cunico, Federico and Girella, Federico and Bologna, Riccardo and Farinelli, Alessandro and Cristani, Marco},
booktitle = {IEEE/CVF International Conference on Computer Vision Workshops},
year = {2023},
pages = {4671--4676},
doi = {10.1109/ICCVW60793.2023.00504},
url = {https://mlanthology.org/iccvw/2023/taioli2023iccvw-languageenhanced/}
}