VELMA: Verbalization Embodiment of LLM Agents for Vision and Language Navigation in Street View

Cite

Text

Schumann et al. "VELMA: Verbalization Embodiment of LLM Agents for Vision and Language Navigation in Street View." AAAI Conference on Artificial Intelligence, 2024. doi:10.1609/AAAI.V38I17.29858

Markdown

[Schumann et al. "VELMA: Verbalization Embodiment of LLM Agents for Vision and Language Navigation in Street View." AAAI Conference on Artificial Intelligence, 2024.](https://mlanthology.org/aaai/2024/schumann2024aaai-velma/) doi:10.1609/AAAI.V38I17.29858

BibTeX

@inproceedings{schumann2024aaai-velma,
  title     = {{VELMA: Verbalization Embodiment of LLM Agents for Vision and Language Navigation in Street View}},
  author    = {Schumann, Raphael and Zhu, Wanrong and Feng, Weixi and Fu, Tsu-Jui and Riezler, Stefan and Wang, William Yang},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2024},
  pages     = {18924--18933},
  doi       = {10.1609/AAAI.V38I17.29858},
  url       = {https://mlanthology.org/aaai/2024/schumann2024aaai-velma/}
}