Olvera, Michel

1 publication

NeurIPS 2024. An Eye for an Ear: Zero-Shot Audio Description Leveraging an Image Captioner with Audio-Visual Token Distribution Matching. Hugo Malard, Michel Olvera, Stéphane Lathuiliere, Slim Essid