Karatzas, Dimosthenis

31 publications

ICLR 2025 DocMIA: Document-Level Membership Inference Attacks Against DocVQA Models Khanh Nguyen, Raouf Kerkouche, Mario Fritz, Dimosthenis Karatzas
ICML 2025 DocVXQA: Context-Aware Visual Explanations for Document Question Answering Mohamed Ali Souibgui, Changkyu Choi, Andrey Barsky, Kangsoo Jung, Ernest Valveny, Dimosthenis Karatzas
TMLR 2025 NeurIPS 2023 Competition: Privacy Preserving Federated Learning Document VQA Marlon Tobaben, Mohamed Ali Souibgui, Rubèn Tito, Khanh Nguyen, Raouf Kerkouche, Kangsoo Jung, Joonas Jälkö, Lei Kang, Andrey Barsky, Vincent Poulain d'Andecy, Aurélie Joseph, Aashiq Muhamed, Kevin Kuo, Virginia Smith, Yusuke Yamasaki, Takumi Fukami, Kenta Niwa, Iifan Tyou, Hiro Ishii, Rio Yokota, Ragul N, Rintu Kutum, Josep Llados, Ernest Valveny, Antti Honkela, Mario Fritz, Dimosthenis Karatzas
NeurIPS 2024 CoMix: A Comprehensive Benchmark for Multi-Task Comic Understanding Emanuele Vivoli, Marco Bertini, Dimosthenis Karatzas
ECCVW 2024 ComiCap: A VLMs Pipeline for Dense Captioning of Comic Panels Emanuele Vivoli, Niccolò Biondi, Marco Bertini, Dimosthenis Karatzas
WACV 2024 STEP - Towards Structured Scene-Text Spotting Sergi Garcia-Bordils, Dimosthenis Karatzas, Marçal Rusiñol
AAAI 2023 Show, Interpret and Tell: Entity-Aware Contextualised Image Captioning in Wikipedia Khanh Nguyen, Ali Furkan Biten, Andrés Mafla, Lluís Gómez, Dimosthenis Karatzas
AAAI 2023 Text-DIAE: A Self-Supervised Degradation Invariant Autoencoder for Text Recognition and Document Enhancement Mohamed Ali Souibgui, Sanket Biswas, Andrés Mafla, Ali Furkan Biten, Alicia Fornés, Yousri Kessentini, Josep Lladós, Lluís Gómez, Dimosthenis Karatzas
ICCVW 2023 Understanding Video Scenes Through Text: Insights from Text-Based Video Question Answering Soumya Jahagirdar, Minesh Mathew, Dimosthenis Karatzas, C. V. Jawahar
WACV 2023 Watching the News: Towards VideoQA Models That Can Read Soumya Jahagirdar, Minesh Mathew, Dimosthenis Karatzas, C. V. Jawahar
WACV 2022 InfographicVQA Minesh Mathew, Viraj Bagal, Rubèn Tito, Dimosthenis Karatzas, Ernest Valveny, C.V. Jawahar
WACV 2022 Is an Image Worth Five Sentences? A New Look into Semantics for Image-Text Matching Ali Furkan Biten, Andrés Mafla, Lluís Gómez, Dimosthenis Karatzas
WACV 2022 Let There Be a Clock on the Beach: Reducing Object Hallucination in Image Captioning Ali Furkan Biten, Lluís Gómez, Dimosthenis Karatzas
ECCVW 2022 MUST-VQA: MUltilingual Scene-Text VQA Emanuele Vivoli, Ali Furkan Biten, Andrés Mafla, Dimosthenis Karatzas, Lluís Gómez
ECCVW 2022 OCR-IDL: OCR Annotations for Industry Document Library Dataset Ali Furkan Biten, Rubèn Tito, Lluís Gómez, Ernest Valveny, Dimosthenis Karatzas
WACV 2022 One-Shot Compositional Data Generation for Low Resource Handwritten Text Recognition Mohamed Ali Souibgui, Ali Furkan Biten, Sounak Dey, Alicia Fornés, Yousri Kessentini, Lluís Gómez, Dimosthenis Karatzas, Josep Lladós
ECCVW 2022 Out-of-Vocabulary Challenge Report Sergi Garcia-Bordils, Andrés Mafla, Ali Furkan Biten, Oren Nuriel, Aviad Aberdam, Shai Mazor, Ron Litman, Dimosthenis Karatzas
WACV 2021 DocVQA: A Dataset for VQA on Document Images Minesh Mathew, Dimosthenis Karatzas, C.V. Jawahar
WACV 2021 Multi-Modal Reasoning Graph for Scene-Text Based Fine-Grained Image Classification and Retrieval Andres Mafla, Sounak Dey, Ali Furkan Biten, Lluis Gomez, Dimosthenis Karatzas
WACV 2021 StacMR: Scene-Text Aware Cross-Modal Retrieval Andres Mafla, Rafael S. Rezende, Lluis Gomez, Diane Larlus, Dimosthenis Karatzas
WACV 2020 Exploring Hate Speech Detection in Multimodal Publications Raul Gomez, Jaume Gibert, Lluis Gomez, Dimosthenis Karatzas
WACV 2020 Fine-Grained Image Classification and Retrieval by Combining Visual and Locally Pooled Textual Features Andres Mafla, Sounak Dey, Ali Furkan Biten, Lluis Gomez, Dimosthenis Karatzas
ECCV 2020 Location Sensitive Image Retrieval and Tagging Raul Gomez, Jaume Gibert, Lluis Gomez, Dimosthenis Karatzas
ECCVW 2018 Learning from #Barcelona Instagram Data What Locals and Tourists Post About Its Neighbourhoods Raul Gomez, Lluís Gómez, Jaume Gibert, Dimosthenis Karatzas
ECCVW 2018 Learning to Learn from Web Data Through Deep Semantic Embeddings Raul Gomez, Lluís Gómez, Jaume Gibert, Dimosthenis Karatzas
ECCV 2018 Single Shot Scene Text Retrieval Lluis Gomez, Andres Mafla, Marcal Rusinol, Dimosthenis Karatzas
CVPRW 2018 Word Spotting in Scene Images Based on Character Recognition Dena Bazazian, Dimosthenis Karatzas, Andrew D. Bagdanov
ICCVW 2017 Reading Text in the Wild from Compressed Images Leonardo Galteri, Dena Bazazian, Lorenzo Seidenari, Marco Bertini, Andrew D. Bagdanov, Anguelos Nicolaou, Dimosthenis Karatzas, Alberto Del Bimbo
CVPR 2017 Self-Supervised Learning of Visual Features Through Embedding Images into Text Topic Spaces Lluis Gomez, Yash Patel, Marcal Rusinol, Dimosthenis Karatzas, C. V. Jawahar
ECCV 2016 Dynamic Lexicon Generation for Natural Scene Images Yash Patel, Lluís Gómez i Bigorda, Marçal Rusiñol, Dimosthenis Karatzas
ECCVW 2016 Dynamic Lexicon Generation for Natural Scene Images Yash Patel, Lluís Gómez i Bigorda, Marçal Rusiñol, Dimosthenis Karatzas