Tsolakis, Georgios

1 publication

NeurIPS 2023. WordScape: A Pipeline to Extract Multilingual, Visually Rich Documents with Layout Annotations from Web Crawl Data. Maurice Weber, Carlo Siebenschuh, Rory Butler, Anton Alexandrov, Valdemar Thanner, Georgios Tsolakis, Haris Jabbar, Ian Foster, Bo Li, Rick Stevens, Ce Zhang