Yeung-Levy, Serena

18 publications

CVPR 2025 Apollo: An Exploration of Video Understanding in Large Multimodal Models Orr Zohar, Xiaohan Wang, Yann Dubois, Nikhil Mehta, Tong Xiao, Philippe Hansen-Estruch, Licheng Yu, Xiaofang Wang, Felix Juefei-Xu, Ning Zhang, Serena Yeung-Levy, Xide Xia

CVPR 2025 Automated Generation of Challenging Multiple-Choice Questions for Vision Language Model Evaluation Yuhui Zhang, Yuchang Su, Yiming Liu, Xiaohan Wang, James Burgess, Elaine Sui, Chenyu Wang, Josiah Aklilu, Alejandro Lozano, Anjiang Wei, Ludwig Schmidt, Serena Yeung-Levy

CVPR 2025 BIOMEDICA: An Open Biomedical Image-Caption Archive, Dataset, and Vision-Language Models Derived from Scientific Literature Alejandro Lozano, Min Woo Sun, James Burgess, Liangyu Chen, Jeffrey J. Nirschl, Jeffrey Gu, Ivan Lopez, Josiah Aklilu, Anita Rau, Austin Wolfgang Katzer, Yuhui Zhang, Collin Chiu, Xiaohan Wang, Alfred Seunghoon Song, Robert Tibshirani, Serena Yeung-Levy

ICML 2025 CellFlux: Simulating Cellular Morphology Changes via Flow Matching Yuhui Zhang, Yuchang Su, Chenyu Wang, Tianhong Li, Zoe Wefers, Jeffrey J Nirschl, James Burgess, Daisy Ding, Alejandro Lozano, Emma Lundberg, Serena Yeung-Levy

ICCV 2025 Feather the Throttle: Revisiting Visual Token Pruning for Vision-Language Model Acceleration Mark Endo, Xiaohan Wang, Serena Yeung-Levy

ICLR 2025 Foundation Models Secretly Understand Neural Network Weights: Enhancing Hypernetwork Architectures with Foundation Models Jeffrey Gu, Serena Yeung-Levy

WACV 2025 Just Shift It: Test-Time Prototype Shifting for Zero-Shot Generalization with Vision-Language Models Elaine Sui, Xiaohan Wang, Serena Yeung-Levy

CVPR 2025 MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research James Burgess, Jeffrey J Nirschl, Laura Bravo-Sánchez, Alejandro Lozano, Sanket Rajan Gupte, Jesus G. Galaz-Montoya, Yuhui Zhang, Yuchang Su, Disha Bhowmik, Zachary Coman, Sarina M Hasan, Alexandra Johannesson, William D. Leineweber, Malvika G Nair, Ridhi Yarlagadda, Connor Zuraski, Wah Chiu, Sarah Cohen, Jan N. Hansen, Manuel D Leonetti, Chad Liu, Emma Lundberg, Serena Yeung-Levy

MLHC 2025 The Impact of Image Resolution on Biomedical Multimodal Large Language Models Liangyu Chen, James Burgess, Jeffrey J Nirschl, Orr Zohar, Serena Yeung-Levy

ICLR 2025 Video Action Differencing James Burgess, Xiaohan Wang, Yuhui Zhang, Anita Rau, Alejandro Lozano, Lisa Dunlap, Trevor Darrell, Serena Yeung-Levy

ICLR 2025 Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision Orr Zohar, Xiaohan Wang, Yonatan Bitton, Idan Szpektor, Serena Yeung-Levy

ECCV 2024 Depth-Guided NeRF Training via Earth Mover’s Distance Anita Rau, Josiah Aklilu, Floyd C Holsinger, Serena Yeung-Levy

CVPR 2024 Describing Differences in Image Sets with Natural Language Lisa Dunlap, Yuhui Zhang, Xiaohan Wang, Ruiqi Zhong, Trevor Darrell, Jacob Steinhardt, Joseph E. Gonzalez, Serena Yeung-Levy

NeurIPS 2024 Micro-Bench: A Microscopy Benchmark for Vision-Language Understanding Alejandro Lozano, Jeffrey Nirschl, James Burgess, Sanket Rajan Gupte, Yuhui Zhang, Alyssa Unell, Serena Yeung-Levy

TMLR 2024 Revisiting Active Learning in the Era of Vision Foundation Models Sanket Rajan Gupte, Josiah Aklilu, Jeffrey J Nirschl, Serena Yeung-Levy

ECCV 2024 VideoAgent: Long-Form Video Understanding with Large Language Model as Agent Xiaohan Wang, Yuhui Zhang, Orr Zohar, Serena Yeung-Levy

ECCV 2024 Viewpoint Textual Inversion: Discovering Scene Representations and 3D View Control in 2D Diffusion Models James Burgess, Kuan-Chieh Wang, Serena Yeung-Levy

NeurIPS 2024 Why Are Visually-Grounded Language Models Bad at Image Classification? Yuhui Zhang, Alyssa Unell, Xiaohan Wang, Dhruba Ghosh, Yuchang Su, Ludwig Schmidt, Serena Yeung-Levy