Mirzasoleiman, Baharan

48 publications

ICLR 2025 Dataset Distillation via Knowledge Distillation: Towards Efficient Self-Supervised Pre-Training of Deep Networks Siddharth Joshi, Jiayi Ni, Baharan Mirzasoleiman
ICLR 2025 Mini-Batch Coresets for Memory-Efficient Language Model Training on Data Mixtures Dang Nguyen, Wenhan Yang, Rathul Anand, Yu Yang, Baharan Mirzasoleiman
TMLR 2025 Occam’s Razor for SSL: Memory-Efficient Parametric Instance Discrimination Eric Gan, Patrik Reizinger, Alice Bizeul, Attila Juhos, Mark Ibrahim, Randall Balestriero, David Klindt, Wieland Brendel, Baharan Mirzasoleiman
ICML 2025 Representations Shape Weak-to-Strong Generalization: Theoretical Insights and Empirical Predictions Yihao Xue, Jiping Li, Baharan Mirzasoleiman
ICML 2025 Synthetic Text Generation for Training Large Language Models via Gradient Matching Dang Nguyen, Zeman Li, Mohammadhossein Bateni, Vahab Mirrokni, Meisam Razaviyayn, Baharan Mirzasoleiman
ICML 2024 Better Safe than Sorry: Pre-Training CLIP Against Targeted Data Poisoning and Backdoor Attacks Wenhan Yang, Jingdong Gao, Baharan Mirzasoleiman
NeurIPS 2024 Changing the Training Data Distribution to Reduce Simplicity Bias Improves In-Distribution Generalization Tuan Hai Dang Nguyen, Paymon Haddad, Eric Gan, Baharan Mirzasoleiman
ICLR 2024 Data Distillation Can Be like Vodka: Distilling More Times for Better Quality Xuxi Chen, Yu Yang, Zhangyang Wang, Baharan Mirzasoleiman
AISTATS 2024 Data-Efficient Contrastive Language-Image Pretraining: Prioritizing Data Quality over Quantity Siddharth Joshi, Arnav Jain, Ali Payani, Baharan Mirzasoleiman
ICML 2024 Few-Shot Adaptation to Distribution Shifts by Mixing Source and Target Embeddings Yihao Xue, Ali Payani, Yu Yang, Baharan Mirzasoleiman
UAI 2024 Graph Contrastive Learning Under Heterophily via Graph Filters Wenhan Yang, Baharan Mirzasoleiman
AISTATS 2024 Identifying Spurious Biases Early in Training Through the Lens of Simplicity Bias Yu Yang, Eric Gan, Gintare Karolina Dziugaite, Baharan Mirzasoleiman
ICLR 2024 Investigating the Benefits of Projection Head for Representation Learning Yihao Xue, Eric Gan, Jiayi Ni, Siddharth Joshi, Baharan Mirzasoleiman
UAI 2024 Investigating the Impact of Model Width and Density on Generalization in Presence of Label Noise Yihao Xue, Kyle Whitecross, Baharan Mirzasoleiman
ICML 2024 NeWRF: A Deep Learning Framework for Wireless Radiation Field Reconstruction and Channel Prediction Haofan Lu, Christopher Vattheuer, Baharan Mirzasoleiman, Omid Abari
NeurIPS 2024 SmallToLarge (S2L): Scalable Data Selection for Fine-Tuning Large Language Models by Summarizing Training Trajectories of Small Models Yu Yang, Siddhartha Mishra, Jeffrey Chiang, Baharan Mirzasoleiman
ICLR 2024 Understanding the Robustness of Multi-Modal Contrastive Learning to Distribution Shift Yihao Xue, Siddharth Joshi, Dang Nguyen, Baharan Mirzasoleiman
ICML 2023 Data-Efficient Contrastive Self-Supervised Learning: Most Beneficial Examples for Supervised Learning Contribute the Least Siddharth Joshi, Baharan Mirzasoleiman
AISTATS 2023 High Probability Bounds for Stochastic Continuous Submodular Maximization Evan Becker, Jingdong Gao, Ted Zadouri, Baharan Mirzasoleiman
ICML 2023 Mitigating Spurious Correlations in Multi-Modal Models During Fine-Tuning Yu Yang, Besmira Nushi, Hamid Palangi, Baharan Mirzasoleiman
NeurIPS 2023 Robust Contrastive Language-Image Pretraining Against Data Poisoning and Backdoor Attacks Wenhan Yang, Jingdong Gao, Baharan Mirzasoleiman
NeurIPS 2023 Robust Learning with Progressive Data Expansion Against Spurious Correlation Yihe Deng, Yu Yang, Baharan Mirzasoleiman, Quanquan Gu
ICML 2023 Towards Sustainable Learning: Coresets for Data-Efficient Deep Learning Yu Yang, Hao Kang, Baharan Mirzasoleiman
ICMLW 2023 Which Features Are Learned by Contrastive Learning? On the Role of Simplicity Bias in Class Collapse and Feature Suppression Yihao Xue, Siddharth Joshi, Eric Gan, Pin-Yu Chen, Baharan Mirzasoleiman
ICML 2023 Which Features Are Learnt by Contrastive Learning? On the Role of Simplicity Bias in Class Collapse and Feature Suppression Yihao Xue, Siddharth Joshi, Eric Gan, Pin-Yu Chen, Baharan Mirzasoleiman
ICML 2022 Adaptive Second Order Coresets for Data-Efficient Machine Learning Omead Pooladzandi, David Davini, Baharan Mirzasoleiman
AAAI 2022 CrossWalk: Fairness-Enhanced Node Representation Learning Ahmad Khajehnejad, Moein Khajehnejad, Mahmoudreza Babaei, Krishna P. Gummadi, Adrian Weller, Baharan Mirzasoleiman
NeurIPS 2022 Data-Efficient Augmentation for Training Neural Networks Tian Yu Liu, Baharan Mirzasoleiman
NeurIPS 2022 Friendly Noise Against Adversarial Noise: A Powerful Defense Against Data Poisoning Attack Tian Yu Liu, Yu Yang, Baharan Mirzasoleiman
NeurIPSW 2022 Generating High Fidelity Synthetic Data via Coreset Selection and Entropic Regularization Omead Pooladzandi, Pasha Khosravi, Erik Nijkamp, Baharan Mirzasoleiman
ICML 2022 Investigating Why Contrastive Learning Benefits Robustness Against Label Noise Yihao Xue, Kyle Whitecross, Baharan Mirzasoleiman
ICMLW 2022 Investigating Why Contrastive Learning Benefits Robustness Against Label Noise Yihao Xue, Kyle Whitecross, Baharan Mirzasoleiman
ICML 2022 Not All Poisons Are Created Equal: Robust Training Against Data Poisoning Yu Yang, Tian Yu Liu, Baharan Mirzasoleiman
ICML 2020 Coresets for Data-Efficient Training of Machine Learning Models Baharan Mirzasoleiman, Jeff Bilmes, Jure Leskovec
UAI 2020 Coresets for Estimating Means and Mean Square Error with Limited Greedy Samples Saeed Vahidian, Baharan Mirzasoleiman, Alexander Cloninger
NeurIPS 2020 Coresets for Robust Training of Deep Neural Networks Against Noisy Labels Baharan Mirzasoleiman, Kaidi Cao, Jure Leskovec
ICLR 2020 Selection via Proxy: Efficient Data Selection for Deep Learning Cody Coleman, Christopher Yeh, Stephen Mussmann, Baharan Mirzasoleiman, Peter Bailis, Percy Liang, Jure Leskovec, Matei Zaharia
NeurIPS 2018 Dynamic Network Model from Partial Observations Elahe Ghalebi, Baharan Mirzasoleiman, Radu Grosu, Jure Leskovec
AAAI 2018 Streaming Non-Monotone Submodular Maximization: Personalized Video Summarization on the Fly Baharan Mirzasoleiman, Stefanie Jegelka, Andreas Krause
ICML 2017 Deletion-Robust Submodular Maximization: Data Summarization with “the Right to Be Forgotten” Baharan Mirzasoleiman, Amin Karbasi, Andreas Krause
AISTATS 2017 Guaranteed Non-Convex Optimization: Submodular Maximization over Continuous Domains Andrew An Bian, Baharan Mirzasoleiman, Joachim M. Buhmann, Andreas Krause
JMLR 2016 Distributed Submodular Maximization Baharan Mirzasoleiman, Amin Karbasi, Rik Sarkar, Andreas Krause
ICML 2016 Fast Constrained Submodular Maximization: Personalized Data Summarization Baharan Mirzasoleiman, Ashwinkumar Badanidiyuru, Amin Karbasi
NeurIPS 2016 Fast Distributed Submodular Cover: Public-Private Data Summarization Baharan Mirzasoleiman, Morteza Zadimoghaddam, Amin Karbasi
ICML 2016 Learning Sparse Combinatorial Representations via Two-Stage Submodular Maximization Eric Balkanski, Baharan Mirzasoleiman, Andreas Krause, Yaron Singer
NeurIPS 2015 Distributed Submodular Cover: Succinctly Summarizing Massive Data Baharan Mirzasoleiman, Amin Karbasi, Ashwinkumar Badanidiyuru, Andreas Krause
AAAI 2015 Lazier than Lazy Greedy Baharan Mirzasoleiman, Ashwinkumar Badanidiyuru, Amin Karbasi, Jan Vondrák, Andreas Krause
NeurIPS 2013 Distributed Submodular Maximization: Identifying Representative Elements in Massive Data Baharan Mirzasoleiman, Amin Karbasi, Rik Sarkar, Andreas Krause