Yang, Yi
339 publications
CVPR
2025
Adapting Text-to-Image Generation with Feature Difference Instruction for Generic Image Restoration
NeurIPS
2025
DeltaPhi: Physical States Residual Learning for Neural Operators in Data-Limited PDE Solving
ICML
2025
DreamDPO: Aligning Text-to-3D Generation with Human Preferences via Direct Preference Optimization
ICCV
2025
DreamRenderer: Taming Multi-Instance Attribute Control in Large-Scale Text-to-Image Models
CVPR
2025
DroneSplat: 3D Gaussian Splatting for Robust 3D Reconstruction from In-the-Wild Drone Imagery
NeurIPS
2025
Enabling Instructional Image Editing with In-Context Generation in Large Scale Diffusion Transformer
CVPR
2025
EnergyMoGen: Compositional Human Motion Generation with Energy-Based Diffusion Model in Latent Space
ICCV
2025
MaGS: Reconstructing and Simulating Dynamic 3D Objects with Mesh-Adsorbed Gaussian Splatting
ICCV
2025
MeshLLM: Empowering Large Language Models to Progressively Understand and Generate 3D Mesh
CVPRW
2025
NTIRE 2025 Challenge on Short-Form UGC Video Quality Assessment and Enhancement: Methods and Results
ICCV
2025
R1-Onevision: Advancing Generalized Multimodal Reasoning Through Cross-Modal Formalization
NeurIPS
2024
DRIP: Unleashing Diffusion Priors for Joint Foreground and Alpha Prediction in Image Matting
NeurIPS
2024
DataStealing: Steal Data from Diffusion Models in Federated Learning with Multiple Trojans
JAIR
2024
Differentially Private Neural Tangent Kernels (DP-NTK) for Privacy-Preserving Data Generation
NeurIPS
2024
Human-Object Interaction Detection Collaborated with Large Relation-Driven Diffusion Models
ECCV
2024
Mutual Learning for Acoustic Matching and Dereverberation via Visual Scene-Driven Diffusion
CVPR
2024
SIFU: Side-View Conditioned Implicit Function for Real-World Usable Clothed Human Reconstruction
NeurIPS
2024
TOPA: Extending Large Language Models for Video Understanding via Text-Only Pre-Alignment
ICLR
2024
Test-Time Adaptation with CLIP Reward for Zero-Shot Generalization in Vision-Language Models
CVPR
2024
VISTA-LLAMA: Reducing Hallucination in Video Language Models via Equal Distance to Visual Tokens
NeurIPS
2024
VidProM: A Million-Scale Real Prompt-Gallery Dataset for Text-to-Video Diffusion Models
CVPR
2023
DETR with Additional Global Aggregation for Cross-Domain Weakly Supervised Object Detection
NeurIPSW
2023
DYAD: A Descriptive yet Abjuring Density Efficient Approximation to Linear Neural Network Layers
NeurIPS
2023
Hyperbolic Space with Hierarchical Margin Boosts Fine-Grained Learning from Coarse Labels
ICCV
2023
Integrating Boxes and Masks: A Multi-Object Framework for Unified Visual Tracking and Segmentation
ICCV
2023
MAAL: Multimodality-Aware Autoencoder-Based Affordance Learning for 3D Articulated Objects
CVPR
2023
MIST: Multi-Modal Iterative Spatial-Temporal Transformer for Long-Form Video Question Answering
ICCV
2023
Omnidirectional Information Gathering for Knowledge Transfer-Based Audio-Visual Navigation
CVPR
2023
ProD: Prompting-to-Disentangle Domain Knowledge for Cross-Domain Few-Shot Image Classification
AAAI
2023
Stroke Extraction of Chinese Character Based on Deep Structure Deformable Image Registration
ICCV
2023
TransHuman: A Transformer-Based Human Representation for Generalizable Neural Human Rendering
CVPR
2022
Compositional Temporal Grounding with Structured Variational Cross-Graph Correspondence Learning
CVPR
2022
Learning Memory-Augmented Unidirectional Metrics for Cross-Modality Person Re-Identification
CVPR
2022
Locality-Aware Inter- and Intra-Video Reconstruction for Self-Supervised Correspondence Learning
ICCV
2021
Adaptive Hierarchical Graph Reasoning with Semantic Coherence for Video-and-Language Inference
AAAI
2021
Modeling the Probabilistic Distribution of Unlabeled Data for One-Shot Medical Image Segmentation
CVPR
2021
OpenMix: Reviving Known Knowledge for Discovering Novel Visual Categories in an Open World
ICCV
2021
PR-RRN: Pairwise-Regularized Residual-Recursive Networks for Non-Rigid Structure-from-Motion
AAAI
2020
Context Modulated Dynamic Networks for Actor and Action Video Segmentation with Language Queries
ECCV
2020
Learning to Transfer Learn: Reinforcement Learning-Based Selection for Adaptive Transfer Learning
NeurIPS
2020
Pixel-Level Cycle Association: A New Perspective for Domain Adaptive Semantic Segmentation
ECCV
2018
Beyond Part Models: Person Retrieval with Refined Part Pooling (and a Strong Convolutional Baseline)
IJCAI
2018
Watching a Small Portion Could Be as Good as Watching All: Towards Efficient Video Classification
AAAI
2017
Probabilistic Non-Negative Matrix Factorization and Its Robust Extensions for Topic Modeling
AAAI
2016
Concepts Not Alone: Exploring Pairwise Relationships for Zero-Shot Video Activity Recognition
CVPR
2016
Hierarchical Recurrent Neural Encoder for Video Representation with Application to Captioning
CVPR
2016
They Are Not Equally Reliable: Semantic Event Search Using Differentiated Concept Classifiers
CVPR
2016
You Lead, We Exceed: Labor-Free Video Concept Learning by Jointly Exploiting Web Videos and Images