Wang, William Yang
93 publications
ICLR
2025
Generalization v.s. Memorization: Tracing Language Models’ Capabilities Back to Pretraining Data
ICLR
2025
SWE-Search: Enhancing Software Agents with Monte Carlo Tree Search and Iterative Refinement
ICLR
2025
Speculative Knowledge Distillation: Bridging the Teacher-Student Gap Through Interleaved Sampling
AAAI
2025
Unveiling the Impact of Coding Data Instruction Fine-Tuning on Large Language Models Reasoning
ICCV
2025
VSP: Diagnosing the Dual Challenges of Perception and Reasoning in Spatial Planning Tasks for MLLMs
TMLR
2024
A Survey on Large Language Models for Critical Societal Domains: Finance, Healthcare, and Law
NeurIPS
2024
FASTopic: Pretrained Transformer Is a Fast, Adaptive, Stable, and Transferable Topic Model
ICMLW
2024
Generalization vs. Memorization: Tracing Language Models' Capabilities Back to Pretraining Data
ICML
2024
Mastering Robot Manipulation with Multimodal Prompts Through Pretraining and Multi-Task Fine-Tuning
NeurIPS
2024
T2V-Turbo: Breaking the Quality Bottleneck of Video Consistency Model with Mixed Reward Feedback
NeurIPSW
2024
Text as Images: Can Multimodal Large Language Models Follow Printed Instructions in Pixels?
AAAI
2024
VELMA: Verbalization Embodiment of LLM Agents for Vision and Language Navigation in Street View
NeurIPS
2024
Who Evaluates the Evaluations? Objectively Scoring Text-to-Image Prompt Coherence Metrics with T2IScoreScore (TS2)
ICLR
2023
DecAF: Joint Decoding of Answers and Logical Forms for Question Answering over Knowledge Bases
NeurIPS
2023
LLMScore: Unveiling the Power of Large Language Models in Text-to-Image Synthesis Evaluation
NeurIPS
2023
Large Language Models Are Latent Variable Models: Explaining and Finding Good Demonstrations for In-Context Learning
CVPR
2023
Tell Me What Happened: Unifying Text-Guided Video Completion via Multimodal Masked Video Generation
NeurIPSW
2023
ToolDec: Syntax Error-Free and Generalizable Tool Use for LLMs via Finite-State Decoding
NeurIPSW
2023
VELMA: Verbalization Embodiment of LLM Agents for Vision and Language Navigation in Street View