Implicit Grasp Diffusion: Bridging the Gap Between Dense Prediction and Sampling-Based Grasping
Abstract
There are two dominant approaches in modern robot grasp planning: dense prediction and sampling-based methods. Dense prediction estimates viable grasps across the robot's entire view but is limited to one grasp per voxel. Sampling-based methods, on the other hand, encode multi-modal grasp distributions, allowing multiple distinct grasp approaches at the same point. However, these methods rely on a global latent representation, which struggles to represent the entire field of view, resulting in coarse grasps. To address this, we introduce \emph{Implicit Grasp Diffusion} (IGD), which combines the strengths of both methods by using implicit neural representations to extract detailed local features and by sampling grasps from diffusion models conditioned on these features. Evaluations on clutter-removal tasks in both simulated and real-world environments show that IGD delivers high accuracy, noise resilience, and multi-modal grasp pose capabilities.
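The core idea, querying a local feature from an implicit (coordinate-based) representation and iteratively denoising a grasp pose conditioned on that feature, can be sketched as follows. All names are hypothetical and the "denoiser" is a toy stand-in for the learned network; this is a minimal illustration of the sampling loop, not the authors' implementation.

```python
import numpy as np

def query_local_feature(feature_grid, point):
    """Nearest-neighbor query of a (D, D, D, C) feature grid at a point in [0, 1)^3.

    Stands in for the implicit neural representation that provides
    detailed local features at arbitrary query locations.
    """
    D = feature_grid.shape[0]
    idx = np.clip((np.asarray(point) * D).astype(int), 0, D - 1)
    return feature_grid[idx[0], idx[1], idx[2]]

def denoise_step(grasp, local_feat, t, n_steps):
    """One toy reverse-diffusion step: nudge the noisy pose toward a target
    derived from the local feature (a stand-in for a learned, conditioned
    denoising network)."""
    target = local_feat[:grasp.shape[0]]   # pretend the feature encodes the grasp mode
    alpha = 1.0 / (n_steps - t + 1)        # simple step-size schedule
    return grasp + alpha * (target - grasp)

def sample_grasp(feature_grid, point, pose_dim=3, n_steps=20, rng=None):
    """Sample a grasp pose at `point` by denoising from Gaussian noise,
    conditioned on the local feature queried at that point."""
    rng = rng or np.random.default_rng(0)
    grasp = rng.standard_normal(pose_dim)  # start from pure noise
    feat = query_local_feature(feature_grid, point)
    for t in range(n_steps):
        grasp = denoise_step(grasp, feat, t, n_steps)
    return grasp
```

Because sampling starts from fresh noise each time, repeated calls with different seeds can land in different grasp modes at the same point, which is the multi-modality that dense one-grasp-per-voxel prediction cannot express.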
Cite
Text
Song et al. "Implicit Grasp Diffusion: Bridging the Gap Between Dense Prediction and Sampling-Based Grasping." Proceedings of The 8th Conference on Robot Learning, 2024.

Markdown
[Song et al. "Implicit Grasp Diffusion: Bridging the Gap Between Dense Prediction and Sampling-Based Grasping." Proceedings of The 8th Conference on Robot Learning, 2024.](https://mlanthology.org/corl/2024/song2024corl-implicit/)

BibTeX
@inproceedings{song2024corl-implicit,
  title     = {{Implicit Grasp Diffusion: Bridging the Gap Between Dense Prediction and Sampling-Based Grasping}},
  author    = {Song, Pinhao and Li, Pengteng and Detry, Renaud},
  booktitle = {Proceedings of The 8th Conference on Robot Learning},
  year      = {2024},
  pages     = {2948--2964},
  volume    = {270},
  url       = {https://mlanthology.org/corl/2024/song2024corl-implicit/}
}