Visualization of the Sampling Process (panels: Airplane, Car, Chair, Table)
Reconstructing high-quality point clouds from images remains challenging in computer vision. Existing approaches built on generative models, particularly diffusion models, that directly learn the posterior can be inflexible: they require conditioning signals during training, support only a fixed number of input views, and need complete retraining for different measurements. Recent diffusion-based methods have attempted to address this by combining prior models with likelihood updates, but they rely on heuristic fixed step sizes for the likelihood update, which leads to slow convergence and suboptimal reconstruction quality. We advance this line of work by integrating our novel Forward Curvature-Matching (FCM) update method with diffusion sampling. Our method dynamically determines optimal step sizes using only forward automatic differentiation and finite-difference curvature estimates, enabling precise optimization of the likelihood update. This formulation enables high-fidelity reconstruction from both single-view and multi-view inputs, and supports various input modalities through simple operator substitution, all without retraining. Experiments on the ShapeNet and CO3D datasets demonstrate that our method achieves superior reconstruction quality at a matched or lower number of function evaluations (NFEs), yielding higher F-score and lower Chamfer Distance (CD) and Earth Mover's Distance (EMD), validating its efficiency and adaptability for practical applications.
We integrate Forward Curvature-Matching (FCM) with diffusion models to enable precise, training-free 3D reconstruction. By dynamically computing optimal step sizes via forward automatic differentiation, rather than relying on heuristic fixed steps or complex adjoints, our method achieves high-fidelity results.
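To make the idea of a curvature-based step size concrete, here is a minimal sketch in plain Python. It is not the authors' FCM algorithm. It only illustrates the two ingredients named above: a directional derivative obtained by forward-mode differentiation (here a tiny dual-number class) and a curvature estimate obtained by finite differences of that derivative. The measurement operator, the loss, the search direction `d`, and all function names are hypothetical choices made for this toy example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """Dual number val + tan*eps with eps**2 == 0; tan carries a directional derivative."""
    val: float
    tan: float = 0.0

    def __add__(self, other):
        other = lift(other)
        return Dual(self.val + other.val, self.tan + other.tan)

    __radd__ = __add__

    def __sub__(self, other):
        other = lift(other)
        return Dual(self.val - other.val, self.tan - other.tan)

    def __rsub__(self, other):
        return lift(other) - self

    def __mul__(self, other):
        other = lift(other)
        return Dual(self.val * other.val,
                    self.val * other.tan + self.tan * other.val)

    __rmul__ = __mul__


def lift(x):
    """Promote a plain number to a Dual with zero tangent."""
    return x if isinstance(x, Dual) else Dual(float(x))


def operator(x):
    # Hypothetical measurement operator (assumption): a fixed per-coordinate scaling.
    return [2.0 * x[0], 3.0 * x[1]]


def loss(x, y=(2.0, 3.0)):
    # Toy least-squares data term 0.5 * ||operator(x) - y||^2 (assumption).
    return sum(0.5 * (p - t) * (p - t) for p, t in zip(operator(x), y))


def directional_derivative(f, x, d):
    """Forward-mode derivative of f at x along d, using one dual-number pass."""
    return f([Dual(xi, di) for xi, di in zip(x, d)]).tan


def curvature_matched_step(f, x, d, h=1e-4, fallback=1e-2):
    """Step size along d from the slope and a finite-difference curvature estimate."""
    slope0 = directional_derivative(f, x, d)
    shifted = [xi + h * di for xi, di in zip(x, d)]
    slope1 = directional_derivative(f, shifted, d)
    curvature = (slope1 - slope0) / h  # finite difference of forward-mode slopes
    if curvature <= 0.0:
        return fallback  # no usable curvature: fall back to a small fixed step
    return -slope0 / curvature


x0 = [0.0, 0.0]
d = [1.0, 1.0]  # hypothetical search direction
step = curvature_matched_step(loss, x0, d)
x1 = [xi + step * di for xi, di in zip(x0, d)]
```

In this toy problem the loss is quadratic along `d`, so the estimated step is approximately 1.0 and `x1` lands on the minimizer along that direction. A fixed step size would need several iterations to get there, which is the inefficiency the adaptive step is meant to avoid. Swapping in a different `operator` changes the data term without touching the step-size routine.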
We evaluate our method on single-view 3D reconstruction using the ShapeNet dataset, covering four object categories: (1) airplane, (2) car, (3) chair, and (4) table. Our approach outperforms existing diffusion-based baselines, specifically PC2 and BDM, in terms of F-score, Chamfer Distance (CD), and Earth Mover's Distance (EMD), achieving superior geometric accuracy.
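For readers unfamiliar with the metrics, the following sketch computes Chamfer Distance and F-score between two small point clouds in plain Python. It uses one common convention (squared nearest-neighbor distances averaged in both directions for CD, and a distance threshold `tau` for F-score). The exact conventions and thresholds used in the experiments are not stated here, so treat the definitions and the function names as assumptions. EMD is omitted because it requires solving an optimal assignment between the two clouds.

```python
import math


def nearest_distances(src, dst):
    """For each point in src, the Euclidean distance to its nearest point in dst."""
    return [min(math.dist(p, q) for q in dst) for p in src]


def chamfer_distance(pred, gt):
    """Symmetric Chamfer Distance with squared distances (one common convention)."""
    pred_to_gt = nearest_distances(pred, gt)
    gt_to_pred = nearest_distances(gt, pred)
    return (sum(d * d for d in pred_to_gt) / len(pred)
            + sum(d * d for d in gt_to_pred) / len(gt))


def f_score(pred, gt, tau):
    """Harmonic mean of precision and recall at distance threshold tau."""
    precision = sum(d <= tau for d in nearest_distances(pred, gt)) / len(pred)
    recall = sum(d <= tau for d in nearest_distances(gt, pred)) / len(gt)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Tiny illustrative clouds (made-up coordinates, not data from the experiments).
pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
gt = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.1)]

cd = chamfer_distance(pred, gt)
f_tight = f_score(pred, gt, tau=0.05)
f_loose = f_score(pred, gt, tau=0.2)
```

Lower CD means the two clouds are closer on average, while a higher F-score means more points fall within `tau` of the other cloud. The brute-force nearest-neighbor search is quadratic in the number of points and is meant only for illustration.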
We further evaluate our method on the real-world CO3D dataset, focusing on challenging categories such as (1) hydrant and (2) teddybear, along with multi-view reconstruction scenarios. Our method outperforms the baseline PC2 in terms of visual fidelity and structural detail preservation, demonstrating robust zero-shot generalization to real-world captures.
@inproceedings{shin2025FCM,
author = {Shin, Seunghyeok and Kim, Dabin and Lim, Hongki},
title = {Adaptive 3D Reconstruction via Diffusion Priors and Forward Curvature-Matching Likelihood Updates},
booktitle = {The Thirty-Ninth Annual Conference on Neural Information Processing Systems},
year = {2025},
url = {https://openreview.net/forum?id=IJLqUjtrls}
}