DualNeRF

Generalizable single-view reconstruction with NeRF

I worked with Lourdes Agapito during 2020-2021 on DualNeRF, a novel view synthesis method that incorporates the extremely popular implicit neural representation, NeRF. DualNeRF reconstructs the 3D geometry and appearance of a target object from a single RGB image, and does not require any expensive 3D annotations for supervision. The full Master's thesis report can be found here.

DualNeRF Network Architecture

DualNeRF encodes the input view into a 3D feature volume, where each 1D feature vector corresponds to a pixel in the input image. A local NeRF-like decoder conditions on the feature vector of the pixel containing the query point, while a global decoder conditions on the entire feature volume; each predicts radiance and density. The two sets of predictions are then combined with a weighted sum, where each branch is weighted by its own predicted density.
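To make the combination step concrete, here is a minimal PyTorch sketch of the density-weighted blending described above. The function and tensor names are hypothetical, not taken from the DualNeRF implementation, and the choice to blend the densities with the same weights is an assumption.

```python
import torch

def combine_predictions(rgb_local, sigma_local, rgb_global, sigma_global,
                        eps=1e-8):
    """Blend local and global branch outputs with a density-weighted sum.

    rgb_*:   (N, 3) radiance predictions from each decoder
    sigma_*: (N, 1) density predictions from each decoder
    """
    total = sigma_local + sigma_global + eps  # avoid division by zero
    w_local, w_global = sigma_local / total, sigma_global / total
    # Each branch's radiance is weighted by its own predicted density.
    rgb = w_local * rgb_local + w_global * rgb_global
    # Assumption: the blended density uses the same weighted sum; the text
    # above only specifies how the two sets of predictions are combined.
    sigma = w_local * sigma_local + w_global * sigma_global
    return rgb, sigma
```

Normalizing by the total density means a branch that predicts empty space at a query point contributes little to the final color there.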

Compared to the concurrent work pixelNeRF, DualNeRF uses an extra global decoder to account for global rendering factors, such as illumination brightness. We also simplify the query input to the local decoder: the Euclidean distance between the camera and the query point is supplied instead of the query coordinate. This gives the network more interpretable information and forces it to fully condition on the supplied pixel feature.
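As a rough sketch of this simplified query, the local decoder's input could be assembled as follows. Appending the viewing direction is an assumption carried over from standard NeRF, and all names here are hypothetical.

```python
import torch

def local_query_input(query_pts, cam_origin, view_dirs, pixel_feats):
    """Build the local decoder's input: the camera-to-point Euclidean
    distance replaces the raw 3D query coordinate.

    query_pts:   (N, 3) sample points along camera rays
    cam_origin:  (3,)   camera centre in world coordinates
    view_dirs:   (N, 3) ray directions (assumed, as in standard NeRF)
    pixel_feats: (N, C) feature vectors of the pixels containing the points
    """
    dist = torch.norm(query_pts - cam_origin, dim=-1, keepdim=True)  # (N, 1)
    return torch.cat([dist, view_dirs, pixel_feats], dim=-1)
```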

Qualitative Evaluation on ShapeNet Car Dataset

The figure above shows a qualitative comparison of our DualNeRF, NeRF, and a modified implementation of pixelNeRF that uses a smaller network and less training data due to hardware constraints.