Publication year and journal Link
Contribution

- Requires images of the scene captured from multiple viewpoints
- Uses a multilayer perceptron (MLP) to represent the 3D scene
- Input is a continuous 5D coordinate, i.e. spatial location (x, y, z) and viewing direction (theta, phi)
- Output is volume density as a function of spatial location only, and RGB color as a function of both spatial location and viewing direction
- Uses classical volume rendering techniques to project the output colors and densities into a predicted image
- Requires no ground-truth 3D geometry, as it optimizes the neural implicit scene representation using only 2D images
- Existing neural implicit approaches have been limited to simple shapes with low geometric complexity.
- One line of prior work uses the observed images to directly color voxel grids.
- Another line of prior work combines a CNN with sampled voxel grids to represent each scene, but is limited to lower-resolution imagery.
- Improves NeRF optimization with a positional encoding that maps the low-dimensional input coordinates (x, y, z, theta, phi) into a higher-dimensional space, so that the MLP can represent high-frequency detail
- Another technique NeRF adopts is hierarchical volume sampling, which reduces the inefficiency of spending many query points on free space and occluded regions
- A model trained without the viewing direction as input has difficulty representing specularities.
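
The positional encoding mentioned above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `positional_encoding` and `encode_point` are made up for these notes, and the sketch assumes each coordinate is encoded independently with sine and cosine at frequencies 2^k * pi. The paper reports using 10 frequencies for the spatial location and 4 for the viewing direction.

```python
import math


def positional_encoding(p, num_freqs):
    """Encode one scalar as [sin(2^k*pi*p), cos(2^k*pi*p)] for k = 0..num_freqs-1."""
    features = []
    for k in range(num_freqs):
        angle = (2.0 ** k) * math.pi * p
        features.append(math.sin(angle))
        features.append(math.cos(angle))
    return features


def encode_point(coords, num_freqs):
    """Encode every coordinate separately and concatenate the results."""
    encoded = []
    for c in coords:
        encoded.extend(positional_encoding(c, num_freqs))
    return encoded


# A 3D location with 10 frequencies becomes a 3 * 2 * 10 = 60-dimensional vector.
location_features = encode_point([0.1, 0.2, 0.3], num_freqs=10)
```

The encoded vector, rather than the raw coordinates, is what would be fed to the MLP.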
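
The volume rendering step that turns densities and colors into a pixel can be sketched for a single ray. This is a rough sketch under the usual discrete approximation (each sample gets an opacity `1 - exp(-sigma * delta)` and is attenuated by the light that survived the earlier samples); `composite_ray` and its argument names are hypothetical, chosen only for illustration.

```python
import math


def composite_ray(densities, colors, deltas):
    """Alpha-composite samples along one ray, ordered from near to far.

    densities: volume density (sigma) at each sample
    colors:    [r, g, b] at each sample
    deltas:    distance between each sample and the next one
    Returns the pixel color and the per-sample weights.
    """
    transmittance = 1.0  # fraction of light that reaches the current sample
    pixel = [0.0, 0.0, 0.0]
    weights = []
    for sigma, color, delta in zip(densities, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this segment
        weight = transmittance * alpha
        weights.append(weight)
        for ch in range(3):
            pixel[ch] += weight * color[ch]
        transmittance *= 1.0 - alpha  # light remaining after this segment
    return pixel, weights
```

Empty space (zero density) contributes nothing, while a dense sample near the camera hides everything behind it. The weights are returned as well because hierarchical sampling reuses them to decide where to place additional samples.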
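
Hierarchical volume sampling can be illustrated with inverse transform sampling: weights from a coarse pass define a piecewise-constant distribution along the ray, and the fine samples are drawn from it so that they concentrate where the scene content is. The sketch below is an assumption-laden simplification, not the paper's implementation; `sample_fine_depths`, `bin_edges`, and the uniform fallback for all-zero weights are choices made here for illustration.

```python
import bisect
import random


def sample_fine_depths(bin_edges, weights, n_samples, rng):
    """Draw depths along a ray in proportion to per-bin coarse weights.

    bin_edges: len(weights) + 1 increasing depths delimiting the coarse bins
    weights:   non-negative weight of each coarse bin
    rng:       a random.Random instance
    """
    total = sum(weights)
    if total <= 0.0:
        # Assumed fallback: sample uniformly if the coarse pass saw nothing.
        weights = [1.0] * len(weights)
        total = float(len(weights))

    # Cumulative distribution over the bins, starting at 0 and ending at 1.
    cdf = [0.0]
    for w in weights:
        cdf.append(cdf[-1] + w / total)
    cdf[-1] = 1.0  # guard against floating-point drift

    samples = []
    for _ in range(n_samples):
        u = rng.random()  # uniform in [0, 1)
        i = bisect.bisect_right(cdf, u) - 1  # bin whose CDF interval contains u
        frac = (u - cdf[i]) / (cdf[i + 1] - cdf[i])
        samples.append(bin_edges[i] + frac * (bin_edges[i + 1] - bin_edges[i]))
    return sorted(samples)


# All weight sits in the middle bin, so every fine sample lands in [1.0, 2.0).
fine = sample_fine_depths([0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 0.0], 8, random.Random(0))
```

Bins with zero weight, such as free space or fully occluded regions, receive no fine samples, which is the source of the efficiency gain.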