Publication year and journal Link

Contribution

Previous NeRF approaches take hours of training for each new scene. This approach takes only a few seconds, which is dramatically faster.
Previous NeRF techniques use frequency encoding, which encodes each scalar position x as a multi-resolution sequence of L sine and cosine functions. This is needed to capture high-frequency detail. The encoding is fixed (it has no trainable parameters), so a large MLP is needed to learn complex tasks.
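A minimal sketch of such a frequency encoding, to make the idea concrete. The frequencies 2^k * pi and the layout (L sines followed by L cosines) are assumptions borrowed from the usual NeRF formulation, and the name `frequency_encode` is only illustrative.

```python
import numpy as np

def frequency_encode(x, num_freqs):
    """Encode each scalar in x with L sines followed by L cosines.

    Assumed frequencies: 2^k * pi for k = 0 .. L-1 (L = num_freqs).
    """
    x = np.asarray(x, dtype=np.float64)
    freqs = (2.0 ** np.arange(num_freqs)) * np.pi   # shape (L,)
    angles = x[..., None] * freqs                   # shape (..., L)
    # Nothing here is trainable: the encoding is a fixed function of x.
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)

# Two scalar positions, L = 4 -> each becomes a vector of 2 * L = 8 values.
encoded = frequency_encode(np.array([0.1, 0.5]), num_freqs=4)
print(encoded.shape)
```

Since the encoding itself cannot adapt to the scene, all of the learning has to happen in the MLP that consumes it.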

Other approaches use a technique called parametric encoding, where additional trainable parameters are arranged in an auxiliary data structure, such as a grid or a tree. These parameters are looked up based on the input and then interpolated.
These approaches trade a larger memory footprint for a smaller computational cost: instead of a large MLP, they use a small one. Although the total number of trainable parameters is larger, only a small subset of them needs to be updated during gradient backpropagation for each sample, which greatly speeds up training. E.g., if a point falls inside a cell of a 3D voxel grid, we only need to update the 8 embeddings stored at that cell's 8 vertices.
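The 8-vertex example can be sketched as follows. This is only an illustration under assumptions: a dense NumPy grid stands in for the auxiliary data structure, the point is given in grid units, and the plain SGD step and the names `trilinear_lookup` and `sparse_sgd_step` are hypothetical.

```python
import itertools
import numpy as np

def trilinear_lookup(grid, p):
    """Interpolate the embeddings at the 8 vertices of the cell containing p.

    grid: array of shape (N, N, N, F), one F-dim embedding per vertex.
    p:    3D point in grid units, inside [0, N-1).
    Returns the interpolated feature and the (vertex index, weight) pairs used.
    """
    p = np.asarray(p, dtype=np.float64)
    base = np.floor(p).astype(int)      # lower corner of the cell
    frac = p - base                     # position inside the cell, in [0, 1)
    feature = np.zeros(grid.shape[-1])
    corners = []
    for offset in itertools.product((0, 1), repeat=3):
        offset = np.array(offset)
        # Weight is the product of per-axis linear weights.
        w = np.prod(np.where(offset == 1, frac, 1.0 - frac))
        idx = tuple(base + offset)
        feature += w * grid[idx]
        corners.append((idx, w))
    return feature, corners

def sparse_sgd_step(grid, corners, grad_feature, lr=0.1):
    """Update only the 8 embeddings that contributed to the lookup."""
    for idx, w in corners:
        grid[idx] -= lr * w * grad_feature

grid = np.zeros((3, 3, 3, 2))           # 27 vertices, 2 features each
feature, corners = trilinear_lookup(grid, [0.25, 0.5, 0.75])
sparse_sgd_step(grid, corners, grad_feature=np.ones(2))
print(len(corners), "of", 27, "vertex embeddings were touched")
```

The gradient of the interpolated feature with respect to every other vertex is zero, which is why the rest of the grid can be skipped.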

It makes use of fully fused CUDA kernels (several operations fused into a single kernel to speed up computation) and focuses on minimizing wasted bandwidth and compute operations.
The hash table entries are stored at half precision, while a master copy of the parameters is maintained in full precision for the parameter updates.
To make good use of the GPU cache, computation is scheduled level by level: each level of the multi-resolution hash encoding is looked up for all inputs in a batch before moving on to the next level.
d-linear interpolation is applied to the looked-up entries because it makes the encoded function continuous.
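To see why a full-precision master copy helps alongside half-precision table entries, here is a small NumPy sketch. The plain SGD update, the learning rate, and the function names are assumptions made for illustration; these notes do not describe the actual optimizer.

```python
import numpy as np

def sgd_half_only(w16, grad, lr, steps):
    """Update the half-precision values directly (no master copy)."""
    for _ in range(steps):
        w16 = (w16 - np.float16(lr) * grad.astype(np.float16)).astype(np.float16)
    return w16

def sgd_with_master(w32, grad, lr, steps):
    """Update a float32 master copy, then cast down for the lookup table."""
    for _ in range(steps):
        w32 = w32 - np.float32(lr) * grad.astype(np.float32)
    return w32, w32.astype(np.float16)

grad = np.ones(4, dtype=np.float32)
stalled = sgd_half_only(np.ones(4, dtype=np.float16), grad, lr=1e-4, steps=10)
master, table = sgd_with_master(np.ones(4, dtype=np.float32), grad, lr=1e-4, steps=10)

# Near 1.0, float16 cannot represent a change as small as 1e-4, so the
# half-only update rounds back to 1.0 on every step. The float32 master
# copy accumulates the small steps, and the cast-down table reflects them.
print(stalled, table)
```

The lookups still read the compact half-precision table; only the update path needs the extra float32 copy.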
