<!DOCTYPE html>
<html>
<head lang="en">
  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  <meta http-equiv="x-ua-compatible" content="ie=edge">
  <title>NeROIC: Neural Rendering of Objects from Online Image Collections</title>
  <meta name="description" content="">
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <link rel="stylesheet" href="./resources/bootstrap.min.css">
  <link rel="stylesheet" href="./resources/font-awesome.min.css">
  <link rel="stylesheet" href="./resources/codemirror.min.css">
  <link rel="stylesheet" href="./resources/app.css">
  <script src="./resources/jquery.min.js"></script>
  <script src="./resources/bootstrap.min.js"></script>
  <script src="./resources/codemirror.min.js"></script>
  <script src="./resources/clipboard.min.js"></script>
  <script src="./resources/app.js"></script>
</head>
<body>
<div class="container" id="main">
  <div class="row">
    <h2 class="col-md-12 text-center">
      NeROIC: Neural Rendering of Objects from Online Image Collections<br>
    </h2>
  </div>
  <div class="row">
    <div class="col-md-12 text-center">
      <ul class="list-inline">
        <li><a href="https://zhengfeikuang.com">Zhengfei Kuang</a><br>University of Southern California</li>
        <li><a href="https://kyleolsz.github.io/">Kyle Olszewski</a><br>Snap Inc.</li>
        <li><a href="https://mlchai.com/">Menglei Chai</a><br>Snap Inc.</li>
        <li><a href="https://zeng.science/">Zeng Huang</a><br>Snap Inc.</li>
        <li><a href="https://optas.github.io/">Panos Achlioptas</a><br>Snap Inc.</li>
        <li><a href="http://www.stulyakov.com/">Sergey Tulyakov</a><br>Snap Inc.</li>
      </ul>
    </div>
  </div>
  <div class="row">
    <div class="col-md-4 col-md-offset-4 text-center">
      <ul class="nav nav-pills nav-justified">
        <li>
          <a href="https://arxiv.org/abs/2201.02533">
            <img src="resources/paper-min.png" height="60px">
            <h4><strong>Paper</strong></h4>
          </a>
        </li>
        <li>
          <a href="https://youtu.be/qOfV35y_ppc">
            <img src="resources/youtube_icon.png" height="60px">
            <h4><strong>Video</strong></h4>
          </a>
        </li>
        <li>
          <a href="https://github.com/snap-research/NeROIC">
            <img src="resources/github.png" height="60px">
            <h4><strong>Code</strong></h4>
          </a>
        </li>
      </ul>
    </div>
  </div>
  <div class="row">
    <div class="col-md-8 col-md-offset-2">
      <h3>Abstract</h3>
      <p class="text-justify">
        We present a novel method to acquire object representations from online image collections, capturing high-quality geometry and material properties of arbitrary objects from photographs with varying cameras, illumination, and backgrounds. This enables various object-centric rendering applications such as novel-view synthesis, relighting, and harmonized background composition from challenging in-the-wild input. Using a multi-stage approach extending neural radiance fields, we first infer the surface geometry and refine the coarsely estimated initial camera parameters, while leveraging coarse foreground object masks to improve the training efficiency and geometry quality. We also introduce a robust normal estimation technique which eliminates the effect of geometric noise while retaining crucial details. Lastly, we extract surface material properties and ambient illumination, represented in spherical harmonics, with extensions that handle transient elements, e.g., sharp shadows. The union of these components results in a highly modular and efficient object acquisition framework. Extensive evaluations and comparisons demonstrate the advantages of our approach in capturing high-quality geometry and appearance properties useful for rendering applications.
      </p>
    </div>
  </div>
  <div class="row">
    <div class="col-md-8 col-md-offset-2">
      <h3>Video</h3>
      <div class="text-center">
        <div style="position:relative;padding-top:56.25%;">
          <iframe style="position:absolute;top:0;left:0;width:100%;height:100%;" src="https://www.youtube.com/embed/qOfV35y_ppc" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
        </div>
      </div>
    </div>
  </div>
  <div class="row">
    <div class="col-md-8 col-md-offset-2">
      <h3>Overview</h3>
      <img src="./resources/framework.png" class="img-responsive" alt="Overview"><br>
      <p class="text-justify">
        Our two-stage model takes as input images of an object captured under varying conditions. Given camera poses and foreground object masks acquired with other state-of-the-art methods, we first optimize the geometry of the captured object and refine the camera poses by training a NeRF-based network. We then compute surface normals from the geometry (represented as a density function) using our normal extraction layer, as sketched below. Finally, our second-stage model decomposes the object's material properties and solves for the lighting conditions of each image.
      </p>
    </div>
  </div>
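  <div class="row">
    <div class="col-md-8 col-md-offset-2">
      <p class="text-justify">
        For intuition, the snippet below is a minimal PyTorch sketch of the basic rule behind extracting normals from a density field: the outward surface normal points opposite the density gradient. The name <code>density_net</code> is a hypothetical stand-in for the trained geometry network; the normal extraction layer in the paper builds on this rule and additionally suppresses geometric noise.
      </p>
      <pre><code>import torch

def extract_normals(density_net, points):
    """Estimate surface normals from a differentiable density field.

    density_net: module mapping (N, 3) points to (N,) volume densities.
    points:      (N, 3) query positions on or near the surface.
    """
    points = points.clone().requires_grad_(True)
    sigma = density_net(points)  # (N,) densities
    grad, = torch.autograd.grad(
        outputs=sigma, inputs=points,
        grad_outputs=torch.ones_like(sigma))
    # Density grows toward the object interior, so negate the gradient
    # and normalize to obtain unit outward normals.
    return -torch.nn.functional.normalize(grad, dim=-1)</code></pre>
    </div>
  </div>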
  <div class="row">
    <div class="col-xs-12" style="height:20px;"></div>
    <div class="col-md-8 col-md-offset-2">
      <h3>Novel View Synthesis</h3>
      <p class="text-justify">
        Given online images of a common object, our model can synthesize novel views of it under the lighting conditions of the training images.
      </p>
      <br>
    </div>
    <div class="col-md-8 col-md-offset-2">
      <video id="v0" width="100%" autoplay loop muted controls>
        <source src="videos/nvs.mp4" type="video/mp4">
      </video>
    </div>
    <div class="col-xs-12 col-md-offset-2" style="height:40px;"></div>
    <div class="col-md-8 col-md-offset-2">
      <img src="./resources/nvs.png" class="img-responsive" alt="Novel View Synthesis"><br>
    </div>
  </div>
  <div class="row">
    <div class="col-xs-12" style="height:20px;"></div>
    <div class="col-md-8 col-md-offset-2">
      <h3>Material Decomposition</h3>
      <p class="text-justify">
        Our model also solves for the material properties (including albedo, specularity, and roughness maps) and surface normals of the captured object.
      </p>
    </div>
    <div class="col-md-8 col-md-offset-2">
      <video id="v1" width="100%" autoplay loop muted controls>
        <source src="videos/material.mp4" type="video/mp4">
      </video>
    </div>
    <div class="col-xs-12 col-md-offset-2" style="height:40px;"></div>
    <div class="col-md-8 col-md-offset-2">
      <img src="./resources/material.png" class="img-responsive" alt="Material Decomposition"><br>
    </div>
  </div>
  <div class="row">
    <div class="col-xs-12" style="height:20px;"></div>
    <div class="col-md-8 col-md-offset-2">
      <h3>Relighting</h3>
      <p class="text-justify">
        With the material properties and geometry produced by our model, we can further render the object under novel lighting environments; a minimal shading sketch follows below.
      </p>
    </div>
  </div>
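  <div class="row">
    <div class="col-md-8 col-md-offset-2">
      <p class="text-justify">
        To make the lighting model concrete, the sketch below shades a diffuse surface under a second-order spherical-harmonics environment, the illumination representation named in the abstract. The function names and coefficient layout here are illustrative assumptions, not the paper's implementation, which further models specularity, roughness, and transient effects such as sharp shadows.
      </p>
      <pre><code>import torch

def sh_basis(normals):
    """Evaluate the 9 real spherical-harmonics basis functions (l = 0..2)
    at unit normals of shape (N, 3)."""
    x, y, z = normals.unbind(-1)
    return torch.stack([
        torch.full_like(x, 0.282095),    # l=0
        0.488603 * y,                    # l=1, m=-1
        0.488603 * z,                    # l=1, m=0
        0.488603 * x,                    # l=1, m=1
        1.092548 * x * y,                # l=2, m=-2
        1.092548 * y * z,                # l=2, m=-1
        0.315392 * (3.0 * z * z - 1.0),  # l=2, m=0
        1.092548 * x * z,                # l=2, m=1
        0.546274 * (x * x - y * y),      # l=2, m=2
    ], dim=-1)  # (N, 9)

def shade_diffuse(albedo, normals, sh_coeffs):
    """albedo: (N, 3); normals: (N, 3) unit vectors;
    sh_coeffs: (9, 3) per-channel lighting coefficients."""
    irradiance = sh_basis(normals) @ sh_coeffs  # (N, 3)
    return albedo * irradiance.clamp(min=0.0)</code></pre>
    </div>
  </div>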
  <div class="row">
    <div class="col-md-8 col-md-offset-2">
      <video id="v2" width="100%" autoplay loop muted controls>
        <source src="videos/relighting.mp4" type="video/mp4">
      </video>
    </div>
  </div>
  <div class="row">
    <div class="col-md-8 col-md-offset-2">
      <h3>Citation</h3>
    </div>
  </div>
  <div class="row">
    <div class="col-md-8 col-md-offset-2">
      <pre><code>@article{10.1145/3528223.3530177,
  author = {Kuang, Zhengfei and Olszewski, Kyle and Chai, Menglei and Huang, Zeng and Achlioptas, Panos and Tulyakov, Sergey},
  title = {NeROIC: Neural Rendering of Objects from Online Image Collections},
  year = {2022},
  issue_date = {July 2022},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  volume = {41},
  number = {4},
  issn = {0730-0301},
  url = {https://doi.org/10.1145/3528223.3530177},
  doi = {10.1145/3528223.3530177},
  journal = {ACM Trans. Graph.},
  month = {jul},
  articleno = {56},
  numpages = {12},
  keywords = {neural rendering, reflectance &amp; shading models, multi-view &amp; 3D}
}</code></pre>
    </div>
  </div>
  <div class="row">
    <div class="col-md-8 col-md-offset-2">
      <p style="color:gray; text-align:right">
        The website template was borrowed from <a href="http://mgharbi.com/">Michaël Gharbi</a>.
      </p>
    </div>
  </div>
</div>
</body>
</html>