Artificial intelligence that can synthesize realistic 3D object models is not as far-fetched as it might sound. In a paper accepted at the NeurIPS 2018 conference in Montreal, researchers at the MIT Computer Science and Artificial Intelligence Laboratory and Google describe a generative AI system capable of creating convincing shapes with realistic textures.
The system, Visual Object Networks (VON), not only generates images that are more realistic than some state-of-the-art methods, but also enables shape and texture editing, viewpoint shifts, and other three-dimensional tweaks.
“Modern deep generative models learn to synthesize realistic images,” the researchers wrote. “Most computational models have only focused on generating a 2D image, ignoring the 3D nature of the world … This 2D-only perspective inevitably limits their practical usages in many fields, such as synthetic data generation, robotic learning, virtual reality, and the gaming industry.”
VON tackles the problem by jointly synthesizing 3D shapes and 2D images in what the researchers call a “disentangled object representation.” The image generation process is decomposed into three factors: shape, viewpoint, and texture. The model first learns to synthesize a 3D shape, then computes a “2.5D” sketch of it from a given viewpoint, and finally adds texture.
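To make the intermediate "2.5D sketch" step concrete, here is a minimal NumPy sketch of the idea. It is not the paper's implementation: the learned shape and texture networks are replaced by a hard-coded voxel sphere and a simple depth-based shading rule, and the viewpoint is fixed. What it does show is how a 3D occupancy grid can be reduced to the two maps, a silhouette and a depth map, that sit between shape and texture in a pipeline like VON's.

```python
import numpy as np

def make_sphere_voxels(n=32, radius=0.8):
    # Toy stand-in for a learned shape network: a voxel
    # occupancy grid of a sphere on an n^3 grid over [-1, 1]^3.
    coords = np.linspace(-1, 1, n)
    x, y, z = np.meshgrid(coords, coords, coords, indexing="ij")
    return (x**2 + y**2 + z**2) <= radius**2

def project_to_25d(voxels):
    # The "2.5D sketch" step: from a fixed viewpoint (looking
    # along the z axis), derive a binary silhouette and a depth map.
    n = voxels.shape[2]
    silhouette = voxels.any(axis=2)
    # Depth = index of the first occupied voxel along each view
    # ray, normalized to [0, 1]; empty rays get the far plane (1.0).
    first_hit = np.argmax(voxels, axis=2)
    depth = np.where(silhouette, first_hit / (n - 1), 1.0)
    return silhouette, depth

def shade(silhouette, depth):
    # Toy stand-in for a texture network: shade the object so
    # nearer surface points appear brighter, background stays black.
    return np.where(silhouette, 1.0 - depth, 0.0)

voxels = make_sphere_voxels()
sil, depth = project_to_25d(voxels)
image = shade(sil, depth)
```

Because the shape, viewpoint, and texture stages only communicate through these intermediate maps, each factor can be edited or swapped independently, which is the property the disentangled representation is designed to provide.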
More importantly, because the three factors are independent, the model does not require paired 2D and 3D data. That enabled the team to train on large-scale collections of 2D images and 3D shapes, such as Google image search results, ShapeNet, and Pix3D, with ShapeNet contributing thousands of CAD models across 55 object categories.