What's most impressive is that they've separated the rendering algorithm (Mitsuba 2) from the retargeting framework (Enoki).
Enoki looks amazing from their paper. It supports vectorized CPUs, JIT-compiled GPU kernels, forward/reverse-mode autodiff, and nested array types.
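To give a feel for the forward-mode autodiff part, here's a tiny dual-number sketch in plain C++. It has nothing to do with Enoki's actual API (which is far more general: reverse mode, GPU arrays, nesting); it just shows the underlying idea of carrying derivatives alongside values.

    #include <iostream>

    // Minimal forward-mode autodiff via dual numbers. Illustration only;
    // Enoki's autodiff arrays are far more general than this.
    struct Dual {
        float value, deriv;  // f(x) and df/dx carried together
    };

    Dual operator*(Dual a, Dual b) {
        // Product rule: (ab)' = a'b + ab'
        return {a.value * b.value, a.value * b.deriv + a.deriv * b.value};
    }
    Dual operator+(Dual a, Dual b) {
        return {a.value + b.value, a.deriv + b.deriv};
    }

    int main() {
        Dual x{3.0f, 1.0f};    // seed dx/dx = 1
        Dual y = x * x + x;    // y = x^2 + x
        std::cout << y.value << " " << y.deriv << "\n";  // 12 and 7 (= 2x + 1)
        return 0;
    }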
Mitsuba 2 then expands that range even further by templating on key types and operations. For example, a material's color property might be represented by an RGB tuple for basic rendering, or by an array that captures the full spectrum of light frequencies for a spectral renderer. They supply some example code, which is absurdly clean: it's devoid of any specifics of storage and calculation and focuses just on the high-level algorithm.
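To make that concrete, here's a minimal C++ sketch of the idea (not Mitsuba's actual code; the type names and the toy shade() routine are made up) where the same shading routine compiles against either an RGB triple or a binned spectrum:

    #include <array>
    #include <cstddef>

    // Hypothetical spectrum types: the same shading code compiles against either.
    using Rgb      = std::array<float, 3>;   // basic RGB rendering
    using Spectral = std::array<float, 16>;  // e.g. 16 wavelength bins for spectral rendering

    // Element-wise product, written once for any fixed-size array type.
    template <typename Spectrum>
    Spectrum mul(const Spectrum &a, const Spectrum &b) {
        Spectrum out{};
        for (std::size_t i = 0; i < a.size(); ++i)
            out[i] = a[i] * b[i];
        return out;
    }

    // A toy "shade" routine expressed purely in terms of the Spectrum type:
    // it never mentions whether the color is an RGB triple or a full spectrum.
    template <typename Spectrum>
    Spectrum shade(const Spectrum &albedo, const Spectrum &incoming, float cos_theta) {
        Spectrum result = mul(albedo, incoming);
        for (auto &c : result)
            c *= cos_theta;
        return result;
    }

    int main() {
        Rgb      albedo_rgb{0.8f, 0.2f, 0.1f}, light_rgb{1.0f, 1.0f, 1.0f};
        Spectral albedo_sp{}, light_sp{};     // would hold per-wavelength values

        auto a = shade(albedo_rgb, light_rgb, 0.7f);  // RGB variant
        auto b = shade(albedo_sp, light_sp, 0.7f);    // spectral variant
        (void)a; (void)b;
        return 0;
    }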
They claim that the GPU implementation is superior to PyTorch / TensorFlow in some regards, as it can split the difference between eagerly sending every operation to the GPU and processing the entire graph at once.
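As I understand it, the rough idea is a trace that queues operations and only materializes them when a result is actually needed. Here's a heavily simplified sketch of that middle ground; it assumes nothing about Enoki's real JIT, which fuses the queued work into actual GPU kernels rather than just replaying host-side lambdas:

    #include <functional>
    #include <iostream>
    #include <vector>

    // Hypothetical trace recorder: operations are queued instead of being
    // launched one by one, and the whole batch is executed only when a result
    // is needed. Only meant to show the "neither fully eager nor fully
    // ahead-of-time" middle ground.
    class LazyQueue {
        std::vector<std::function<void()>> ops_;
    public:
        void record(std::function<void()> op) { ops_.push_back(std::move(op)); }

        // Flush: in a real system this is where a fused kernel would be emitted.
        void evaluate() {
            for (auto &op : ops_) op();
            ops_.clear();
        }
    };

    int main() {
        LazyQueue q;
        float x = 2.0f, y = 0.0f, z = 0.0f;

        q.record([&] { y = x * x; });      // queued, not executed yet
        q.record([&] { z = y + 1.0f; });   // still queued

        q.evaluate();                      // the batch runs only when needed
        std::cout << z << "\n";            // prints 5
        return 0;
    }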
The amount of work and understanding needed to produce something like this is insane - they just casually mention how they've implemented a novel light transport scheme, an "extensive mathematical support library", and sophisticated Python bindings.
(c) ("caustic design" or "caustic engineering") is fascinating to me. I'd never heard of this before, and it looks like it can even be done with real-world materials:
The original Mitsuba, also by Wenzel Jakob, is an extremely fast, clean, and modular piece of software engineering, in a space without many excellent free implementations; this looks to be quite an extraordinary successor. With the ability to be compiled to run on GPUs, it could even conceivably compete with Cycles, if the community picked it up and ran with it.
So the biggest difference by far is the "inverse" bit.
Instead of taking a set of parameters (reflection coefficients, material colors, etc.) and generating an image, the inverse renderer takes an image and tries to derive the parameters.
As they mention in the video, this allows for a lot of interesting usages which you cannot really do in a forward renderer except by brute force.
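A toy way to picture it: treat the renderer as a differentiable function of its scene parameters and run gradient descent against the observed image. The sketch below shrinks that to a single made-up parameter and uses finite differences in place of the real autodiff machinery:

    #include <iostream>

    // Toy one-parameter "renderer": maps a reflectance value to a pixel
    // brightness. Purely illustrative; a real differentiable renderer
    // propagates gradients through the full light-transport simulation.
    float render(float reflectance) { return 0.8f * reflectance; }

    int main() {
        const float target = 0.4f;    // the observed "image" (here: one pixel)
        float reflectance  = 0.9f;    // initial guess for the scene parameter
        const float lr = 0.5f, eps = 1e-4f;

        for (int i = 0; i < 200; ++i) {
            // L2 loss between rendered and observed image.
            auto loss = [&](float r) { float d = render(r) - target; return d * d; };
            // Finite-difference gradient stands in for autodiff.
            float grad = (loss(reflectance + eps) - loss(reflectance - eps)) / (2 * eps);
            reflectance -= lr * grad;  // gradient step toward matching the image
        }
        std::cout << "recovered reflectance: " << reflectance << "\n";  // ~0.5
        return 0;
    }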
The renderer is capable of making incredibly life-like images, but it still takes a _lot_ of effort to make a life-like scene.
The most striking issue to me is that it's not naturally lit. The light from the windows looks like someone put studio lights outside each one, rather than the sun and sky. Given that Mitsuba supports spectral rendering, an empirical spectral sky/sun model would improve the quality of the lighting by several orders of magnitude, and assuming such a model is present in Mitsuba, it would be an easy change.
Secondly, the glossy surfaces lack surface detail and natural variation; they are unnaturally even and free of imperfections.
There are other things as well, but those are the main ones to fix IMHO. Fixing the lighting should be simple, as mentioned, but adding the surface detail can be quite time-consuming.
In this case the image is meant as an illustration, so the simplifications made are quite acceptable.
When you use a physically based, photorealistic renderer and model an indoor or studio scene as realistically as possible, you often find that it looks flat, dull, or just bad.
And to fix it, you have to employ the same tricks that professional photographers and cinematographers do: adding extra light sources to reduce harsh shadows, adding highlights, etc.
So by making the renderer more realistic, you end up having to fake it like a photographer.
Bad as in "unrealistic", i.e. lacking details and textures? Probably because they're researchers, not 3D game designers. My guess is that it's just a standardized raytracing scene from a public research dataset used to compare raytracers.
The scene is demonstrating their "lightpath vectorization". If the claims work out, the real gains are better use of the full hardware capabilities by vectorizing multiple rays without GPU/SIMD branch divergence - the divergence happens when different rays intersect different objects and dramatically slows down parallel work. That should really speed up rendering, allowing more rays and thus less noise and more detail.
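The usual way to picture the divergence fix is wavefront-style scheduling: group rays by the material (or code path) they need next, so each SIMD batch runs coherent code. A rough sketch of that idea, with made-up field names and no claim that this is how Mitsuba 2 actually schedules its work:

    #include <algorithm>
    #include <cstdint>
    #include <vector>

    // Hypothetical ray record; the field names are invented for illustration.
    struct RayHit {
        float    t;            // hit distance
        uint32_t material_id;  // which shader this ray needs next
    };

    // Instead of letting neighbouring SIMD lanes run different shaders
    // (divergence), group rays by the material they hit so each batch
    // executes the same code path.
    void sort_by_material(std::vector<RayHit> &hits) {
        std::sort(hits.begin(), hits.end(),
                  [](const RayHit &a, const RayHit &b) {
                      return a.material_id < b.material_id;
                  });
        // After sorting, contiguous runs of hits share a material_id and can
        // be shaded together with coherent, vectorized code.
    }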
The funny thing is this looks to me like a realistic picture of the inside of a McMansion, i.e. a room whose contents are more or less "fake" things: plastic "wood grained" floor (Pergo or whatever), plastic "wood grained" table (fiberboard, etc.), a brand-new polyester mid-tier couch (room never actually used), pre-made pseudo-fancy gallery/bay windows, higher-grade fake ferns...
It's the typical story of CG researchers vs CG artists. Check out innumerable SIGGRAPH presentations using the same Stanford Armadillo model with constant-valued textures!
If you look at the point on the floor where the rays 'bounce', you'll see that the rays are casting shadows too, which can't be associated with anything else in the scene.
In any case, that's the only issue (however minor) I can see for the visual quality, and if that isn't what's being referred to, I have no idea what about the image 'looks bad'.
I agree, but when you zoom in there's something off. To my eye, it's something about the texture of the textures? E.g. the wall and plant look a lot smoother than I'd expect.