I hacked a weighted stochastic method into a WebGPU solution[0] and it worked surprisingly well. Essentially, for each pixel of a splat, pick a pseudo-random number from zero to one and, if it’s lower than the alpha value for the pixel, draw over whatever is in the frame buffer at full opacity.
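To make the idea concrete, here is a minimal single-pixel sketch in plain Python (not the WebGPU code, which isn't shown in this thread). It assumes grey-scale colours and splats already in back-to-front order; `blend_over` and `blend_stochastic` are hypothetical names for illustration. Averaged over many draws, the stochastic result should approach ordinary alpha blending.

```python
import random

def blend_over(background, splats):
    """Reference: ordinary back-to-front alpha blending for one pixel."""
    color = background
    for splat_color, alpha in splats:
        color = alpha * splat_color + (1.0 - alpha) * color
    return color

def blend_stochastic(background, splats, rng):
    """One stochastic sample: each splat either fully overwrites or is skipped."""
    color = background
    for splat_color, alpha in splats:
        if rng.random() < alpha:      # lower than alpha -> draw at full opacity
            color = splat_color
    return color

rng = random.Random(0)                # fixed seed so the run is repeatable
splats = [(1.0, 0.5), (0.2, 0.25)]    # (grey value, alpha), back to front
reference = blend_over(0.0, splats)

samples = 20000
estimate = sum(blend_stochastic(0.0, splats, rng) for _ in range(samples)) / samples
print(reference, estimate)            # the two values should be close
```

A single draw is noisy (the pixel is either one splat's colour or another's), which is why some form of averaging over space or time is the natural next step.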
The next step I wanted to try was to super-sample (e.g. render at double resolution) and linearly downsample. I imagine this would give a slightly noisy result that would be pretty visually pleasing.
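A rough sketch of that super-sampling step, assuming a 2x2 box filter stands in for "linear downsample" and a flat splat of constant alpha over a black background (both are simplifications, and the function names are made up for the example). Averaging four stochastic samples per output pixel should cut the noise variance to roughly a quarter.

```python
import random

def render_stochastic(width, height, alpha, rng):
    """Render a flat white splat of constant `alpha` over black, stochastically."""
    return [[1.0 if rng.random() < alpha else 0.0 for _ in range(width)]
            for _ in range(height)]

def downsample_2x(image):
    """Average each 2x2 block (a box filter), halving both dimensions."""
    return [[(image[y][x] + image[y][x + 1]
              + image[y + 1][x] + image[y + 1][x + 1]) / 4.0
             for x in range(0, len(image[0]), 2)]
            for y in range(0, len(image), 2)]

def variance(image, mean):
    """Mean squared deviation of all pixels from the expected value `mean`."""
    values = [v for row in image for v in row]
    return sum((v - mean) ** 2 for v in values) / len(values)

rng = random.Random(1)
alpha = 0.5
native = render_stochastic(64, 64, alpha, rng)
supersampled = downsample_2x(render_stochastic(128, 128, alpha, rng))

print(variance(native, alpha), variance(supersampled, alpha))
```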
Aside: the excellent gsplat.tech by Jakub Červený (@jakub_c5y) also seems to be using some sort of VQ/clustering for the data. Seriously, check it out, it’s probably the nicest gaussian splatting thing right now w.r.t. usability – very intuitive camera controls, nicely presented file sizes, and works on WebGL2. Craftsmanship!
> It's for deep learning, not that much "for graphics".
No, while it is true that there is some overlap between the techniques and concepts used, gaussian splatting isn't necessarily "for deep learning". The library provides a differentiable rasterizer for gaussian splats. This basically means that you can ask it "if I want my output image to change in this and this direction, in what direction should I change the position / orientation / color / ... of my splats?". This enables users to plug it into other software (that is also commonly used for deep learning) and to optimize the parameters of the splats to represent a particular scene.
Since it's primarily a differentiable rasterizer for splats I think it's fair to say that it is "for graphics".
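As a toy illustration of what "differentiable rasterizer" means here (this is not the library's API, just a hand-rolled 1D sketch): render a single Gaussian splat into a row of pixels, compute how the squared error against a target image changes with the splat's position, and follow that gradient. The names `render` and `loss_and_grad`, the step size and the pixel grid are all assumptions made for the example.

```python
import math

def render(center, xs, sigma=1.0):
    """Render a 1D 'image': one Gaussian splat evaluated at each pixel position."""
    return [math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2)) for x in xs]

def loss_and_grad(center, target, xs, sigma=1.0):
    """Squared-error loss against `target` and its derivative w.r.t. `center`."""
    image = render(center, xs, sigma)
    loss = sum((p - t) ** 2 for p, t in zip(image, target))
    # d(pixel)/d(center) = pixel * (x - center) / sigma^2, chained through the loss.
    grad = sum(2.0 * (p - t) * p * (x - center) / sigma ** 2
               for p, t, x in zip(image, target, xs))
    return loss, grad

xs = [i * 0.25 for i in range(41)]   # pixel positions from 0 to 10
target = render(6.0, xs)             # the "photo": a splat sitting at x = 6
center = 4.5                         # initial guess for the splat position

for _ in range(200):
    loss, grad = loss_and_grad(center, target, xs)
    center -= 0.05 * grad            # plain gradient descent step

print(center, loss)                  # center should end up very close to 6.0
```

The real thing does the same for position, orientation, scale, colour and opacity of many 3D splats at once, with the derivatives computed by the rasterizer, but there is still no neural network involved.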
> The problem is "how do you do 3D deep learning 3D scene reconstruction" aka "how to make 3d equivalent of stable diffusion".
That it uses gradient descent doesn't mean that it is "deep learning". There are no neural networks or layers here.
It's not an "equivalent of stable diffusion". The way it's used now is to learn a representation of a single scene, not unlike photogrammetry. Sure, there may be other use cases for this library, but this is primarily what gaussian splatting is about.
This doesn't really have anything to do with deep learning or AI, except that it uses the same optimisation algorithm (SGD). But that is a generic optimisation algorithm that can be used for any problem where you can calculate the gradient of the loss function.
The technique really isn’t “stable diffusion” at all. I’ve seen a couple of papers build 3D generative models on top of GS though. Also, depending on the measure of efficiency, 3DGS isn’t more efficient. Maybe more efficient than other NeRF methods, but less so than explicit geometry representations.
And like another person explained, there’s no deep learning. There’s not even a neural net. In the other NeRF literature there are neural nets but they usually aren’t deep. RegNeRF uses a deep neural network alongside a shallow net for regularization.
Gaussian splatting, at least in 3D, is a rasterization technique that, AFAIK, doesn't use polygons. It's intended to allow photorealistic renders based on photogrammetry.
Anyway, this doesn't seem to allow any sort of dynamic lighting whatsoever, but for scans of existing places, such as house tours, street views and so on, this seems super promising.
> Anyway, this doesn't seem to allow any sort of dynamic lighting whatsoever, but for scans of existing places, such as house tours, street views and so on, this seems super promising.
So this method can only really be used for static scenes made from a point cloud?
I remember 10 years ago, a company called Euclideon revealed a demo[1] of a rendering technique (they called it "Unlimited Detail Real-Time Rendering") that put nearly every other rasterization technique of the time to shame. However, the major issue with the demo was that it only showed a static scene with zero dynamic lighting. I don't think they ever revealed the "secret sauce" of how it works, but it seems to have similar limitations to this new process, or at least that's what I expect considering it can't handle something as simple as a dynamic mesh.
It was extremely hyped up in certain areas, but over a decade later this technology isn't really used in any mainstream rendering engines because it only works for static scenes.
I hate to disparage new rendering techniques, but this really feels like a repeat of what happened back then. What exactly is different in 3D Gaussian Splatting? Can it be used in any dynamic scenes in real-time at all? Does it provide any significant advantages over this older system?
Think of them as point clouds + lighting. Relighting is viable. Procedural deformation is viable. Animation is just a case of transforming sub-groupings.
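For the "transforming sub-groupings" part, a minimal sketch, assuming each splat is stored as a mean and a 3x3 covariance (the helper names are made up for the example): a rigid transform moves the mean as usual and rotates the covariance as R Σ Rᵀ. View-dependent colour, if stored as spherical harmonics, would need rotating too; that is left out here.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(m):
    return [[m[j][i] for j in range(3)] for i in range(3)]

def matvec(m, v):
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]

def rotation_z(angle):
    """Rotation by `angle` radians about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def transform_splats(splats, rotation, translation):
    """Apply a rigid transform to a group of (mean, covariance) splats."""
    moved = []
    for mean, cov in splats:
        new_mean = [r + t for r, t in zip(matvec(rotation, mean), translation)]
        new_cov = matmul(matmul(rotation, cov), transpose(rotation))  # R Σ Rᵀ
        moved.append((new_mean, new_cov))
    return moved

# A splat stretched along x, rotated 90 degrees about z and lifted by 1 in z.
splat = ([1.0, 0.0, 0.0],
         [[4.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
(mean, cov), = transform_splats([splat], rotation_z(math.pi / 2), [0.0, 0.0, 1.0])
print(mean)   # approximately [0, 1, 1]
print(cov)    # the long axis now lies along y
```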
Collisions might be a bit trickier but you can always derive a low-res collision mesh from the input data and use that as a proxy.
It's early days at the moment and people are still exploring the possibilities.
Thank you for sharing that. Those images have eased my pessimism a bit. They prove at the very least the technique can be used in some non-static scenes.
The splats are pretty easy to manipulate. At least as easy as triangles. It’s just that there has not been much attention paid to them historically. So, there are no content pipelines yet.
Just a shower thought so to speak, but could you combine this technique with something similar to precomputed radiance transfer[1]?
You'd have to take multiple pictures of the scene, then move some light source around, take another set of pictures etc. And in a similar sense to the irradiance volumes[1], instead of encoding just the gaussian parameters, encode them using something that lets you reconstruct the gaussian parameters based on the position of the primary light source for example. I know estimating light position and such from images has been worked on for image-based BRDF extraction for a long time[2].
Of course it'll require a lot more images and compute, but that's the nature of the dynamic beast.
Again, I haven't really thought this through and it's not really my field, though I was into physically-based rendering a decade ago. It just seems like something that would be solved by natural progression in the not too distant future.
The problem with "unlimited detail" wasn't that it was static (the majority of any game's environment is), it's that it was using voxels, which can't really compete with triangles when it comes to a quality-perf trade-off. They could render massive data sets, but not with the quality that is needed for games. Voxel-based data sets tend to require a whole lot of memory, whereas triangle-based data sets can cheaply "fake" higher details with textures. The blockiness of voxels is also a huge issue for anything that's not an axis-aligned bounding box, and to fix that you have to invest so many GPU resources that you might as well go back to textured triangles.
I wouldn't be surprised if gaussian splats make it into AAA games, though. Not as the main rendering primitive, but for specific things like vegetation where they really kick ass.
It’s more of a reconstruction technique than a rendering technique. But the power is in the 3D differentiable renderer. That means we can optimize an output image to look exactly as we want, given sufficient input data. If you want to learn more, take a look at differentiable rendering and 3D multiview reconstruction.
Euclideon's method probably uses fine-grained acceleration structures and LoDs that are too expensive to rebuild in real-time. At least, that's how I took it.
Definitely not. It was just a sparse voxel engine with model instancing. Didn't go anywhere and for good reason. Nanite does build on some very advanced and creative poly reduction tech and adds streaming to take it to the next level.
Right. Note that Gaussian splatting as a rendering primitive dates from the early 90s, but it never saw much use. Splats aren't very good for magnification (important for medical/scientific/engineering visualization), nor do they have easy support for domain repetition (important for video games).
The new thing is fast regression of a real light field into Gaussian splats, which can then be rendered at reasonable rates. Dynamic lighting requires full inverse rendering as a preprocess, which is way beyond the scope of the technique. Technically Gaussian splats could form part of an inverse rendering pipeline, and also be the final target representation, but I'm not sure there would be any benefit over alternative rendering primitives.
One reason this implementation is exciting is that the code that accompanies the Gaussian Splatting paper requires companies wanting to use the tech commercially to pay for a commercial license. There are existing viewer implementations, but this seems like the first alternative that would let me do everything without depending on the Inria code. The code being nice and clean is a bonus too :)
Is there any method for animating scenes captured this way? It would be really interesting to build a hyper-realistic world in a game with moving foliage, or to be able to "use" an item and watch it turn. Would be great for Myst-like games.
>Gaussian splats are 3d, but currently rendered as 2d cards ordered on centre, not per pixel like they should be.
>So as centres reorder, they "shuffle". This is a showstopper for general adoption in VR
>The solution? Ordering per pixel will solve this, as would stochastic methods which write depth correctly (though those have other problems for VR).
>For it to be viable you need to cull much more aggressively but also can't have any warp divergence when processing samples.
>Unfortunately those things are in direct conflict. Paper's renderer is at extreme end of "weak cull, perfect warp convergence".
Apparently we need some new Order-Independent Transparency algorithms that handle depth layers.
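To see why ordering on centres "shuffles", here is a toy sketch with made-up numbers: two overlapping splats at nearly the same depth are sorted back to front by the depth of their centres along the view direction and then blended. Turning the camera by a couple of degrees flips the sort order, and the blended pixel jumps instead of changing smoothly. All names and values are assumptions for illustration, not taken from any renderer mentioned here.

```python
import math

def sort_back_to_front(centers, view_dir):
    """Order splat indices by centre depth along `view_dir`, farthest first."""
    def depth(i):
        return sum(c * d for c, d in zip(centers[i], view_dir))
    return sorted(range(len(centers)), key=depth, reverse=True)

def composite(order, colors, alphas):
    """Back-to-front 'over' blending for a pixel covered by every splat."""
    out = 0.0
    for i in order:
        out = alphas[i] * colors[i] + (1.0 - alphas[i]) * out
    return out

centers = [(-1.0, 0.0, 5.0), (1.0, 0.0, 5.1)]   # two splats at similar depth
colors, alphas = [1.0, 0.0], [0.6, 0.6]          # white and black, same opacity

def pixel_for_yaw(yaw):
    """Pixel value when the camera looks along a direction yawed about y."""
    view_dir = (math.sin(yaw), 0.0, math.cos(yaw))
    return composite(sort_back_to_front(centers, view_dir), colors, alphas)

before = pixel_for_yaw(math.radians(-2.0))
after = pixel_for_yaw(math.radians(-4.0))
print(before, after)   # a 2 degree turn changes the pixel from 0.6 to 0.24
```

Sorting per pixel (or any order-independent scheme) would make the result depend on where the view ray actually crosses each splat rather than on the centres, which is what removes the jump.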
[1] https://x.com/charshenton/status/1710207169407447396?s=20