I think they are doing that because, with real images, the model changes the face. That problem goes away if the initial image doesn't show the face.
This generation seems to be getting its performance from more power and more cores. Not really an architectural change, just packing more things into the chip that require more power.
Too true. I've been looking to replace my 1080. It was a beast in 2016, but the only way I can get a more performant card these days is to double the power draw. That's not really progress.
Then get a modern GPU and limit its power to what your 1080 draws. It will still be significantly faster. GPU power is out of control these days; if you knock 10% off the power budget you generally only lose a few percent of performance.
Cutting the 5090 down from 575W to 400W is a 10% perf decrease.
The 5090 was an example; the same process applies to lower-tier GPUs that don't require extra power cables. E.g. a 3080 with the same power budget as a 1080 would run circles around it (a 1080 with the default max power limit of 180W gets approx 7000 in TimeSpy, a 3090 limited to 150W gets approx 11500). Limiting the power budget is very simple with tools such as MSI Afterburner and others in the same space.
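If you're on Linux or a headless box, you can do the same without Afterburner. A minimal sketch of my own, assuming the nvidia-ml-py (pynvml) bindings, a card/driver that allows software power limits, and admin rights; the 150W target is just the 1080-class budget from above:

    # Sketch: cap a GPU's software power limit via NVML (nvidia-ml-py).
    # Assumes the card/driver supports changing the limit and that the
    # process runs with sufficient privileges. Values are illustrative.
    import pynvml

    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)               # first GPU

    current_mw = pynvml.nvmlDeviceGetPowerManagementLimit(handle)
    print(f"current limit: {current_mw / 1000:.0f} W")

    pynvml.nvmlDeviceSetPowerManagementLimit(handle, 150_000)   # 150 W, in milliwatts

    pynvml.nvmlShutdown()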
Gaussian splatting transforms images into a point cloud. GPUs can render these points, but it is a very slow process; you need to convert the point cloud into meshes. So basically it is the initial step to capture environments before converting them into 3D meshes that GPUs can use for whatever you want. It is much cheaper to use pictures to get a 3D representation of an object or environment than to buy professional equipment.
> Gaussian splatting transforms images into a point cloud.
Not exactly. The "splats" are spread out in space (big ellipsoids), partially transparent (what you end up seeing is the composite of all the splats visible in a given direction), AND view dependent (they render differently depending on the direction you are looking from).
Also, there's no simple spatial relationship between splats and solid objects. The resulting surfaces are a kind of optical illusion produced by all the splats you're seeing in a specific direction. (Some methods have attempted to lock splats more closely to the surfaces they are meant to represent, but I don't know what the tradeoffs are.)
Generating a mesh from splats is possible but then you've thrown away everything that makes a splat special. You're back to shitty photogrammetry. All the clever stuff (which is a kind of radiance capture) is gone.
Splats are a lot faster to render than NeRFs, which is their appeal. But they are heavier than triangles due to having to sort them every frame (because transparent objects don't composite correctly without depth sorting).
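To make the sorting point concrete, here is a toy sketch of my own (not from any particular renderer) of per-view depth sorting plus front-to-back alpha compositing. Splat centers, colors and opacities are assumed as inputs; a real rasterizer does this per pixel with projected 2D Gaussians:

    # Sketch: why splats must be depth sorted every frame.
    # Front-to-back "over" compositing along one viewing direction.
    import numpy as np

    def composite_along_view(centers, colors, alphas, cam_pos, view_dir):
        depth = (centers - cam_pos) @ view_dir     # distance along the view direction
        order = np.argsort(depth)                  # nearest splat first
        color = np.zeros(3)
        transmittance = 1.0                        # light not yet absorbed
        for i in order:
            color += transmittance * alphas[i] * colors[i]
            transmittance *= 1.0 - alphas[i]
            if transmittance < 1e-4:               # early exit once nearly opaque
                break
        return color

    # Toy usage: random splats, camera at the origin looking down +z.
    rng = np.random.default_rng(0)
    print(composite_along_view(rng.uniform(-1, 1, (100, 3)),
                               rng.uniform(0, 1, (100, 3)),
                               rng.uniform(0.05, 0.6, 100),
                               cam_pos=np.zeros(3),
                               view_dir=np.array([0.0, 0.0, 1.0])))

Move the camera and the order changes, which is why the sort cannot be precomputed.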
Minor nit — in what way do splats render differently depending on direction of looking? To my mind these are probabilistic ellipsoids in 3D (or 4D for motion splats) space, and so while any novel view will see a slightly different shape, that’s an artifact of the view changing, not the splat. Do I understand it (or you) correctly?
Basically, for each Gaussian there is a set of spherical harmonics (SH) coefficients, and those are used to calculate what color should be rendered depending on the viewing angle of the camera. The SH coefficients are optimized through gradient descent just like the other parameters, including position and shape.
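The evaluation itself is just a small polynomial in the view direction. A degree-1 sketch of my own below; the constants are the standard real SH basis factors, and the +0.5 offset follows the reference 3DGS convention as I understand it:

    # Sketch: view-dependent color from degree-1 spherical harmonics.
    # Each Gaussian stores one RGB triple per SH basis function; the view
    # direction is the unit vector from the camera towards the splat.
    import numpy as np

    SH_C0 = 0.28209479177387814   # Y_0^0
    SH_C1 = 0.4886025119029199    # degree-1 factor

    def sh_to_color(coeffs, view_dir):
        # coeffs: (4, 3) array, one RGB triple per basis function
        x, y, z = view_dir / np.linalg.norm(view_dir)
        color = (SH_C0 * coeffs[0]
                 - SH_C1 * y * coeffs[1]
                 + SH_C1 * z * coeffs[2]
                 - SH_C1 * x * coeffs[3])
        return np.clip(color + 0.5, 0.0, 1.0)

    print(sh_to_color(np.random.rand(4, 3), np.array([0.0, 0.0, 1.0])))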
Basically you train a model for each set of images. The model is a neural network able to render the final image; different sets of images require different trained models. The initial Gaussian splatting models took hours to train, last year's models took minutes. I am not sure how long this one takes, but it should be somewhere between minutes and hours (and probably closer to minutes than hours).
No, what you're describing is NeRF, the predecessor technology.
The output of Gaussian Splat "training" is a set of 3d gaussians, which can be rendered very quickly. No ML involved at all (only optimisation)!
They usually require running COLMAP first (to estimate the relative camera positions between the different images), but NVIDIA's InstantSplat doesn't (it does, however, use an ML model instead!)
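For what it's worth, the per-scene "training" is just gradient descent on the splat parameters themselves. A toy 2D sketch of my own below (fitting a few isotropic Gaussians to a target image); the real pipeline adds a differentiable tile rasterizer, anisotropic 3D Gaussians, SH colors, and densification/pruning:

    # Sketch: Gaussian-splat fitting is plain optimisation, no neural network.
    # Toy 2D version: fit N isotropic Gaussians to a 32x32 target image.
    import torch

    H, W, N = 32, 32, 16
    ys, xs = torch.meshgrid(torch.linspace(0, 1, H), torch.linspace(0, 1, W), indexing="ij")
    pixels = torch.stack([xs, ys], dim=-1)                        # (H, W, 2)

    # Learnable splat parameters: position, log-scale, color, opacity logit.
    pos     = torch.rand(N, 2, requires_grad=True)
    logsig  = torch.full((N,), -3.0, requires_grad=True)
    color   = torch.rand(N, 3, requires_grad=True)
    logit_a = torch.zeros(N, requires_grad=True)

    def render():
        d2 = ((pixels[:, :, None, :] - pos) ** 2).sum(-1)         # (H, W, N)
        w = torch.sigmoid(logit_a) * torch.exp(-d2 / (2 * torch.exp(logsig) ** 2))
        return (w[..., None] * color).sum(2).clamp(0, 1)          # (H, W, 3)

    target = torch.zeros(H, W, 3)
    target[8:24, 8:24, 0] = 1.0          # a red square standing in for a photo

    opt = torch.optim.Adam([pos, logsig, color, logit_a], lr=0.05)
    for step in range(300):
        loss = ((render() - target) ** 2).mean()
        opt.zero_grad(); loss.backward(); opt.step()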
You are not going to take the expensive human out of the loop where downside risk is high. You are likely to take the human out of the loop only in low risk low cost operations to begin with. For those use cases, these models are quite expensive.
My problem with this analysis is that it ignores who is using which computer. So far, new people in the company get the M3, older employees have an M2, and the people who have been at the company the longest have an M1. Who is going to work on more critical tasks with more changes in the code? Who is going to work mostly on easy bugs until they get some experience with the company's code? I bet that if you gave both populations the same computer, the compile times would still be faster for the new people.

For me the analysis doesn't have enough dimensions; it should take into account the time since the person was hired and their seniority. I would also have added more types of graphs (boxplots seem a better way to compare the information), and I would have measured the total % of CPU usage. The battery/AC analysis gave me the impression that the M3 might be underutilized and that it is going to be impossible to get lower compile times without faster single-core speeds (which might be relevant information for the future).
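For example, a grouped boxplot along these lines (a sketch of my own, column names are hypothetical) would separate the machine effect from the tenure effect:

    # Sketch: compile time per machine, split by tenure, so the hardware
    # effect isn't confounded with who gets which laptop.
    # Hypothetical columns: machine ("M1"/"M2"/"M3"), tenure_years, compile_s.
    import pandas as pd
    import matplotlib.pyplot as plt

    df = pd.read_csv("compile_times.csv")
    df["tenure"] = pd.cut(df["tenure_years"], [0, 1, 3, 10], labels=["<1y", "1-3y", ">3y"])

    ax = df.boxplot(column="compile_s", by=["machine", "tenure"], rot=45)
    ax.set_ylabel("compile time (s)")
    plt.suptitle("")                     # drop pandas' automatic group title
    plt.tight_layout()
    plt.show()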
I think Kotlin is one example. It uses the same idea, but it uses multiples of 10 for incremental releases and the numbers 1 to 9 for hotfixes. That's it for the 3rd number; I do not know what will happen when the second number reaches 2 digits. I guess they will do something to make it comparable again.
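As I understand the scheme described above, that keeps patch numbers numerically comparable; a quick illustration (the exact versions are just for show):

    # Sketch: multiples of 10 for incremental releases (x.y.0, x.y.10, x.y.20),
    # +1..9 on top for hotfixes (x.y.21, x.y.22), so plain numeric
    # comparison still orders them correctly.
    def key(version: str) -> tuple[int, ...]:
        return tuple(int(part) for part in version.split("."))

    releases = ["1.9.22", "1.9.0", "1.9.20", "1.9.10", "1.9.21"]
    print(sorted(releases, key=key))
    # ['1.9.0', '1.9.10', '1.9.20', '1.9.21', '1.9.22']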
You are assuming that the whole point of humanity's existence is to work? Because, without working, they would be sloths?
What about spending more time on healthy habits like working out, meeting with family and friends more often, discovering the world, and learning new things?
So retired people are just sloths?
I’m more worried about people not being able to feed themselves because their labor became worthless. They will effectively be frozen out of the economy as they have nothing to trade with.
If AI does everything, the economy won't make sense anymore. Maybe there will be a basic income, or everyone will just ask for what they want and AI will provide it.
We thought AI would replace the low-level jobs first, but it seems the creative jobs are going first (art, software development, etc.). Bear that in mind.
No matter who gets replaced first, someone is getting screwed.
Frankly, if it's the higher-end jobs getting replaced first, that would likely spill over to the lower-end ones as those people who lost their jobs resort to taking lower-end work to survive, flooding the market.
That's under the assumption that nothing else will change. But that is not the case; the system would have to adapt. One possibility is that we won't use money anymore, and there are a lot of in-betweens. But what you certainly cannot do is stop the change that is coming.
Frankly, I think eventually machines will do it all. I see AGI as the universal automation that can do everything a human can - apart from “being human”.
> How many working horses are there today vs before automobiles?
Well, exactly. The fewer working horses there are, the more expensive and exclusive it becomes to ride them.
There could be 1 trillion automobiles and I bet you, none of these automobiles would compare to riding a real, live horse.
Similarly, there could be 1 trillion AI robots, they could do everything better than a human, and yet I bet you'd still want to ride (or otherwise experience) a real, live human.
My point is that if automobiles were always better than horses in every way, then nobody would want horses. But even today, with the amazing automobiles that we have, some of which even faster and more reliable than most horses, it's clear that we still want horses.
My question is, if horses were as intelligent as us and they could have their basic needs met extremely cheaply, would they be willing to work at all, apart from the occasional ride? Because the horse labor pool would shrink immensely if they didn't really want to work.
Time is the fourth dimension. The input data is a video, so the model learns the colors and the positions of the elements (basically points). You can render the scene from any angle at any time once the model is trained.
SEEKING WORK | Spain | Remote (EU and US time zones)
Technologies: TensorFlow, PyTorch, deep learning, LLMs, diffusion models, GANs
Résumé/CV: http://jorgemf.github.io/cv.pdf
Personal website: http://jorgemf.github.io
Email: (in the CV)