The M1 is in a product segment where discrete GPUs have been gone for decades, in favor of integrated graphics that share one pool of RAM with the CPU. The better question to ask is why Apple kept the unified memory design even when moving up to larger chips like the M1 Max and M1 Ultra.
The GPU is built into the same physical die as the CPU.
So if you wanted to give it a separate RAM pool, you would have to add an entire second memory interface just for the on-die GPU.
All you've done is make the chip more complicated and slower, since data now has to be copied between the two pools, and what exactly have you gained?
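To make the "one pool" point concrete, here is a minimal sketch using Metal's shared storage mode. This is an illustration I'm adding, not something from the original comment: with `.storageModeShared`, the CPU writes through a plain pointer into the very same allocation a GPU kernel would read, so there is no upload or download step like you'd have with a discrete card.

```swift
import Metal

// Grab the default GPU. On Apple silicon this is the on-die GPU
// sharing the unified memory pool with the CPU.
guard let device = MTLCreateSystemDefaultDevice() else {
    fatalError("No Metal device available on this machine")
}

// Allocate one buffer in shared storage: CPU and GPU both see
// this same physical memory, no staging copy required.
let count = 4
let buffer = device.makeBuffer(length: count * MemoryLayout<Float>.stride,
                               options: .storageModeShared)!

// The CPU writes directly through a raw pointer...
let ptr = buffer.contents().bindMemory(to: Float.self, capacity: count)
for i in 0..<count { ptr[i] = Float(i) }

// ...and any GPU kernel handed this MTLBuffer reads those same
// bytes, with no explicit transfer in either direction.
print(ptr[0], ptr[3])
```

Contrast that with a discrete GPU, where the same data would have to cross a bus into VRAM before the GPU could touch it; that transfer is exactly the cost the two-pool design would reintroduce.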
I think it was a clear and obvious decision to make. It's an outgrowth of how the base chips were designed, and it turned out to be extremely handy for some workloads. Plus, since all their modern devices now work this way, it probably simplifies the software.
I’m not saying it’s genius foresight, but it certainly worked out rather well. There’s nothing stopping them from supporting discrete GPUs too if they wanted to. They just clearly don’t.
Apple debuted dedicated machine learning hardware in 2017 with the Neural Engine on iPhones. While I don’t think they predicted the LLM explosion in particular, they knew machine learning was important and they have been allowing that to influence hardware design.
Apple has always liked to integrate as much as possible onto the same chip. It was only natural that they would arrive at this design, with the improved performance as the cherry on top.