This also helps in understanding why template code completely wrecks compile times and RAM usage, since the compiler can't share template instantiations. This becomes very relevant if template metaprogramming is used in bigger projects.
That's not why. Template processing happens while the translation unit itself is being processed, so it would be expensive even if you only had a single TU in your build. It's true that a template has to be reprocessed for each instantiation; however, that's not merely from one TU to the next, but even inside each TU! For every distinct T used with std::vector<T> in a single TU, the compiler has to instantiate std::vector<T> (at least every member that actually gets used) and generate code for it all over again.
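
To make the "even inside each TU" part concrete, here is a minimal sketch of a single file that ends up with two separate instantiations. The names calls_for, push_and_count and demo are made up for illustration and are not from any library. The only assumption is that a function-local static is a convenient way to observe that each distinct T gets its own copy of everything.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical helper: every distinct T produces its own instantiation of
// this function template, and therefore its own static counter.
template <typename T>
std::size_t& calls_for() {
    static std::size_t calls = 0;
    return calls;
}

// Hypothetical helper: using push_back forces the compiler to instantiate
// std::vector<T>::push_back (and whatever it depends on) for this T.
template <typename T>
std::size_t push_and_count(std::vector<T>& v, T value) {
    v.push_back(value);
    return ++calls_for<T>();
}

// The two specializations are unrelated types as far as the compiler is concerned.
static_assert(!std::is_same<std::vector<int>, std::vector<double>>::value,
              "std::vector<int> and std::vector<double> are distinct types");

// Everything below lives in one TU, yet the vector code is instantiated twice:
// once for int and once for double.
inline void demo() {
    std::vector<int> ints;
    std::vector<double> doubles;

    const std::size_t first_int = push_and_count(ints, 1);
    const std::size_t second_int = push_and_count(ints, 2);
    const std::size_t first_double = push_and_count(doubles, 1.0);

    assert(first_int == 1 && second_int == 2);  // same instantiation, shared counter
    assert(first_double == 1);                  // separate instantiation, fresh counter
    (void)first_int; (void)second_int; (void)first_double;
}
```

Calling demo() once from main is enough to see it: the int counter ends at 2 and the double counter at 1, because the compiler generated push_and_count, calls_for and the std::vector members it touches separately for each T, all within the same translation unit.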