The Kubernetes scheduler also does resource-aware scheduling. You're correct that it tries to spread replicas across nodes, but it only spreads them across the nodes that have enough free resources for the container (more precisely, Pod) that it's scheduling.
Currently resource requirements are specified only on containers, not Pods. The requirements for the Pod are computed by adding up the requirements of the containers within the Pod.
To be more concrete: within the PodSpec type that you linked to, there is a field of type []Container, and each Container has a Resources field of type ResourceRequirements, which is where you specify that container's resource requirements. The Pod's requirements are then the sum of those per-container requirements.
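As a rough sketch (the pod name, images, and values here are illustrative, assuming the v1 API shape), a two-container Pod might declare per-container requests like this, and the scheduler would treat the Pod as needing the sum, i.e. 300m CPU and 192Mi of memory:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-pod        # illustrative name
spec:
  containers:
  - name: app
    image: nginx           # illustrative image
    resources:
      requests:
        cpu: 200m
        memory: 128Mi
  - name: sidecar
    image: busybox         # illustrative image
    resources:
      requests:
        cpu: 100m
        memory: 64Mi
# scheduler's view of the Pod: cpu 300m, memory 192Mi
```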
In addition to resource-based scheduling, we also support "label selectors," which let you label nodes with key/value pairs and then say that a Pod should only run on nodes with particular labels. That's specified in the NodeSelector field of the PodSpec (which you linked to).
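A minimal sketch of that field (the label key/value and image are illustrative): after labeling a node with `kubectl label nodes <node-name> disktype=ssd`, a Pod like this will only be scheduled onto nodes carrying that label:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: labeled-pod        # illustrative name
spec:
  nodeSelector:
    disktype: ssd          # only run on nodes labeled disktype=ssd
  containers:
  - name: app
    image: nginx           # illustrative image
```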
The Kubernetes scheduler looks at "fit" and resource availability to determine the node a Pod will run on. Nodes can also have labels like "high-mem" or "ssd", so you can request a particular type of server (via the nodeSelector field). More details are in the link above.
The page you linked to describes a slightly different feature, namely the ability to restrict and override the resource requirements of Pods at the time they are submitted to the system. So it's part of the admission control system, not part of the scheduling.
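For reference, a hedged sketch of that admission-time mechanism: a LimitRange object (assuming the v1 API shape; the namespace and values are illustrative) that both caps what containers in a namespace may request and fills in defaults for containers that specify nothing:

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: resource-limits
  namespace: dev           # illustrative namespace
spec:
  limits:
  - type: Container
    default:               # applied when a container specifies no limit
      cpu: 500m
      memory: 256Mi
    max:                   # Pods asking for more are rejected at admission
      cpu: "1"
      memory: 1Gi
```

This runs at object-creation time, which is why it belongs to admission control rather than the scheduler.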
Thanks davidooo - I was specifically referring to the section on "limits at the point of creation" which gives a practical example of using limits in a multi-namespace (multi-tenant) environment. (https://github.com/GoogleCloudPlatform/kubernetes/blob/maste...).
The new documentation you linked to has good explanations in it as well.