Docker runs a daemon with root privileges, and that daemon starts all containers. So when you start a container with "docker run -d ...", you are talking to a privileged process. That in turn means every spawned container can get root privileges on the host (for example, "docker run -v /etc/shadow ..." lets you change the root password of your host). "Rootless" actually means running the container process as a normal user, which gives less attack surface because there are fewer permissions. So if you ran "podman run -v /etc/shadow ..." as a normal user, you wouldn't have the permissions needed to open the file.
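To make the permission argument concrete, here is a minimal shell sketch that runs on the host with no container involved. It only asks whether the invoking user could read a file before it is ever bind-mounted. `can_read` is a hypothetical helper name, not part of Docker or Podman, and the sketch assumes a rootless runtime opens host files with that same user's permissions.

```shell
#!/bin/sh
# Hypothetical helper: report whether the current user may read a host file.
can_read() {
    if [ -r "$1" ]; then
        echo "readable"
    else
        echo "denied"
    fi
}

# Assumption: a rootless runtime acts with the invoking user's permissions,
# so a bind mount cannot grant more access than this check shows.
echo "uid $(id -u): /etc/shadow is $(can_read /etc/shadow)"
```

On a typical Linux host this should report "denied" for a normal user and "readable" under sudo, which is the difference between the two setups in a nutshell.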
As simple as possible:
Docker ("normally"):
every command inside the container runs with full root permissions on the host
$root -> Docker -> container
Docker/Podman ("rootless"):
every command runs as the current user
$user -> container
The other big piece is capabilities (specifically CAP_SYS_ADMIN), which, as I understand it, are related but kind of orthogonal to the question of root vs. rootless.
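A quick way to see that capabilities are a separate axis from the UID is to look at the effective capability mask of your own shell. This is a rough sketch, assuming a Linux /proc filesystem; `has_cap` is a hypothetical helper, and 21 is the bit number of CAP_SYS_ADMIN in <linux/capability.h>.

```shell
#!/bin/sh
# Hypothetical helper: succeed if capability bit $2 is set in hex mask $1.
has_cap() {
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

CAP_SYS_ADMIN=21   # bit number from <linux/capability.h>

# CapEff is the effective capability set of this shell, as a hex bitmask.
# Assumes a Linux /proc; elsewhere the check is skipped.
mask=""
if [ -r "/proc/$$/status" ]; then
    mask=$(awk '/^CapEff:/ { print $2 }' "/proc/$$/status")
fi

if [ -z "$mask" ]; then
    echo "cannot inspect capabilities on this system"
elif has_cap "$mask" "$CAP_SYS_ADMIN"; then
    echo "uid $(id -u): CAP_SYS_ADMIN present"
else
    echo "uid $(id -u): CAP_SYS_ADMIN absent"
fi
```

Running it on the host, inside a default container, and inside a privileged one should show that "am I uid 0" and "do I hold CAP_SYS_ADMIN" can have different answers.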
For example, buildah (the container-building part of podman) is daemonless and can use the fuse-overlayfs storage driver to build containers rootlessly: you appear as root inside the container, but from the outside, those processes and any files they create are owned by the original invoking user or by some shim UID/GID taken from a mapping table.
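The mapping table uses the kernel's uid_map layout, where each line reads "inside-start outside-start count". Here is a small sketch of the lookup, with a made-up table for a user with UID 1000 and a subordinate range starting at 100000. `map_uid` is a hypothetical helper and the numbers are illustrative assumptions, not read from a real system.

```shell
#!/bin/sh
# Hypothetical helper: translate a UID seen inside a user namespace into the
# host UID, given uid_map-style lines on stdin: "inside-start outside-start count".
map_uid() {
    awk -v uid="$1" '
        uid >= $1 && uid < $1 + $3 { print $2 + (uid - $1); found = 1; exit }
        END { if (!found) print "unmapped" }
    '
}

# Illustrative table (an assumption): container root is the invoking user
# 1000, and every other ID comes from a subordinate range.
table="0 1000 1
1 100000 65536"

echo "container uid 0  -> host uid $(echo "$table" | map_uid 0)"
echo "container uid 33 -> host uid $(echo "$table" | map_uid 33)"
```

To compare against a real table, "podman unshare cat /proc/self/uid_map" prints the mapping that rootless podman actually uses for your user.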
But critically, this doesn't mean you can just run buildah inside any Kubernetes pod and build a container there, because buildah needs to be able to start a user namespace and must have the /dev/fuse device mapped in. I believe work in this area is still ongoing (for example, Linux 5.11 allows overlayfs in unprivileged containers), but the issue tracking it [1] was closed without, IMO, really being fully resolved: the linked article [2] from July 2021 still describes the different scenarios as distinct special cases that each require their own set of flags, settings, mounts, or whatever.
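Those prerequisites can be probed from inside the pod before trying a build. The following is only a rough preflight sketch: `kernel_at_least` is a hypothetical helper, the 5.11 threshold is taken from the remark above, and passing these checks does not guarantee that buildah will actually work there.

```shell
#!/bin/sh
# Hypothetical helper: succeed if kernel release $1 is at least $2.$3.
kernel_at_least() {
    major=${1%%.*}
    rest=${1#*.}
    minor=${rest%%[!0-9]*}
    case $major in ''|*[!0-9]*) return 1 ;; esac
    [ "$major" -gt "$2" ] || { [ "$major" -eq "$2" ] && [ "${minor:-0}" -ge "$3" ]; }
}

# fuse-overlayfs needs the FUSE character device to be visible.
if [ -c /dev/fuse ]; then echo "/dev/fuse: present"; else echo "/dev/fuse: missing"; fi

# A limit of 0 means no user namespaces can be created at all.
if [ -r /proc/sys/user/max_user_namespaces ]; then
    echo "max_user_namespaces: $(cat /proc/sys/user/max_user_namespaces)"
fi

# Kernel version relevant to overlayfs in unprivileged containers.
if kernel_at_least "$(uname -r)" 5 11; then
    echo "kernel $(uname -r): 5.11 or newer"
else
    echo "kernel $(uname -r): older than 5.11"
fi
```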
Yup, and because of that mapping table, the process inside the container is not allowed to create another namespace and/or set up fuse-overlayfs. That's why you need to mount /dev/fuse into the container (you might also need CAP_SYS_ADMIN and CAP_MKNOD). There is another link from Red Hat which also explains it:
> As simple as possible:
> Docker ("normally"): run every command inside container with full root permissions on host
> $root -> Docker -> container
> Docker/Podman ("rootless"): run every command as the current user
> $user -> container
Maybe take a look here for a better explanation: https://docs.docker.com/engine/security/#docker-daemon-attac...