A standalone x32 binary will run fine on an x64 machine. But if you want to link to any libraries, those libraries will also have to be x32. So an x64 system, which probably has 32-bit legacy libs as well as normal 64-bit ones, will also need a complete set of x32 libs for x32 to be practical.
Sure. But after the porting work has been done by the distribution vendor, it's done. The package manager software should be able to do whatever is necessary to almost transparently ensure any necessary dependencies are installed for the required sub-architecture. So I would imagine that in most cases, end-users won't notice any hassle except having the option to choose between x32 and x86_64 per package during installation. I think that sounds kind of neat :)
OK, modulo taking up extra space on already-cramped CD distros, taking more time to download updates, taking up more space on production hard disks, and having to download a new version of the software if your dataset grows over 4GB, it sounds good. :)
And extra memory taken up at runtime by having to load the other versions of the libraries. And the I/O costs of reading them in. I'd think that'd outweigh the performance benefits many times over in nearly all cases.
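For anyone wondering where the 4GB figure mentioned above comes from: x32 uses 32-bit pointers, so a process can address at most 2^32 bytes. Here is a rough sketch of that arithmetic in Python. The helper names (`address_space_bytes`, `fits`) are made up for illustration, and the check is only an upper bound, since code, stack and shared libraries eat into the same address space.

```python
import struct

GIB = 2 ** 30  # bytes in one gibibyte


def address_space_bytes(pointer_bits: int) -> int:
    """Theoretical upper bound on bytes addressable with pointers of this width."""
    return 2 ** pointer_bits


def fits(dataset_bytes: int, pointer_bits: int) -> bool:
    """True if the dataset is strictly smaller than the addressable range.

    Optimistic on purpose: a real process also needs room for its code,
    stack and libraries, so the practical limit is lower.
    """
    return dataset_bytes < address_space_bytes(pointer_bits)


# Pointer width of the interpreter running this script, for comparison.
native_pointer_bits = struct.calcsize("P") * 8

if __name__ == "__main__":
    dataset = 5 * GIB  # hypothetical dataset size
    for abi, bits in (("x32", 32), ("x86_64", 64)):
        limit_gib = address_space_bytes(bits) // GIB
        print(f"{abi}: limit {limit_gib} GiB, 5 GiB dataset fits: {fits(dataset, bits)}")
    print(f"this interpreter uses {native_pointer_bits}-bit pointers")
```

So a dataset that creeps past 4 GiB can no longer be held in a single x32 process, which is the point at which you would have to switch to the x86_64 build of the package.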