What exactly is stopping ARM boards from having high IO?
Just allow people to use a normal 2.5" SSD or even M.2 and you have a really nice machine!
For example, the latest Raspberry Pi 2 has a quad-core CPU which is faster than my 1st machine (and maybe even my 2nd and 3rd) but is severely handicapped by the IO speed, which is limited to USB (plus, all the USB ports and Ethernet use the same controller, for extra slowness).
A quad-core ARM with a RAM slot supporting at least 8GB, a gigabit Ethernet port, 4 separate USB ports and a SATA connection would be glorious.
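To put a rough number on the shared-controller complaint, here is a back-of-the-envelope sketch in Python. It assumes the shared link is a single USB 2.0 high-speed bus (480 Mbit/s signaling). The `efficiency` factor and the function name are illustrative assumptions, not measured figures, and real throughput varies by device and driver.

```python
USB2_SIGNALING_MBIT = 480  # USB 2.0 high-speed signaling rate, in Mbit/s


def per_device_ceiling_mbyte(active_devices, efficiency=0.6):
    """Rough upper bound in MB/s for each device sharing one USB 2.0 bus.

    `efficiency` is an assumed fraction of the signaling rate left over
    after protocol overhead; it is a guess, not a measurement.
    """
    if active_devices < 1:
        raise ValueError("need at least one active device")
    usable_mbyte = USB2_SIGNALING_MBIT * efficiency / 8  # bits to bytes
    return usable_mbyte / active_devices


if __name__ == "__main__":
    # e.g. a USB disk and the Ethernet adapter both busy at once
    for n in (1, 2, 4):
        print(n, "device(s):", round(per_device_ceiling_mbyte(n), 1), "MB/s each")
```

Under those assumptions, a disk and the network port active at the same time each get well under what even a cheap SATA SSD can sustain, which is the bottleneck being described.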
The Raspberry Pi CPUs were designed for smartphones, so they have a limited USB port that was only intended to support that use case.
To the extent these sorts of boards use CPUs that are cheap because they are made in high volume for other applications, we'll keep running into IO limitations for the uses we have in mind.
For 32-bit ARM, it seems to come down to shortcuts in the SoC design; in other words, there's a reason some boards cost under $50.
There exist some designs with real SATA (e.g. Cubietruck) but those have poor CPUs (Cortex-A7), and there exist many designs with SATA-via-USB, which is never going to be fast. Also, micro SD cards have abysmal performance for the sort of random I/O that operating system root filesystems generate.
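If you want to see the random-versus-sequential gap on your own board, a minimal Python sketch along these lines can help. The function names, block size, and file size are all hypothetical choices for illustration. Note that the OS page cache will hide most of the difference unless the test file is larger than RAM or the cache is dropped first, so treat this as a starting point rather than a proper benchmark.

```python
import os
import random
import tempfile
import time


def make_test_file(size_bytes, directory=None):
    """Create a temporary file of random bytes and return its path."""
    fd, path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        f.write(os.urandom(size_bytes))
    return path


def time_reads(path, block_size=4096, shuffle=False, seed=0):
    """Read the file block by block; return (elapsed_seconds, bytes_read).

    With shuffle=True the blocks are visited in a random order, which is
    closer to what a root filesystem does than a straight sequential scan.
    """
    size = os.path.getsize(path)
    offsets = list(range(0, size - block_size + 1, block_size))
    if shuffle:
        random.Random(seed).shuffle(offsets)
    total = 0
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:  # unbuffered at the Python level
        for offset in offsets:
            f.seek(offset)
            total += len(f.read(block_size))
    return time.perf_counter() - start, total


if __name__ == "__main__":
    # Point `directory` at the SD card or SSD mount you want to test.
    path = make_test_file(64 * 1024 * 1024)
    try:
        print("sequential:", time_reads(path))
        print("random:    ", time_reads(path, shuffle=True))
    finally:
        os.remove(path)
```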
Luckily the situation in 64 bit ARM server land is much better. The APM Mustang and AMD designs have a combination of fast cores and properly engineered I/O subsystems. Real SATA, multiple 10gigE and 1gigE interfaces, PCIe, etc.
I would say it is not so much shortcuts as the fact that the 32-bit SoCs are generally designed for mobile. The only way to get a cheap devboard is to use an SoC which is being produced in high volume, so it doesn't have a prohibitive cost. That means you get the peripherals that a mobile phone or tablet wants, and not the ones that it doesn't, typically. If you insist on using an SoC custom designed for a devboard then the board is probably going to be tens of thousands of dollars.
64-bit is better because there are SoCs directly targeting server use cases, which therefore have the kind of peripherals you'd prefer to see in devboards.
An SoC with 8GB onboard is a pretty niche chip, but other than that it's all out there. The Raspberry Pi 2 is actually a pretty poor offering - you can do better, even at the same price. The Pi Foundation just does a better job of marketing.
Specific models are the ODroid XU4, the CubieBoard or Cubietruck, the Banana Pi, etc.
They're not cheap, but the Jetson TK1 is an extremely powerful offering for this category based on the Tegra K1 SoC. You get SATA, USB 3.0, mPCIE, 1GigE, GPU with 192 Kepler cores, etc. If you need some grunt you can even do CUDA compute. Again, I'm still waiting for my Tegra-X1 based Jetson, NVIDIA :<
Demand, mostly. Also, the way ARM does I/O is a bit different. That said, a memory-mapped "smart" disk controller is pretty straightforward; the trick is getting access to the memory bus crossbar switch inside the chip. If you just bolt one on, all of your peripherals go through one memory bus.
What would be really interesting is if ARM introduces a way to use a high speed serial bus (think PCIe) which would allow for the development of a 'south bridge' type IO chipset for ARM machines.
Because they're usually based on mobile phone or similar SoCs. However, at least 4 companies are promising to deliver server-class chips this year - Cavium, Avago (ex Broadcom), Qualcomm, and AMD.
(Edited to add) Incidentally, most of the above are designed to have more than 8 cores. Cavium's ThunderX has 48, for example. Both Cavium and Broadcom previously delivered many-core MIPS designs, so this isn't vaporware either.