It's measuring single-connection TCP performance, which is already difficult to optimize. With jumbo frames and tuned buffer sizes I'd expect the number to go higher, but a single connection will likely still be limited to what one CPU core can process. Using multiple connections in parallel should give a better picture of the available link bandwidth.
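To make the single-versus-multiple-connection comparison concrete, here is a minimal sketch of how one might measure aggregate throughput over several parallel TCP streams. It is only an illustration under stated assumptions: the names `measure`, `_send`, `_drain`, and `CHUNK` are made up for this example, and it runs sender and receiver on loopback, which does not exercise a real NIC. For a real link test the two ends would sit on different hosts, and Python's own overhead may become the limit before the link does.

```python
import socket
import threading
import time

CHUNK = 64 * 1024  # bytes per send/recv call (arbitrary choice for this sketch)


def _drain(conn, counts, idx):
    """Read from one connection until the peer closes; record bytes received."""
    total = 0
    with conn:
        while True:
            data = conn.recv(CHUNK)
            if not data:
                break
            total += len(data)
    counts[idx] = total


def _send(addr, nbytes):
    """Open one TCP connection and push nbytes of zeros through it."""
    payload = bytes(CHUNK)
    with socket.create_connection(addr) as s:
        sent = 0
        while sent < nbytes:
            n = min(CHUNK, nbytes - sent)
            s.sendall(payload[:n])
            sent += n


def measure(num_conns, bytes_per_conn, host="127.0.0.1"):
    """Return (total_bytes_received, seconds) for num_conns parallel streams."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((host, 0))          # port 0: let the OS pick a free port
    srv.listen(num_conns)
    addr = srv.getsockname()
    counts = [0] * num_conns

    senders = [
        threading.Thread(target=_send, args=(addr, bytes_per_conn))
        for _ in range(num_conns)
    ]
    start = time.perf_counter()
    for t in senders:
        t.start()

    # One receiver thread per accepted connection.
    receivers = []
    for i in range(num_conns):
        conn, _ = srv.accept()
        t = threading.Thread(target=_drain, args=(conn, counts, i))
        t.start()
        receivers.append(t)

    for t in senders + receivers:
        t.join()
    elapsed = time.perf_counter() - start
    srv.close()
    return sum(counts), elapsed


if __name__ == "__main__":
    for n in (1, 4):
        total, secs = measure(n, 20 * 1024 * 1024)
        print(f"{n} connection(s): {total * 8 / secs / 1e9:.2f} Gbit/s")
```

If the aggregate figure climbs noticeably as the connection count goes up, that suggests the single-stream number was bounded by per-connection processing rather than by the link itself.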