You actually said "standardized VM," which I took to mean "one VM upon which everybody standardizes."
But even now, it's unclear to me what you want standardized about this VM, if not its input and output (we already have that with JavaScript VMs) and not its implementation. What are you asking for?
A VM with standardized input and output is exactly what I meant. The underlying implementations of such VMs could still differ, but they would all implement the same bytecode standard (much as HotSpot and Dalvik both implement the JVM bytecode standard).
But JavaScript VMs already have standardized input and output. The input is not a bytecode, but it is a standardized input with standardized output. So we already have what you want, except that the input format is different from what you envisioned. It seems to me that's the whole idea behind asm.js: they're defining a subset of the language with precise low-level semantics that can be implemented very efficiently, so that we get most of the benefit of a bytecode without throwing out all that we have now.
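To make the "subset of the language" point concrete, here is a minimal sketch of what code in the asm.js style looks like. The names `AddModule` and `add` are made up for illustration, and this is only my reading of the style, not an excerpt from the spec. The point is that it is plain JavaScript: an engine that knows nothing about asm.js still runs it and gets the same answers.

```javascript
// A tiny module in the asm.js style. `AddModule` and `add` are
// hypothetical names chosen for this sketch.
function AddModule(stdlib, foreign, heap) {
  "use asm";

  // The `| 0` coercions double as type annotations: both parameters
  // and the return value are treated as 32-bit signed integers.
  function add(a, b) {
    a = a | 0;
    b = b | 0;
    return (a + b) | 0;
  }

  return { add: add };
}

// Linking: pass the global object, an (empty) foreign-function table,
// and a heap buffer. This module never reads the heap.
var mod = AddModule(globalThis, {}, new ArrayBuffer(0x10000));

var sum = mod.add(2, 3);               // ordinary small integers
var wrapped = mod.add(2147483647, 1);  // wraps around like a 32-bit int

console.log(sum, wrapped);
```

The second call shows the low-level semantics: the result wraps around to the most negative 32-bit integer instead of growing into a floating-point value, which is the kind of guarantee a bytecode would otherwise have to provide.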