The Mill [1] does/will do that: When memory is allocated, it is served directly be the cache, which implicitly zeroes the respective cache line (without involving the main memory).
A similar mechanism zeroes a stackframe upon a function call (and function return, I think), which eliminates most attacks that exploit buffer-overflows.
So a Mill CPU can never do shared-memory process parallelization? Two or more processes accessing the same memory could be a hazard. Seems like it would suffer the same issue IA-64 has with parallelism. However, the conveyor belt analogy simplifies register spilling & related issues.
A similar mechanism zeroes a stackframe upon a function call (and function return, I think), which eliminates most attacks that exploit buffer-overflows.
[1] http://millcomputing.com/docs/memory/