I dunno. I think this idea occurs to everyone who understands unit testing and then encounters REPLs; it certainly occurred to me under those circumstances and I got excited about it for a while too. Over time, though, it has struck me as less and less obviously good. You're right about where the two approaches to programming overlap, and I agree with you that REPL > tests in those areas, but there's also considerable territory where they don't overlap. I suspect that non-overlapping territory (the xor of the two) represents an impedance mismatch that makes "test capture" less feasible than it seems at first.
I don't mean to pour cold water on the idea, though; if someone figures out a way of doing it that's useful I'd happily change my mind.