Traditionally, replication hasn't been done in psychology because the experiments have to be set up in really clever, complicated ways in order to tease out effects. Even the most dedicated replicator can't ever fully replicate a psychology experiment. You'd have to ship the actors used as confederates across the country to do that, and of course nobody does.
Now wait a second. When a psychology study comes out, it claims "This experiment shows stereotype threat reduces the performance of Honduran men on a math test." Such studies rarely claim "This experiment shows stereotype threat reduces the performance of Honduran men on a math test when Jill the skinny experimenter repeats the code words."
If using different experimenters yields a different result, it means the effect being described is probably not as robust as the experimenters claim. The cause might be Jill's shifty eyes rather than stereotype threat.
That's a replication failure, in the sense that it shows the claimed effect is far weaker (or causally different) than the original study claimed.
I would say that it's typically assumed that there are confounding factors in any experiment. An experimenter has to try to minimize them, but you're still not going to be able to fully replicate an experiment.
If you have more than one researcher administering the test in a stereotype threat experiment, that reduces the likelihood that any one researcher is the confound. If other studies (not replications, but their own experiments, with their own, different approaches to studying the problem) agree with the first experiment, the effect is likely real.
I would also say that you're viewing the entire system far too antagonistically. Nobody goes into academic psychology to get rich off of it.
But if you're not minimizing the confounding factors enough that the result can be repeated, your experiment failed. Your hypothesis has been neither confirmed nor refuted. At that point you are, at best, still gathering evidence, and publishing results would be an error.