In 1993, I was taking a course in Software Verification & Validation at the University of Houston-Clear Lake, next door to Johnson Space Center and down the street from the IBM division doing the Space Shuttle software. That is the group that is CMM Level 5 and gets a bug report about once a year or so. It happened that the instructor of this course was a mid-level IBM tech guy in that organization. And he had stories...
After Challenger blew up, NASA demanded that every shuttle vendor report the cumulative probability that their component of the system would lead to a loss of vehicle accident. NASA took all those probabilities and came up with their best guess of the probability of a loss of vehicle accident for each flight. While Feynman praised the software process for the shuttle, the software group still had to come up with their number. So the instructor said they took all their statistics from the (individual, unique) software loads for each historical flight, and included the failures from their loss of vehicle accidents.
"Say what? The shuttle software hasn't had any loss of vehicle accidents." Well, turns out it had. Each unique software load for each mission is tested and trained against for many months before it flies. Sometimes they fail, just not yet in actual flight. For example, apparently one time the shuttle crew was practicing launch aborts, where the launch is aborted just after clearing the pad and the orbiter lands like a glider. About the only crew member involved in that is the pilot. Everyone else is just strapped in being bored, and after a few hours of sitting still, the co-pilot got "frisky." During the launch phase, he randomly tapped some keys on his keypad and ... BOOM! Loss of vehicle accident.
Monkeys at work! I suppose it could be argued that with all the bumping around during the launch phase, a stray hand could accidentally "fuzz" that keypad.
when i was doing Palm OS programming back in the stone age, i remember the simulator had a "monkey testing mode", where it would just generate GUI events randomly. it was actually quite useful for uncovering sporadic crashes.
EDIT: ha, i found a reference, they're called Gremlins
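For anyone who hasn't seen one, here's a rough sketch of the idea in Python. This is not how Gremlins was implemented; `ToyApp`, its event names, and `run_monkey` are all made up for illustration. The one detail worth copying is the seeded random generator, so a crash can be replayed with the same event sequence.

```python
import random


class ToyApp:
    """Hypothetical app with a planted bug: 'delete' on an empty list crashes."""

    def __init__(self):
        self.items = []

    def handle(self, event):
        if event == "add":
            self.items.append(len(self.items))
        elif event == "delete":
            self.items.pop()  # planted bug: IndexError when the list is empty
        elif event == "clear":
            self.items.clear()


def run_monkey(seed, steps=1000):
    """Feed random events to a fresh ToyApp.

    Returns (step, event, exception) for the first crash, or None if the
    app survived all steps. The seed makes the event sequence repeatable.
    """
    rng = random.Random(seed)
    app = ToyApp()
    for step in range(steps):
        event = rng.choice(["add", "delete", "clear"])
        try:
            app.handle(event)
        except Exception as exc:
            return step, event, exc
    return None


if __name__ == "__main__":
    for seed in range(3):
        print(seed, run_monkey(seed))
```

Rerunning with the same seed reproduces the same crash, which is most of what makes random testing debuggable.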
Calling them Gremlins is actually a reference to the British Royal Air Force in WWII. The airmen would blame 'gremlins' for mechanical issues or even problems during flight.
Famous children's author Roald Dahl popularized them after he left the RAF to become a writer. If you get a chance, read a biography of Roald Dahl. He truly was the most interesting man in the world. (WWII ace fighter pilot, British secret agent, co-inventor of a cerebral shunt valve for treating hydrocephalus, married an Academy Award-winning actress, pioneered a stroke recovery program, and sold over 100 million books.)
The intended use for the Mac equivalent (described in the folklore.org story linked from another comment here) was to disable the quit command, so the test could be run for longer than it took the "monkey" to quit the application.
You might avoid doing certain write-esque operations if you know you are in test mode. For example, you wouldn't want to charge a credit card or update someone's Facebook status.
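A minimal sketch of that kind of guard, in Python. Everything here is an assumption for illustration: the `MONKEY_TEST` environment variable, the `charge_card` function, and the `gateway` callable are hypothetical names, not part of any real framework's API.

```python
import os


def in_test_mode():
    # Assumption: the test harness sets MONKEY_TEST=1 in the environment.
    return os.environ.get("MONKEY_TEST") == "1"


def charge_card(amount_cents, gateway, test_mode=None):
    """Call gateway(amount_cents) unless we are in test mode.

    gateway is any callable that performs the real side effect.
    test_mode can be passed explicitly; otherwise the environment decides.
    """
    if test_mode is None:
        test_mode = in_test_mode()
    if test_mode:
        # Skip the irreversible operation, but say so instead of failing silently.
        return f"skipped charge of {amount_cents} cents"
    gateway(amount_cents)
    return "charged"
```

The trade-off is that the guarded path is exactly the code the monkey never exercises, so it's worth keeping the skipped part as small as possible.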
I'd imagine there are occasional uses for it. You could use it to start the app with a particular state you wish to test with Monkey, while leaving it usable for human users.
It's meant to test some of the more extreme bounds of the system. Filling the event queue faster than the code can handle it, for example. Clicking in odd places at strange times and seeing unexpected behavior.
Monkey testing is going to expose things that you never anticipated in your test spec. It's a stress test.
Maybe. Sometimes, no matter how hard you try, test and production are significantly different. I wind up in a number of situations where any meaningful testing has to be done in production.
I'll assume that it's best to put in the check after you've found a bug. Perhaps something that the monkey triggers that isn't very high on your human-customer buglist.
Skype must have a similarly named function in it. In a Skype chat window, when you mash random groups of keys on the keyboard, the other side sees a cat icon in the chat. Kind of neat of them to think of that.