Humans do the same thing: they fail to work as intended some of the time. It's just that the failure modes for ML are different, and so we see them as ridiculous.
Well, if the different failure modes mean that a machine can't tell the difference between something that obviously resembles a turtle and a rifle, or between a cat and guacamole, then that's something anyone who watched the video is better at. You can call it a different failure mode, but these are mistakes no human would make unless seriously ill, and being able to classify simple objects is important to our day-to-day lives.
Imagine some sort of Robocop deciding to neutralize someone for holding a turtle toy.