First, let's examine what we know. Farther objects will exhibit less relative movement than closer objects when viewed through a slowly moving video camera (parallax motion). We can reconstruct 3D scenes using more than one image of the same scene taken from slightly different angles.
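To make the multi-image case concrete, here is a minimal sketch of the standard rectified-stereo relation, where depth is focal length times baseline divided by disparity. It assumes two identical, horizontally offset pinhole cameras; the function name and the example numbers are made up for illustration.

```python
def depth_from_disparity(focal_length_px: float, baseline: float, disparity_px: float) -> float:
    """Depth of a point seen by two rectified, horizontally offset cameras.

    focal_length_px: focal length in pixels (assumed equal for both cameras)
    baseline:        distance between the two camera centers (any length unit)
    disparity_px:    horizontal pixel shift of the point between the two images
    Returns depth in the same unit as `baseline`.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive (zero means the point is at infinity)")
    return focal_length_px * baseline / disparity_px


# Hypothetical numbers: a larger shift between the images means a closer object.
near = depth_from_disparity(focal_length_px=700.0, baseline=0.1, disparity_px=35.0)
far = depth_from_disparity(focal_length_px=700.0, baseline=0.1, disparity_px=7.0)
print(near, far)
```

The point that moves less between the two views comes out farther away, which is the parallax observation above in formula form.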
I imagine your question is about single images rather than multiple images of the same scene, so parallax motion will not help us. However, unless the image is taken using a tripod, there will be some micro-blur introduced by pressing the "take picture" button on your camera. This blur can be analyzed. Using the same principle as above, farther objects will be less blurred by the motion. This can be used to reconstruct a 3D scene from a single image.
Next, we will likely know the characteristics of our lens and the focus/zoom settings that were used to take the image. Using this, plus the 3D scene from above, we should be able to calculate the distance to any object (recognized or not).
(disclosure: I am not an expert in this and do not know if this would work)
Thanks! Using the accidental motion of the photographer is extremely cool. However, for my purposes I have video that is taken from a camera on a tripod, so I don't think that method will work.
I don't mean to be a downer, but this isn't really novel. If you know the size of the object you are looking for, you know you can reliably find it in an image, and you understand your imaging device, then determining its distance from the sensor is extremely straightforward.
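For anyone who hasn't seen it, the known-size method usually comes down to similar triangles under a pinhole camera model. Below is a minimal sketch, assuming the object faces the camera squarely and you have already measured its width in pixels; the function names and the numbers are hypothetical, not taken from the post.

```python
def calibrate_focal_length(known_distance: float, known_width: float, perceived_width_px: float) -> float:
    """Estimate focal length in pixels from one reference photo.

    Take a photo of the object at a measured distance, measure its width
    in pixels, and solve the similar-triangles relation for focal length.
    """
    if known_width <= 0:
        raise ValueError("known_width must be positive")
    return perceived_width_px * known_distance / known_width


def distance_to_object(known_width: float, focal_length_px: float, perceived_width_px: float) -> float:
    """Distance to an object of known real-world width (same unit as known_width)."""
    if perceived_width_px <= 0:
        raise ValueError("perceived_width_px must be positive")
    return known_width * focal_length_px / perceived_width_px


# Hypothetical calibration: an 11-inch-wide sheet appears 248 px wide at 24 inches.
focal = calibrate_focal_length(known_distance=24.0, known_width=11.0, perceived_width_px=248.0)

# Later the same sheet appears 124 px wide, i.e. half the size, so twice as far.
print(distance_to_object(known_width=11.0, focal_length_px=focal, perceived_width_px=124.0))
```

The hard part in practice is the "reliably find it in an image" step, not this arithmetic.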
I suppose my point was that there is no shortage of information on the method either. For example, searching for "how to measure distance from camera" yields many results detailing the same method. Anyway, it doesn't really matter. It's a well-written post, just nothing terribly interesting or new.
> in general, determining the distance from a camera to a marker is actually a very well studied problem in the computer vision/image processing space.
Agreed, it's certainly not novel, although I personally find it interesting to see the source code behind these solutions and how the pieces are glued together.