ImageNet is a manually annotated database for training models. Google also had its Image Search corpus internally.
The recent advances in image and speech recognition are due to innovations in deep learning by Google, Microsoft, NVIDIA, Stanford, U. Toronto, CMU, and many others. I didn't mean to imply it was all Google's doing. My point was rather that Apple wasn't among them, which is why "just waiting for it to become feasible" sounds like apologetics.
Then there's also innovation in making a product out of the new capabilities, or integrating them into an existing product. I think dismissing that as "everybody was just waiting for the technology" overlooks that, in hindsight, all products look obvious.