
It shouldn't take an intern too long to collect a representative set of Congress people and other high officials for training. Maintaining it would not be an undue burden. That would eliminate the false positive matches for all the unwanted celebs. Clearly Amazon's models aren't that great to begin with so there's little reason to stick with them.

Wrap it up into a simple native app and you can bypass the MMS BS. Even better, a sufficiently capable dev could integrate an open-source recognition library [1] to have it entirely implemented on the device.

[1] https://github.com/rudybrian/tuFace




Hi! I'm Jeremy, one of the developers.

We'll probably work on something like this for the next version. One reason it's harder than you think: We would have to buy / own rights to the photographs before we could use them to train -- most of those photos are owned by Getty or the AP. And our own photographs are perfectly lit and square, which made them awful for training face recognition.

The other hangup (which I didn't get to in the article) is having to add / remove people. New members are constantly being added and that's a maintenance burden for us. Amazon usually has the new member within a day or two. (Our team is very small and we have a lot of other responsibilities!)

But good points, definitely.


> We would have to buy / own rights to the photographs before we could use them to train -- most of those photos are owned by Getty or the AP.

I think your model would be covered by derivative art... unless you started selling the model itself.


"We would have to buy / own rights to the photographs before we could use them to train..."

Is this actually true?


In the USA I don't think it is, because the end use is transformative [1].

In the UK it would be tortious, because it relies on Fair Use to temporarily store the images in order to extract the facial structure data. Fair Dealing is really draconian in comparison.

[1] https://www.lib.umn.edu/copyright/fairuse


Really doesn’t matter - the legal team at the NYT thinks it might be, and lawyers exist to tell people “you’d better not”.


And it's our job, as people who know what a computer is, to move forward with common sense if they're overreaching, which is their job.

They have every incentive to be as conservative in their advice as possible, and no incentives to "allow" risks. Doesn't increase their compensation any.


But... doesn't Congress maintain some sort of API for available bio data, possibly including photos, to avoid the maintenance issue brought up in the piece? A quick Google search shows ProPublica has such an API, and it seems to have originally been developed by the NYT: https://projects.propublica.org/api-docs/congress-api/
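
For what it's worth, here is a rough sketch of what the add / remove step might look like if a roster feed like that were available. It only shows the diffing logic: the member IDs are made up, and how the current roster is actually fetched (and what fields it carries) is an assumption left out here.

```python
def roster_changes(trained_ids, current_ids):
    """Compare the IDs the model was trained on with the current roster.

    Returns (to_add, to_remove) as sorted lists of IDs.
    """
    trained = set(trained_ids)
    current = set(current_ids)
    to_add = sorted(current - trained)      # new members missing from the model
    to_remove = sorted(trained - current)   # departed members still in the model
    return to_add, to_remove


# Hypothetical IDs, for illustration only.
trained = ["A000001", "B000002", "C000003"]
current = ["B000002", "C000003", "D000004"]

to_add, to_remove = roster_changes(trained, current)
print("add:", to_add)
print("remove:", to_remove)
```

Run on a schedule, something like this would at least flag when the training set has drifted from the roster, though sourcing usable photos for the new members is still the hard part.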


> We would have to buy / own rights to the photographs before we could use them to train

Please do elaborate on who's enforcing this mindset on you/your team.



