
We've been debating this a lot where I work. I work with log data from CDNs, so user IP addresses get ingested. We use that information and correlate it with geoip services to determine stuff like the ISP being used.

This is so we can evaluate CDN performance and also see how well ISPs are doing at serving content to users. So we're essentially asking questions about network performance at a macro level rather than about individual users.

As far as IPs are concerned we don't care much after that, other than maybe the odd "how many unique IP addresses were served today" type queries.

We've talked about hashing IPs with a secret salt that is rotated periodically, but to be safe you would need to ensure previous salts are destroyed, and that nobody can view or access a salt even while it is live.
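To make the rotating-salt idea concrete, here is a minimal Python sketch. It assumes a keyed hash (HMAC-SHA256) with the secret held only in process memory; the class and method names are made up for illustration and are not what any particular company runs.

```python
import hashlib
import hmac
import secrets


class RotatingIpHasher:
    """Pseudonymizes IPs with a secret key that lives only in memory (hypothetical helper)."""

    def __init__(self) -> None:
        # The key is generated in-process and never written to disk or logged.
        self._key = secrets.token_bytes(32)

    def pseudonymize(self, ip: str) -> str:
        # Keyed hash: the same IP maps to the same token only while the key is unchanged.
        return hmac.new(self._key, ip.encode("ascii"), hashlib.sha256).hexdigest()

    def rotate(self) -> None:
        # Overwriting the reference drops the old key; tokens from earlier
        # periods can no longer be linked to tokens from the new period.
        self._key = secrets.token_bytes(32)
```

Within one rotation period, counting distinct tokens still answers "how many unique IP addresses were served today". Note that Python cannot guarantee the old key bytes are wiped from memory, so this only illustrates the data flow, not secure key destruction.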




Wouldn't storing the first three octets of an IP address be enough for this kind of analysis? Or use the whois database and reduce the data to the first IP address of the network?
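For what it's worth, keeping only the first three octets can be sketched in a few lines of Python with the standard `ipaddress` module. The function name and the /48 cutoff for IPv6 are my own assumptions; the comment above only talks about IPv4.

```python
import ipaddress


def truncate_ip(ip: str, v4_prefix: int = 24, v6_prefix: int = 48) -> str:
    """Return the network address of the block containing ip (hypothetical helper)."""
    addr = ipaddress.ip_address(ip)
    # /24 keeps the first three octets of an IPv4 address; /48 for IPv6 is an assumed choice.
    prefix = v4_prefix if addr.version == 4 else v6_prefix
    # strict=False lets us pass a host address and have the host bits zeroed.
    network = ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
    return str(network.network_address)
```

With this, `truncate_ip("203.0.113.77")` gives `"203.0.113.0"`, so every address in the same /24 collapses to one value before anything is stored.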


I personally think just storing the autonomous system the IP originates from and never writing the IPs to disk at all would be advisable if the goal is purely which ISPs are delivering how many bytes to end users. Another benefit is the AS to IP mapping database is small enough to fit in memory without issue.
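A rough sketch of that approach in Python, assuming the prefix-to-AS table is already loaded into memory. The table below is invented for illustration (documentation address ranges and documentation AS numbers); a real one would come from a routing data export, and a real lookup would use a trie rather than a linear scan.

```python
import ipaddress
from collections import Counter

# Hypothetical in-memory prefix table: (network, AS number).
PREFIX_TO_ASN = [
    (ipaddress.ip_network("198.51.100.0/24"), 64496),
    (ipaddress.ip_network("203.0.113.0/24"), 64497),
    (ipaddress.ip_network("203.0.113.128/25"), 64498),
]


def lookup_asn(ip: str):
    """Longest-prefix match of ip against the table; None if nothing matches."""
    addr = ipaddress.ip_address(ip)
    best = None
    for net, asn in PREFIX_TO_ASN:
        if addr.version == net.version and addr in net:
            if best is None or net.prefixlen > best[0].prefixlen:
                best = (net, asn)
    return best[1] if best else None


def bytes_by_asn(records):
    """Sum bytes served per AS; the IP is used for the lookup and then discarded."""
    totals = Counter()
    for ip, nbytes in records:
        totals[lookup_asn(ip)] += nbytes
    return totals
```

Only the per-AS totals ever need to be written out, which is the point of the suggestion: the raw IP never reaches disk.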


That's probably insufficient for the use case. A single AS can advertise many different routes for different IP blocks that have dramatic geographic differences.


How are you going to ask users for consent to process their IPs this way?


Consent is not the only basis for legally processing data. There is not enough information in the above comment to determine which basis this company has determined their processing falls under.



