Maybe we should use whitelisting for analytics. Instead of perpetually trying to identify bots, pick the 20% of your users you're pretty sure are actually human. Nielsen-style, if you will.
Maybe two cohorts. The people we're really sure about, and the ones we're pretty sure about. If they diverge, ask why.
In general it's quite easy to filter out bots and crawlers from your basic access logs, as most bots and crawlers will identify themselves as such.
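As a minimal sketch of that filtering, here's one way to drop self-identifying bots from a combined-format access log. The log line layout and the bot tokens are assumptions; real logs vary, and a real bot list would be longer.

```python
# Filter self-identifying bots/crawlers out of combined-format access logs.
# Bot token list is illustrative, not exhaustive.
BOT_TOKENS = ("bot", "crawler", "spider", "slurp")

def is_bot(user_agent: str) -> bool:
    ua = user_agent.lower()
    return any(token in ua for token in BOT_TOKENS)

def human_lines(lines):
    for line in lines:
        # Combined log format puts the user agent in the last quoted field.
        parts = line.rsplit('"', 2)
        if len(parts) == 3 and not is_bot(parts[1]):
            yield line
```

This catches only the well-behaved crawlers, of course, which is exactly the point: it gets the easy bulk out of the way cheaply.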
If you're running anything with an API, then unless something's horribly wrong it's even easier: look at the number of requests made to each endpoint and spot-check a few of the user identifiers (tokens, keys, whatever you're using) to see the variety of users.
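That spot check is a one-liner with a counter. A sketch, assuming each parsed request record carries a `user_id` field (the field name is mine, not from any particular framework):

```python
from collections import Counter

def requests_per_user(records):
    """Count API requests per user identifier and surface the heaviest users.

    `records` is an iterable of dicts with a 'user_id' key (illustrative name).
    """
    counts = Counter(r["user_id"] for r in records)
    # Distinct-user count plus the top callers is usually enough to tell
    # "broad human use" apart from "one script hammering an endpoint".
    return len(counts), counts.most_common(5)
```

If one token accounts for most of the traffic, that's your bot, whatever its user agent says.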
All of this assumes you're merely trying to gauge the volume of use of a feature, not trying to derive demographics. If you're trying to extract more fine-grained detail, I don't have as many answers; I hope others will chime in with constructive ways to get things like geographic demographics from server logs.