I would add that your chances of having a proxy node increase by 1% with each free app you install these days. We catch them easily at visitorquery.com, but the residential proxy business is rampant, and probably half of them are infected devices: Android TVs, routers and, ofc, mobile apps.
Because it's the single most falsifiable piece of information you would find on ANY "how to scrape for dummies" article out there. They all start with changing your UA.
Sure, but the article is about a bot that expressly identifies itself in the user agent and its user agent name contains a sentence suggesting you block its ip if you don’t like it. Since it uses at least 74 ips, blocking its user agent seems like a fine idea.
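For a bot that names itself in the UA, the block can be as small as a substring match. A minimal sketch, assuming a made-up bot name ("ExampleBot") and a hypothetical helper `is_blocked`; the real token would be whatever the bot puts in its user agent:

```python
from typing import Iterable


def is_blocked(user_agent: str, blocked_tokens: Iterable[str]) -> bool:
    """Return True when the User-Agent contains any blocked token (case-insensitive)."""
    ua = (user_agent or "").lower()
    return any(token.lower() in ua for token in blocked_tokens)


# "ExampleBot" is a placeholder, not the bot discussed in the article.
BLOCKED_TOKENS = ("ExampleBot",)
```

This only works against bots that are honest about who they are, which is exactly the case here. It does nothing against a scraper that swaps its UA.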
What exactly do you want to train on a falsifiable piece of info? We do something like this at https://visitorquery.com in order to detect HTTP proxies and VPNs, but the UA is very unreliable. I guess you could detect based on multiple signals, with the UA being one of them: rules where a given UA must come with x, y, z, or where x can never appear alongside a given UA. Most of the info is generated tho.
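To make the "UA as one signal among several" idea concrete, here is a rough sketch. The rules, weights, and names (`Signal`, `ua_consistency_signals`, `suspicion_score`) are all invented for illustration and are not how visitorquery.com works:

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Signal:
    name: str
    weight: float  # illustrative weight, not a tuned value
    fired: bool


def ua_consistency_signals(headers: Dict[str, str]) -> List[Signal]:
    """Check whether the claimed UA is consistent with the other request headers."""
    h = {k.lower(): v for k, v in headers.items()}
    ua = h.get("user-agent", "")
    return [
        # No UA at all.
        Signal("empty_ua", 1.0, ua == ""),
        # Claims to be a browser but sends no Accept-Language header.
        Signal("browser_ua_without_accept_language", 0.5,
               "Mozilla/" in ua and "accept-language" not in h),
        # Default UA of a common HTTP library or CLI tool.
        Signal("script_library_ua", 0.8,
               any(t in ua.lower() for t in ("python-requests", "curl/"))),
    ]


def suspicion_score(headers: Dict[str, str]) -> float:
    """Sum the weights of all signals that fired."""
    return sum(s.weight for s in ua_consistency_signals(headers) if s.fired)
```

Every one of these headers can be forged too, so a score like this only raises the cost for lazy scrapers. It is not proof of anything on its own.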
I think I know what you're talking about because I ran into this too. In defense of Supabase, you can still use transactions in other ways. Transactions through the client are messy and not easily supported by PostgREST.
The GitHub issue here sums up the conversation about this:
Regardless of Hacker News's thoughts on MCP servers, there is a cohort of users that are finding them to be immensely useful. Myself included. It doesn't excuse the thought processes around security; I'm just saying that LLMs are here and this is not going away.
Why do you need eBPF for it? Why is IP filtering and header/cookie analysis not enough? What is shopping cart fraud? What is your false positive and false negative rate?
Shameless plug, but I have a client that got rid of almost 90% by blocking residential proxies, or HTTP proxies in general, using our service [1]. I tend to think people try measures that are very hard to maintain, going for behavioural data and other indicators, when all the fraud sits on this first layer (L1): being behind a proxy or a VPN.
Agreed - I'm pretty skeptical of invasive behavioral data like mouse movements. It feels like a popular meme from an earlier time, jiggle your mouse more before clicking the CAPTCHA checkbox, but in practice it's not a very high-value signal anymore (especially with the rise of mobile). TLS fingerprinting is a significantly more useful signal for us at Stytch.
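For readers who haven't seen TLS fingerprinting, JA3 is one publicly documented scheme: it joins five fields from the TLS ClientHello into a string and takes the MD5 of it. The sketch below assumes the ClientHello has already been parsed into lists of integers by something else. The comment above doesn't say which scheme Stytch uses, so treat JA3 here purely as an example.

```python
import hashlib
from typing import Iterable

# GREASE placeholder values (0x0A0A, 0x1A1A, ..., 0xFAFA) are dropped before hashing.
GREASE = {(i << 12) | (i << 4) | 0x0A0A for i in range(16)}


def ja3_string(version: int, ciphers: Iterable[int], extensions: Iterable[int],
               curves: Iterable[int], point_formats: Iterable[int]) -> str:
    """Build the JA3 string: five comma-separated fields, values joined by '-'."""
    def join(values: Iterable[int]) -> str:
        return "-".join(str(v) for v in values if v not in GREASE)

    return ",".join([str(version), join(ciphers), join(extensions),
                     join(curves), join(point_formats)])


def ja3_hash(version: int, ciphers: Iterable[int], extensions: Iterable[int],
             curves: Iterable[int], point_formats: Iterable[int]) -> str:
    """MD5 hex digest of the JA3 string."""
    s = ja3_string(version, ciphers, extensions, curves, point_formats)
    return hashlib.md5(s.encode("ascii")).hexdigest()
```

The useful part is comparing the fingerprint against the UA: a request claiming to be a desktop browser whose fingerprint matches a scripting library's TLS stack is a mismatch worth looking at.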
How did they measure the number of legitimate customers that they lost because they blocked those IPs? It's not enough to estimate the change in unwanted traffic, you also need to estimate the change in desirable traffic.
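The two sides of that question can be put into numbers if you have some labelled traffic, for example from a holdout period where the rule only logged and didn't block. A sketch with a hypothetical helper `block_rule_report`; the four counts are inputs you would have to measure yourself:

```python
from typing import Dict


def _ratio(numerator: int, denominator: int) -> float:
    """Safe division: returns 0.0 when there is nothing to divide by."""
    return numerator / denominator if denominator else 0.0


def block_rule_report(blocked_bad: int, blocked_good: int,
                      allowed_bad: int, allowed_good: int) -> Dict[str, float]:
    """Summarise a block rule from labelled request counts."""
    return {
        # Share of legitimate visitors the rule turned away.
        "false_positive_rate": _ratio(blocked_good, blocked_good + allowed_good),
        # Share of unwanted traffic the rule let through.
        "false_negative_rate": _ratio(allowed_bad, allowed_bad + blocked_bad),
        # Share of blocked requests that really were unwanted.
        "precision": _ratio(blocked_bad, blocked_bad + blocked_good),
    }
```

A rule can cut unwanted traffic by 90% and still be a bad trade if the false positive rate translates into more lost revenue than the fraud it stopped.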
https://visitorquery.com - my startup. I'm curious if they use proxies or not. Datacenter or residential, my service can detect them. There's a free plan, which should give you a better understanding of your traffic, at least from this perspective. Shenanigans with payment gateways usually involve proxies, so I'm almost certain you can use it to detect and block the abusers before they reach the checkout page.
Maybe that's why people don't bother targeting Linux. Look at your comment. Why are people so easily offended these days by things that are out here, doing no harm to anyone? Like, you could just hit the back button and go on with your day. Why do you need to be so offended? If this is the sort of stuff that triggers you, I can't imagine what your day looks like.