
> I would presume a highly-skilled fraudster could just spin up a new VM, for instance, and evade detection that way.

From my experience building fraud detection systems at Eventbrite, most fraudsters are not that sophisticated. They usually go for the lowest-hanging fruit, so they look for systems that offer the highest payout for the lowest effort. Because there is always some uncertainty (getting detected, the credit card not working, etc.), fraudsters often favor techniques that let them try as many websites and cards as possible. This is especially true for Sift Science's customers, who tend to be small to mid-size companies; big companies for whom fraud detection is critical tend to have their own in-house solution.

In addition, this is usually only one signal. Ideally you want your algorithm to detect first-time fraudsters too, so the other signals should be able to stand on their own.

One caveat, though: the reason multiple accounts are a signal of fraud is that fraudsters tend to be repeat offenders, and will keep defrauding the same website if their previous attempts worked. But now that they're facing fraud detection algorithms that detect repeat offenders more easily, it's quite possible they will adapt their behavior.

This is a signal that will fade in strength over time. One of the dangers of pooling data from multiple websites, as in this blog post (though hopefully their algorithms account for it), is that the strength of the signal may be skewed by the proportion of new users of their platform, who will see a higher proportion of unsophisticated fraudsters because they had no fraud detection system previously.
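
To make the pooling concern concrete, here is a minimal sketch with made-up numbers. Everything in it is an assumption for illustration: the `Cohort` class, the counts, and the choice of "lift" (P(fraud | flagged) / P(fraud)) as the measure of signal strength. None of it comes from Sift Science's data or algorithms.

```python
from dataclasses import dataclass


@dataclass
class Cohort:
    """Counts for one group of websites. All numbers below are invented."""
    name: str
    flagged_fraud: int  # multi-account users later confirmed as fraud
    flagged_total: int  # all users flagged for having multiple accounts
    fraud: int          # all confirmed fraud users
    total: int          # all users


def lift(flagged_fraud, flagged_total, fraud, total):
    """Signal strength as P(fraud | flagged) / P(fraud)."""
    return (flagged_fraud / flagged_total) / (fraud / total)


def cohort_lift(c):
    """Lift of the multi-account signal within a single cohort."""
    return lift(c.flagged_fraud, c.flagged_total, c.fraud, c.total)


def pooled_lift(cohorts):
    """Lift computed after summing the raw counts across cohorts."""
    return lift(
        sum(c.flagged_fraud for c in cohorts),
        sum(c.flagged_total for c in cohorts),
        sum(c.fraud for c in cohorts),
        sum(c.total for c in cohorts),
    )


# Hypothetical cohorts: sites that just joined (no prior fraud detection)
# versus sites whose fraudsters have already had time to adapt.
new_sites = Cohort("new to the platform", 60, 100, 200, 10_000)
established = Cohort("established", 10, 100, 100, 10_000)

for c in (new_sites, established):
    print(f"{c.name}: lift = {cohort_lift(c):.1f}")
print(f"pooled: lift = {pooled_lift([new_sites, established]):.1f}")
```

With these invented counts the pooled lift lands between the two cohorts, well above what the established sites see on their own, so a model trained on the pooled data would overstate how much the signal is worth to them. Shift the mix toward new sites and the pooled number drifts further up.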

This is why, whenever you are building a fraud detection algorithm (or any consumer-facing machine learning algorithm), understanding the story behind the data is very important, rather than just looking at the numbers.



I can't thank you enough. This is incredibly valuable advice that you've given me and everyone else on HN.

I'm trying to log and look for varied signals, and have a few interesting ones that pick up the lazy and not-so-lazy fraudsters.

I'm going to be extra careful to ensure that we keep "understanding the story behind the data."

(that one has the added benefit of feeling obvious in hindsight, and so once again, incredibly valuable)

Thanks again!



