
It's not anti-scientific at all. Looking for correlations in data to uncover causation is important, and it can certainly be the beginning of a scientific effort to build a model of the world (especially, I would argue, though I guess this gets rather philosophical, when it's motivated by a belief in an underlying mechanism that drives the correlated observations, perhaps developed so fully as to be considered a 'hypothesis'). Finding natural patterns of associations in data is a totally reasonable starting point for finding a causation _when a causation exists_.

Feeding arbitrary sequences of samples into DBSCAN and deciding that, because DBSCAN produces output, there is causation (or that there is an underlying phenomenon that can be captured in some type of model) is ridiculous. And there are tons and tons and tons of natural phenomena, noise included, that will happily produce clumpable inputs all the time with no underlying behavior.
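
To make that concrete, here is a rough sketch (not anyone's production code) of a textbook DBSCAN in plain Python, run on points drawn uniformly at random from the unit square. The sample size, `eps`, `min_pts`, and the seed are arbitrary values I picked for illustration; the point is only that the algorithm will typically still hand back "clusters" even though the input has no structure by construction.

```python
import math
import random


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    xi, yi = points[i]
    return [j for j, (xj, yj) in enumerate(points)
            if math.hypot(xi - xj, yi - yj) <= eps]


def dbscan(points, eps, min_pts):
    """Minimal textbook DBSCAN. Returns one label per point; -1 means noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may be relabeled later as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not core
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point, keep expanding
    return labels


# Pure noise: uniform random points, no generating mechanism to discover.
rng = random.Random(0)  # arbitrary seed, for repeatability only
noise = [(rng.random(), rng.random()) for _ in range(200)]
labels = dbscan(noise, eps=0.08, min_pts=4)  # illustrative parameters
n_clusters = len({label for label in labels if label != -1})
print(f"clusters found in pure noise: {n_clusters}")
```

Whatever number that prints, it says something about `eps` and `min_pts` relative to the point density, not about any mechanism in the data.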

I'm sure there are also tons of interesting models of human behavior that you can make out of some set of observations of human behavior via clustering. But just because you feed some data to an unsupervised learning algorithm and it discovers features doesn't imply that those features have any useful descriptive power to help us make sense of the natural world. THAT's anti-scientific thinking.




> Finding natural patterns of associations in data is a totally reasonable starting point for finding a causation _when a causation exists_.

So if a causation turns out not to exist then finding natural patterns was an unreasonable starting point? What would have been a reasonable starting point in that circumstance?



