
I do understand your point. There are definitely tons of hard data science problems that are simply not suited to the predictive-query kind of approach.

At the same time, there are tons of ML problems, e.g. in process automation or user interaction, that have extremely strong patterns and are easy to handle with a sufficiently sophisticated ML model.

Regarding your list of items: feature engineering is largely handled by the user selecting relevant facts in the query, by analyzers, by MDL-based feature learning, and by information-theory-based feature selection. I feel this approach is pretty robust for many problems, although not complete. There are also special queries like $on for forming conditional variables of the form A|B, and $numeric for dealing with numeric data, which can be used manually.
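
To make the "information-theory-based feature selection" part a bit more concrete, here is a minimal illustrative sketch in Python (not our actual implementation, just the basic idea): rank candidate features by their mutual information with the prediction target and keep the highest-scoring ones.

    # Sketch only: rank categorical features by mutual information with the target.
    from collections import Counter
    from math import log2

    def mutual_information(xs, ys):
        n = len(xs)
        px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
        return sum(
            (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
            for (x, y), c in pxy.items()
        )

    def rank_features(rows, target):
        # rows: list of dicts of feature values; target: list of labels
        scores = {
            f: mutual_information([r[f] for r in rows], target)
            for f in rows[0]
        }
        return sorted(scores.items(), key=lambda kv: -kv[1])

    rows = [
        {"channel": "web", "country": "FI"},
        {"channel": "web", "country": "SE"},
        {"channel": "app", "country": "FI"},
        {"channel": "app", "country": "SE"},
    ]
    target = ["buy", "buy", "skip", "skip"]
    print(rank_features(rows, target))  # "channel" scores highest, "country" scores 0
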

Model debugging can be partly done with $why explanations, which are easy to produce with the Bayesian approach. So far, model debugging has been good enough.
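
To give an idea of why such explanations fall out naturally from a Bayesian model (this is a generic naive-Bayes-style sketch, not our exact $why format): the log-odds of a prediction decompose into one additive term per observed feature, and simply listing those terms per prediction is already a useful explanation.

    # Sketch only: a "$why"-style explanation from a naive-Bayes-style model,
    # where each observed feature contributes an additive log-odds term.
    from collections import Counter, defaultdict
    from math import log

    class ExplainableNB:
        def fit(self, rows, labels):
            self.labels = Counter(labels)
            # counts[label][feature][value]
            self.counts = defaultdict(lambda: defaultdict(Counter))
            for row, y in zip(rows, labels):
                for f, v in row.items():
                    self.counts[y][f][v] += 1
            return self

        def _loglik(self, y, f, v):
            # add-one smoothed log P(feature=value | label)
            return log((self.counts[y][f][v] + 1) / (self.labels[y] + 2))

        def explain(self, row, pos, neg):
            # prior term plus one log-likelihood-ratio term per observed feature
            terms = {"$prior": log(self.labels[pos] / self.labels[neg])}
            for f, v in row.items():
                terms[f"{f}={v}"] = self._loglik(pos, f, v) - self._loglik(neg, f, v)
            return sorted(terms.items(), key=lambda kv: -abs(kv[1]))

    rows = [{"channel": "web"}, {"channel": "web"}, {"channel": "app"}]
    labels = ["buy", "buy", "skip"]
    model = ExplainableNB().fit(rows, labels)
    print(model.explain({"channel": "web"}, pos="buy", neg="skip"))
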

Latency requirements and constant updates are more a matter of software/database engineering and are solvable, but right now we recommend batch updates and applications that can tolerate occasional multi-second latency. And of course, if you have limited data sets (fewer than 100k rows), there shouldn't be such problems.

I feel that all the problems you listed are solvable, but they are of course hard problems, and fully solving them for a larger set of applications is still on our roadmap. For many applications (like RPA, internal tools, analytics) these are not real issues, while the benefits (ease of use, speed) are extremely concrete and relevant.



