
I'm really happy to see more people moving into this space.

I've used a number of different systems (OpenCalais, AlchemyAPI, Zemanta...) in a variety of projects (sentiment analysis, document classification...). What I've found so far is that while each system works extremely well within a restricted class of applications, none comes close to being a general-purpose API for the myriad applications developers try to throw at it.

A couple of pain points I've encountered: needing a larger-than-expected corpus to get meaningful data, because the platform's analysis is scoped so broadly, and having no way to apply negative signals from external sources. As a result, a large amount of somewhat redundant logic tends to stay on the application end to post-filter what's generated.
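To make the post-filtering point concrete, here is a rough sketch in Python of the kind of logic that ends up on the application side. Everything in it is an assumption for illustration: the `Tag` shape, the `post_filter` helper, the score threshold, and the sample data are made up and do not correspond to the output of any of the APIs named above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tag:
    """One tag as a generic extraction API might return it (hypothetical shape)."""
    label: str
    score: float  # relevance or confidence in [0, 1]


def post_filter(tags, negative_terms, min_score=0.5):
    """Drop low-scoring tags and tags vetoed by an external negative list."""
    blocked = {term.casefold() for term in negative_terms}
    return [
        tag for tag in tags
        if tag.score >= min_score and tag.label.casefold() not in blocked
    ]


# Made-up API output for an article on a fruit-growing site.
raw_tags = [
    Tag("Apple Inc.", 0.91),  # confident, but wrong for this domain
    Tag("orchard", 0.74),
    Tag("pruning", 0.32),     # too weak to trust
]

# Negative signal from outside the API, e.g. a hand-kept domain blocklist.
kept = post_filter(raw_tags, negative_terms=["Apple Inc."], min_score=0.5)
print([tag.label for tag in kept])
```

With this sample data only "orchard" survives. The blocklist and the threshold are exactly the sort of domain knowledge I'd rather hand to the service than re-apply after every call.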

I don't pretend to understand the level of complexity involved or what's currently being worked on (not an NLP guy), but I do think there's a huge space for publicly available text-mining tools that can be applied more effectively to narrow domains.



