
gensim is one of the best libraries for word vectors and summarization. For parsing and NER, Stanford CoreNLP works best in my experience.



Well, a model you fine-tune to your specific corpus/domain works even better (in fact, much better)... And gensim gives you the tools to build the best possible embeddings for that.

But you do need a use case and an economic reward that justify the substantial increase in cost over what a pre-trained, vanilla, off-the-shelf parser (model) can give you. Yet if your domain is technical enough (pharma, finance, law, ...; essentially everything except parsing news, blogs, and tweets), it might be the only way to get an NLP system that really works.



