Apache OpenNLP is a machine-learning-based toolkit for processing natural language text. It supports the common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution, which are usually required to build more advanced text processing services. It also includes maximum entropy and perceptron-based machine learning.
OpenNLP has a nice, easy-to-understand API. The library is usable as a standalone command-line tool but can also be embedded in products, which makes it easy to test new ideas on the CLI and quick to move a prototype into production.
What do you dislike?
The project has been abandoned and is no longer up to date. Nothing in the API indicates which functions and classes are thread safe and which are not. I wish there were more example code in the documentation, and the pre-trained models available for download are very old.
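When it's unclear whether a class is thread safe, one defensive workaround is to give every thread its own instance. The sketch below shows that pattern with a plain ThreadLocal. It does not use OpenNLP itself: StatefulTokenizer is a hypothetical stand-in for any tokenizer that keeps mutable state between calls, and the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PerThreadTokenizerSketch {

    // Hypothetical stand-in for a tokenizer whose thread safety is unknown:
    // it reuses an internal buffer, so sharing one instance across threads is unsafe.
    static final class StatefulTokenizer {
        private final List<String> buffer = new ArrayList<>();

        String[] tokenize(String text) {
            buffer.clear();
            for (String token : text.trim().split("\\s+")) {
                if (!token.isEmpty()) {
                    buffer.add(token);
                }
            }
            return buffer.toArray(new String[0]);
        }
    }

    // Each thread lazily creates and then reuses its own tokenizer instance.
    private static final ThreadLocal<StatefulTokenizer> TOKENIZER =
            ThreadLocal.withInitial(StatefulTokenizer::new);

    static String[] tokenize(String text) {
        return TOKENIZER.get().tokenize(text);
    }

    // Tokenizes the sentences on a fixed-size pool; results keep the input order.
    static List<String[]> tokenizeAll(List<String> sentences, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String[]>> futures = new ArrayList<>();
            for (String sentence : sentences) {
                Callable<String[]> task = () -> tokenize(sentence);
                futures.add(pool.submit(task));
            }
            List<String[]> results = new ArrayList<>();
            for (Future<String[]> future : futures) {
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> sentences = Arrays.asList(
                "OpenNLP tokenizes text",
                "Each thread has its own tokenizer");
        for (String[] tokens : tokenizeAll(sentences, 2)) {
            System.out.println(String.join(" | ", tokens));
        }
    }
}
```

The cost is one instance per worker thread, which is usually acceptable when the expensive part (a loaded model, for example) can be shared and only the lightweight wrapper is duplicated. Whether that split applies to a given class is something I had to check in the Javadoc or source myself.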
Recommendations to others considering the product
The documentation is lackluster. You will be doing a lot of your own digging.
What business problems are you solving with the product? What benefits have you realized?
Consistent tokenization and named entity recognition have been useful.