
Information Extraction

posted Oct 9, 2013, 6:44 PM by James Kraemer   [ updated Oct 9, 2013, 6:44 PM ]
Data Intelligence has added new Information Extraction features to the Entity Analytical Platform. The features are built by integrating established open-source libraries, including the Apache Tika toolkit, Apache OpenNLP, and the Apache UIMA project.

Analysts can run text and metadata extraction from source documents and produce lists of recommended entities. The information extracted from source documents is used to nominate new entities and perform entity analytics over the repository.
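
As a rough illustration of the nomination step, the sketch below takes mentions that an extractor has already produced and proposes those not yet present in the repository. The function name `nominate_entities`, the `(name, type)` tuple shape, and the `min_count` threshold are all assumptions made for this example; they are not part of the Entity Analytical Platform's actual API.

```python
from collections import Counter


def nominate_entities(mentions, known_entities, min_count=1):
    """Return (name, type, count) for extracted mentions not already known.

    mentions:        iterable of (name, entity_type) pairs from an extractor
    known_entities:  names already in the repository (hypothetical input)
    min_count:       minimum number of occurrences required to nominate
    """
    # Compare names case-insensitively so "Acme Corp" and "acme corp" merge.
    known = {name.casefold() for name in known_entities}
    counts = Counter()
    display = {}
    for name, entity_type in mentions:
        key = (name.casefold(), entity_type)
        counts[key] += 1
        # Keep the first spelling seen as the display form.
        display.setdefault(key, name)

    nominated = [
        (display[key], key[1], n)
        for key, n in counts.items()
        if key[0] not in known and n >= min_count
    ]
    # Most frequently mentioned candidates first, then alphabetical.
    nominated.sort(key=lambda item: (-item[2], item[0]))
    return nominated
```

An analyst-facing list could then be built by calling `nominate_entities(extracted_mentions, repository_names)` and reviewing the candidates in order.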

To produce these recommendations, Data Intelligence has released the new Data Intelligence Entity Recommendation Server. The Recommendation Server runs as a separate application that receives REST requests and returns recommendation results for use by the Entity Analytical Platform.
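
To make the request and response flow concrete, here is a minimal client-side sketch using only the Python standard library. The `/recommendations` path, the JSON field names (`text`, `engine`, `entities`, `name`, `type`), and the base URL are hypothetical; the post does not document the server's actual REST interface, so treat this as a shape to adapt rather than a working client.

```python
import json
from urllib import request


def build_recommendation_request(base_url, document_text, engine="default"):
    """Build (but do not send) a POST request carrying the document text.

    The endpoint path and payload fields are assumptions for illustration.
    """
    payload = json.dumps({"text": document_text, "engine": engine}).encode("utf-8")
    return request.Request(
        url=base_url.rstrip("/") + "/recommendations",
        data=payload,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def parse_recommendations(response_body):
    """Extract (name, type) pairs from an assumed JSON response body."""
    data = json.loads(response_body)
    return [(item["name"], item["type"]) for item in data.get("entities", [])]
```

Against a running server, the request would be sent with `request.urlopen(req)` and the response body passed to `parse_recommendations`. Keeping the request builder separate from the network call makes the client easy to test without a server.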

Different recommendation engines can be plugged into the Recommendation Server using UIMA PEAR files. The OpenNLP Named Entity Annotator is loaded into the Recommendation Server by default and gives good results out of the box. Other engines can be trained for specific intelligence missions such as finance, military, or cyber. Data Intelligence also provides the same hooks for integrating with separate UIMA servers, so there is no vendor lock-in for entity extraction.
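
The plug-in idea can be sketched in a few lines: a registry holds a default engine and lets mission-specific engines be added by name. This is a conceptual illustration only. It does not load UIMA PEAR files or call OpenNLP; `EngineRegistry` and both toy engines are hypothetical stand-ins that use regular expressions in place of trained models.

```python
import re
from typing import Callable, Dict, List, Optional, Tuple

# An engine maps raw text to a list of (mention, entity_type) pairs.
Engine = Callable[[str], List[Tuple[str, str]]]


class EngineRegistry:
    """Holds named engines, with one loaded as the default."""

    def __init__(self, default_name: str, default_engine: Engine) -> None:
        self._engines: Dict[str, Engine] = {default_name: default_engine}
        self._default = default_name

    def register(self, name: str, engine: Engine) -> None:
        self._engines[name] = engine

    def recommend(self, text: str, engine_name: Optional[str] = None):
        # Fall back to the default engine when none is requested.
        engine = self._engines[engine_name or self._default]
        return engine(text)


def capitalized_phrase_engine(text: str) -> List[Tuple[str, str]]:
    """Toy default engine: runs of two or more capitalized words."""
    pattern = r"[A-Z][a-z]+(?: [A-Z][a-z]+)+"
    return [(m, "unknown") for m in re.findall(pattern, text)]


def ticker_engine(text: str) -> List[Tuple[str, str]]:
    """Toy finance engine: standalone tokens of 2 to 5 capital letters."""
    return [(m, "ticker") for m in re.findall(r"\b[A-Z]{2,5}\b", text)]
```

A caller would create `EngineRegistry("default", capitalized_phrase_engine)`, register `ticker_engine` under a name such as `"finance"`, and select an engine per request. Swapping engines then requires no change to the calling code, which is the property the PEAR-based design is described as providing.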