Unstructured Information Management applications are software systems that analyze large volumes of unstructured information in order to discover knowledge that is relevant to an end user. UIMA is a framework and SDK for developing such applications. An example UIM application might ingest plain text and identify entities, such as persons, places, organizations; or relations, such as works-for or located-at. UIMA enables such an application to be decomposed into components, for example language identification -> language specific segmentation -> sentence boundary detection -> entity detection (person/place names etc.). Each component must implement interfaces defined by the framework and must provide self-describing metadata via XML descriptor files. The framework manages these components and the data flow between them. UIMA additionally provides capabilities to wrap components as network services, and can scale to very large volumes by replicating processing pipelines over a cluster of networked nodes.
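
The decomposition into components described above can be illustrated with a small sketch. It is plain Python and does not use the UIMA API: the names Document, Annotation, run_pipeline and the four component classes are hypothetical, and the heuristics (a stop-word check for language identification, a tiny gazetteer for entity detection) are toy assumptions chosen only to show how components share one analysis structure and how data flows between them.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Annotation:
    type: str   # e.g. "Token", "Sentence", "Person"
    begin: int  # start offset in the document text
    end: int    # end offset (exclusive)


@dataclass
class Document:
    # Hypothetical stand-in for the shared structure all components read and write.
    text: str
    language: str = "unknown"
    annotations: list = field(default_factory=list)

    def select(self, type_name):
        return [a for a in self.annotations if a.type == type_name]

    def covered_text(self, annotation):
        return self.text[annotation.begin:annotation.end]


class LanguageIdentifier:
    def process(self, doc):
        # Toy heuristic (assumption): a few English function words mark English.
        words = set(doc.text.lower().split())
        doc.language = "en" if words & {"the", "for", "in"} else "unknown"


class Tokenizer:
    def process(self, doc):
        # Words and single punctuation marks become Token annotations.
        for m in re.finditer(r"\w+|[^\w\s]", doc.text):
            doc.annotations.append(Annotation("Token", m.start(), m.end()))


class SentenceSplitter:
    def process(self, doc):
        # A sentence runs from a non-space character to the next ., ! or ?
        for m in re.finditer(r"\S[^.!?]*[.!?]", doc.text):
            doc.annotations.append(Annotation("Sentence", m.start(), m.end()))


class EntityDetector:
    # Hypothetical gazetteer; a real detector would be far more elaborate.
    GAZETTEER = {"Alice": "Person", "Bob": "Person",
                 "Acme": "Organization", "Paris": "Place"}

    def process(self, doc):
        # Consumes the Token annotations produced earlier in the pipeline.
        for token in doc.select("Token"):
            entity_type = self.GAZETTEER.get(doc.covered_text(token))
            if entity_type:
                doc.annotations.append(
                    Annotation(entity_type, token.begin, token.end))


def run_pipeline(doc, components):
    # The "framework" part: call each component in order on the same document.
    for component in components:
        component.process(doc)
    return doc


if __name__ == "__main__":
    result = run_pipeline(
        Document("Alice works for Acme. Bob lives in Paris."),
        [LanguageIdentifier(), Tokenizer(), SentenceSplitter(), EntityDetector()],
    )
    for a in result.annotations:
        if a.type not in ("Token", "Sentence"):
            print(a.type, result.covered_text(a))
```

In this sketch the entity detector only works because the tokenizer ran first and recorded its output in the shared document. That is the kind of dependency between components that UIMA makes explicit through typed, declared metadata.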

TLA-team: A complex framework for NLP production chains that requires strong typing, i.e. the information exchanged between the NLP tools needs to be specified. The effort this requires means it is not easy to use for normal research work.

http://incubator.apache.org/uima/
