OpenNLP NER Extraction pipeline stage
The OpenNLP NER Extraction index pipeline stage performs only Named Entity Recognition (NER). This stage is available in all versions of Fusion AI. For additional NLP functionality, use the NLP Annotator pipeline stages, available in Fusion AI versions 4.2 and later. See below for details.
NLP Annotator pipeline stages (4.2.0 and later)
Fusion AI 4.2 introduced the NLP Annotator as both an index pipeline stage and a query pipeline stage. The NLP Annotator performs a variety of fundamental NLP tasks, and what it processes depends on where it is configured:
- If configured in an index pipeline, the NLP Annotator performs selected NLP tasks on raw document content during the indexing process (see more details here).
- If configured in a query pipeline, the NLP Annotator performs selected NLP tasks on the query text content (see more details here).
NLP features
Fusion’s NLP Annotator pipeline stages include the NLP features described below.
Sentence detection
Sentence detection is the process of analyzing text to determine sentence boundaries. It is typically the first step taken when performing any kind of natural language processing on a document. Commonly, the detected sentences are indexed as a multi-value field that can be used for various purposes, as in these examples:
- Relevancy: Boost documents whose first sentence matches the query terms.
- Snippets: When presenting the search results, display the first few sentences of each document.
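To make the idea concrete, here is a minimal Python sketch of sentence detection feeding a multi-value field and a snippet. It is an illustration only, not how Fusion or OpenNLP detects sentences: the boundary rule is a naive regular expression standing in for a trained sentence model, and the function names and the field names `sentences` and `snippet` are hypothetical.

```python
import re

# Naive boundary rule (an assumption for illustration): a sentence ends at
# '.', '!' or '?' when followed by whitespace and an uppercase letter.
# A trained sentence detector handles abbreviations and other edge cases
# that this rule gets wrong.
_BOUNDARY = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")


def split_sentences(text):
    """Return the sentences of `text` as a list of stripped strings."""
    return [s.strip() for s in _BOUNDARY.split(text) if s.strip()]


def to_document(doc_id, body, snippet_size=2):
    """Build a document dict with hypothetical field names."""
    sentences = split_sentences(body)
    return {
        "id": doc_id,
        "body": body,
        # One value per sentence, i.e. a multi-value field.
        "sentences": sentences,
        # The first few sentences, usable as a search-result snippet.
        "snippet": " ".join(sentences[:snippet_size]),
    }


doc = to_document(
    "doc-1",
    "Fusion indexes documents. Each sentence becomes a field value. "
    "Snippets use the first few.",
)
print(doc["sentences"])
print(doc["snippet"])
```

With the sentences stored separately, a relevancy rule can target `sentences[0]` on its own, and a results page can show `snippet` without re-parsing the body.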
Named Entity Recognition (NER)
Named Entity Recognition is a widely used information extraction technique that identifies named entities in text and classifies them under these predefined classes:
- person
- organization
- location
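The shape of NER output can be sketched in Python as a mapping from each class to the entity strings found in the text. This is a toy under stated assumptions: it uses a hard-coded lookup list (a gazetteer) in place of a trained model such as the ones OpenNLP uses, and the names `GAZETTEER` and `extract_entities` and the sample entries are all hypothetical.

```python
import re

# Hypothetical lookup lists, one per class. A real NER model learns to
# recognize entities from annotated training data instead of a fixed list.
GAZETTEER = {
    "person": ["Ada Lovelace"],
    "organization": ["Apache Software Foundation"],
    "location": ["London"],
}


def extract_entities(text):
    """Return a dict mapping each class to the known names found in `text`."""
    found = {label: [] for label in GAZETTEER}
    for label, names in GAZETTEER.items():
        for name in names:
            # Whole-word match so "London" does not match "Londoner".
            if re.search(r"\b" + re.escape(name) + r"\b", text):
                found[label].append(name)
    return found


entities = extract_entities(
    "Ada Lovelace visited the Apache Software Foundation office in London."
)
print(entities)
```

Each class maps naturally to its own multi-value field on the indexed document, which is what makes entities usable for faceting or filtering.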
