- Raw content is parsed into one or more PipelineDocument objects.
- Any number of intermediate stages operate on the document fields directly, or, in the case of specialized NLP tools, add annotations to a document.
- Finally, the PipelineDocument is sent to Solr for indexing.
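The three steps above can be sketched as a small loop. This is only an illustration under stated assumptions, not the product's API: the `PipelineDocument` dataclass, the `parse`, `trim_body`, `annotate_years`, and `run_pipeline` functions, and the `index` callback (a stand-in for sending the document to Solr) are all hypothetical names.

```python
import re
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Iterable, Iterator, List, Optional


@dataclass
class PipelineDocument:
    """Hypothetical stand-in for a pipeline document: an ID, fields, annotations."""
    doc_id: str
    fields: Dict[str, Any] = field(default_factory=dict)
    annotations: List[Dict[str, Any]] = field(default_factory=list)


# A stage takes a document and returns it (possibly modified), or None to drop it.
Stage = Callable[[PipelineDocument], Optional[PipelineDocument]]


def parse(raw_records: Iterable[str]) -> Iterator[PipelineDocument]:
    """Step 1: turn raw content into one or more documents."""
    for i, raw in enumerate(raw_records):
        yield PipelineDocument(doc_id=f"doc-{i}", fields={"body": raw})


def trim_body(doc: PipelineDocument) -> PipelineDocument:
    """Step 2a: a stage that operates on a document field directly."""
    doc.fields["body"] = doc.fields["body"].strip()
    return doc


def annotate_years(doc: PipelineDocument) -> PipelineDocument:
    """Step 2b: a stage that adds annotations instead of changing fields."""
    for match in re.finditer(r"\b(?:19|20)\d{2}\b", doc.fields["body"]):
        doc.annotations.append(
            {"type": "year", "start": match.start(), "end": match.end(),
             "text": match.group()}
        )
    return doc


def run_pipeline(
    raw_records: Iterable[str],
    stages: List[Stage],
    index: Callable[[PipelineDocument], None],
) -> None:
    """Run every document through the stages in order, then hand it to the indexer."""
    for doc in parse(raw_records):
        current: Optional[PipelineDocument] = doc
        for stage in stages:
            current = stage(current)
            if current is None:  # a stage dropped the document
                break
        if current is not None:
            index(current)  # Step 3: stand-in for indexing into Solr


# Collect "indexed" documents in a list instead of contacting a real Solr instance.
indexed: List[PipelineDocument] = []
run_pipeline(["  Released in 2015.  "], [trim_body, annotate_years], indexed.append)
```

Modeling each stage as a plain function keeps the order of stages explicit and lets a filtering stage drop a document by returning `None`.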
Additional Resources
Available index pipeline stages are listed below:
Document transformation
Document filtering and enrichment
Field transformation
- Date Parsing
- Field Mapping
- Filter Short Fields
- Find and Replace
- GeoIP Lookup
- Regex Field Extraction
- Regex Field Filter
- Regex Field Replacement
- Resolve Multivalued Fields
- Solr Dynamic Field Name Mapping