Index pipeline stage configuration specifications
content.
This not only flattens the document contents but also loses all information about the containing
elements in the document.
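The loss of structure described above can be illustrated with a minimal sketch. It uses only Python's standard library, not Tika or the pipeline itself, and the sample document and function names are hypothetical. Flattening keeps the text but discards the element names that say what each piece of text is.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
SAMPLE_XML = """<library>
  <book>
    <title>Dune</title>
    <author>Frank Herbert</author>
  </book>
  <book>
    <title>Emma</title>
    <author>Jane Austen</author>
  </book>
</library>"""


def flatten(xml_text):
    """Join every text node into one string; element names are discarded."""
    root = ET.fromstring(xml_text)
    return " ".join(t.strip() for t in root.itertext() if t.strip())


def structured(xml_text):
    """Keep one record per <book>, keyed by the child element names."""
    root = ET.fromstring(xml_text)
    return [
        {child.tag: (child.text or "").strip() for child in book}
        for book in root.findall("book")
    ]


flat = flatten(SAMPLE_XML)
records = structured(SAMPLE_XML)

print(flat)     # a single string with no titles or authors marked as such
print(records)  # one dict per book, with "title" and "author" keys
```

In the flattened string there is no way to tell which words are a title and which are an author, which is the information an XML-aware stage is meant to preserve.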
To process XML documents using an XML Transformation stage, the index pipeline must have as its
initial processing stage an Apache Tika Parser index stage that is configured to pass the
document through to the XML Transformation stage as raw XML, via the following configuration:
body.
The pipeline must have a Field Mapping stage after the XML Transformation stage, before the Solr Indexer stage. The Field Mapping stage is used to remove the following fields from the document:
In the mappings, the specification for the xpath attribute of each mapping must include the full path, i.e., the xpath attribute will include the rootXPath. See the example configuration below.
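The full-path rule for mappings can be checked mechanically. The following is a minimal sketch in plain Python, not the stage's own validation logic. The keys rootXPath, mappings, and xpath are taken from the text; the field key, the sample values, and the helper name mappings_missing_root are assumptions made for illustration.

```python
# Hypothetical configuration, written as a plain Python dict.
# "rootXPath", "mappings", and "xpath" come from the text above;
# the "field" key and all values are assumptions for illustration.
stage_config = {
    "rootXPath": "/library/book",
    "mappings": [
        {"xpath": "/library/book/title", "field": "title"},
        {"xpath": "/library/book/author", "field": "author"},
        # Relative path: does not include the rootXPath, so it breaks the rule.
        {"xpath": "author/name", "field": "author_name"},
    ],
}


def mappings_missing_root(config):
    """Return the xpath values that do not begin with the rootXPath."""
    root = config["rootXPath"].rstrip("/")
    bad = []
    for mapping in config["mappings"]:
        xpath = mapping["xpath"]
        # Simple prefix test: the xpath is the root itself or a path below it.
        if not (xpath == root or xpath.startswith(root + "/")):
            bad.append(xpath)
    return bad


problems = mappings_missing_root(stage_config)
print(problems)
```

This is a string prefix check only. It does not parse XPath, so an expression that places a predicate directly on the root element would need a more careful comparison.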
When entering configuration values in the API, use escaped characters, such as \\t for the tab character, rather than the unescaped \t.
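One way to see where the extra backslash comes from is to serialize the value as JSON. This is a sketch under two assumptions that the text does not state: that the API accepts JSON request bodies, and that the setting stores the two-character sequence backslash-t. The key name delimiter is hypothetical.

```python
import json

# The value as typed unescaped: two characters, a backslash followed by "t".
unescaped_value = r"\t"

# Hypothetical key name; JSON encoding doubles the backslash.
body = json.dumps({"delimiter": unescaped_value})
print(body)

# Decoding the JSON gives back the original two-character value.
round_trip = json.loads(body)["delimiter"]
```

Letting a JSON library produce the request body avoids escaping the value by hand.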