The SolrXML connector indexes XML files formatted according to Solr’s XML structure.
Compatible with Fusion version: 4.0.0 through 4.2.6
Deprecation and removal notice
This connector is deprecated as of Fusion 4.2 and is removed or expected to be removed as of Fusion 5.0. Use the Solr V1 connector instead. For more information about deprecations and removals, including possible alternatives, see Deprecations and Removals.
The SolrXML connector is not a generic XML file crawler; it can only index SolrXML-formatted documents. Per the Solr standard, every XML file must include the <add> tag for its documents to be added to the Fusion index.
As described in the Solr Reference Guide section on using Solr’s updateHandlers, an XML document formatted for Solr must conform to a very specific structure. Three general elements are used:
<add> introduces one or more documents to be added to the index.
<doc> introduces the fields that make up a single document.
<field> defines the content for each field of the document.
For example, this is a very simple XML file that contains only one document:
<add>
  <doc>
    <field name="id">doc1</field>
    <field name="title">My Solr Document</field>
    <field name="body">This is the body of my document.</field>
  </doc>
</add>
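Because the connector only accepts SolrXML-formatted files, it can help to check a file's structure before crawling it. The following is a minimal Python sketch of such a pre-flight check, not the connector's own validation logic, which is not documented here. The function name check_solr_xml is hypothetical, and the check is deliberately simplified: it only verifies the <add>, <doc>, and <field> layout described above.

```python
import xml.etree.ElementTree as ET


def check_solr_xml(text):
    """Return a list of structural problems; an empty list means the text looks like SolrXML."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    # The root element must be <add>.
    if root.tag != "add":
        return [f"root element is <{root.tag}>, expected <add>"]

    problems = []
    docs = list(root)
    if not docs:
        problems.append("<add> contains no <doc> elements")
    for i, doc in enumerate(docs, start=1):
        if doc.tag != "doc":
            problems.append(f"child {i} of <add> is <{doc.tag}>, expected <doc>")
            continue
        for field in doc:
            # Only <field> children are checked; anything else is ignored in this sketch.
            if field.tag == "field" and not field.get("name"):
                problems.append(f"a <field> in document {i} has no name attribute")
    return problems


sample = """<add>
  <doc>
    <field name="id">doc1</field>
    <field name="title">My Solr Document</field>
    <field name="body">This is the body of my document.</field>
  </doc>
</add>"""

print(check_solr_xml(sample))           # []
print(check_solr_xml("<docs></docs>"))  # reports the wrong root element
```

An empty list means the file has the expected shape; it does not guarantee that the field names exist in your schema.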
The fields can be any field that is defined in your schema, or you can use dynamic field rules to create fields during indexing.
The elements can take some attributes to define document overwrites, commit rules, and field or document boosts. See the Solr Reference Guide section on XML-formatted updates for more details.
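As an illustration of the attributes mentioned above, here is a minimal Python sketch that generates a SolrXML file with the overwrite and commitWithin attributes on <add>, both of which are part of Solr's XML update format. The helper name build_add and its parameters are hypothetical, and whether the connector honors these attributes in your Fusion version is an assumption you should verify.

```python
import xml.etree.ElementTree as ET


def build_add(docs, overwrite=None, commit_within_ms=None):
    """Serialize a list of {field name: value} dicts as a SolrXML <add> message."""
    add = ET.Element("add")
    # Optional attributes are only emitted when explicitly requested.
    if overwrite is not None:
        add.set("overwrite", "true" if overwrite else "false")
    if commit_within_ms is not None:
        add.set("commitWithin", str(commit_within_ms))
    for fields in docs:
        doc = ET.SubElement(add, "doc")
        for name, value in fields.items():
            field = ET.SubElement(doc, "field", name=name)
            field.text = str(value)  # special characters are escaped on serialization
    return ET.tostring(add, encoding="unicode")


xml_text = build_add(
    [{"id": "doc1", "title": "Q&A <draft>"}],
    overwrite=True,
    commit_within_ms=5000,
)
print(xml_text)
```

Building the file with an XML library rather than string concatenation means characters such as & and < in field values are escaped automatically.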
When entering configuration values in the UI, use unescaped characters, such as \t for the tab character. When entering configuration values in the API, use escaped characters, such as \\t for the tab character.
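One way to read the rule above, assuming the API accepts JSON request bodies, is that the backslash itself must be escaped in JSON so that the decoded value matches what you would type into the UI. The short Python sketch below shows this; the property name delimiter is hypothetical and used only for illustration.

```python
import json

# "delimiter" is a hypothetical property name, not a documented connector setting.
ui_value = r"\t"  # what you would type in the UI: a backslash followed by t

# Serializing to JSON escapes the backslash, producing \\t in the request body.
api_body = json.dumps({"delimiter": ui_value})
print(api_body)  # {"delimiter": "\\t"}

# Decoding the body yields the same two characters that were entered in the UI.
decoded = json.loads(api_body)["delimiter"]
print(decoded == ui_value)  # True
```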