Compatible with Fusion version: 4.0.0 through 5.12.0
The Solr V1 connector pulls documents from an external standalone Apache Solr instance or Apache SolrCloud cluster using Solr’s javabin response type and streaming response parser. The connector is designed to reindex content already managed in a Solr core or collection and make it searchable within Fusion. Use this connector to migrate content from Solr to Fusion for a unified search across multiple Solr-based apps and to enable the use of Fusion’s query pipelines and analytics on Solr-managed data.
Important: V1 deprecation and removal notice
Starting in Fusion 5.12.0, all V1 connectors are deprecated. This means they are no longer being actively developed and will be removed in Fusion 5.13.0.
The replacement for this connector is in active development at this time and will be released at a future date.
If you are using this connector, you must migrate to the replacement connector or a supported alternative before upgrading to Fusion 5.13.0. We recommend migrating to the replacement connector as soon as possible to avoid any disruption to your workflows.
Connector flow
For Solr 4.7 and later, cursorMark deep paging is used. For earlier versions of Solr, standard paging (start + rows) is used. The following Solr components and parameters can be configured:
  • collection/core (also allows default/empty core)
  • query (*:* by default)
  • filter queries
  • query parser
  • request handler (defaults to /select)
  • stored fields to retrieve
Because cursorMark deep paging should be used whenever possible, the sort specification can also be configured:
  • sort spec (default: id asc)
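The cursorMark paging described above can be sketched as follows. This is a minimal illustration of the Solr paging protocol, not the connector's actual implementation: the connector reads javabin responses, while this sketch assumes an already-decoded response dictionary. The `iter_docs` function and the `fetch_page` callable are hypothetical names; `fetch_page` stands in for whatever code sends the parameters to the request handler and returns the parsed response.

```python
from typing import Any, Callable, Dict, Iterator


def iter_docs(
    fetch_page: Callable[[Dict[str, Any]], Dict[str, Any]],
    query: str = "*:*",
    sort: str = "id asc",
    rows: int = 100,
) -> Iterator[Dict[str, Any]]:
    """Yield every matching document using cursorMark deep paging.

    fetch_page is a hypothetical callable that sends the given request
    parameters to Solr and returns the decoded response as a dict.
    """
    cursor = "*"  # Solr's documented starting value for cursorMark
    while True:
        params = {"q": query, "sort": sort, "rows": rows, "cursorMark": cursor}
        body = fetch_page(params)
        yield from body["response"]["docs"]
        next_cursor = body["nextCursorMark"]
        # Solr signals the end of the result set by returning the same
        # cursor value that was sent in the request.
        if next_cursor == cursor:
            break
        cursor = next_cursor
```

The sort specification must include the uniqueKey field (here assumed to be id) for cursorMark to work, which is why the default sort is id asc.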
This connector can be configured to store information about datasources and the ingested data in a ConnectorDB crawldb instance.
This connector has the following limitations:
  • Cannot do incremental crawls. (This may become possible in the future using the _version_ field of the source Solr documents.)
  • Cannot do manual filtered deep paging.
  • Does not check that both the sort spec and the field list contain the uniqueKey field.
  • Cannot handle encrypted connections to Solr.

Prerequisites

Complete these prerequisites to ensure the connector can reliably access, crawl, and index your data. Proper setup helps avoid configuration and permission errors, so use the following guidelines to keep your content available for discovery and search in Fusion.

Confirm Solr availability

  • You need an accessible Apache Solr or SolrCloud deployment that Fusion can reach over HTTP/S.
    • For SolrCloud, make sure the Solr node endpoint is reachable from Fusion, not just ZooKeeper.
  • The connector uses Solr’s /select endpoint to pull documents.
    • Ensure /select is enabled and not blocked by security rules or proxies.
    • The Solr instance should return standard response.docs JSON.
  • For incremental crawling, enable delta indexing.
    • Solr documents must include a timestamp field, such as last_modified, in ISO-8601 or another Solr-readable datetime format.
      • Example: "timestampField": "last_modified"
  • If you want to enforce document-level security, your Solr docs must include user/group fields mapped to Fusion’s ACL fields like _lw_acl_read.
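As an illustration of the timestamp format mentioned in the list above, the following sketch renders a Python datetime in the UTC form that Solr date fields accept (for example, 2024-01-15T10:30:00Z). The function name `to_solr_datetime` is hypothetical, and the sketch assumes the input datetime is timezone-aware.

```python
from datetime import datetime, timezone


def to_solr_datetime(dt: datetime) -> str:
    """Format a timezone-aware datetime as a Solr-readable UTC timestamp."""
    if dt.tzinfo is None:
        # Refuse naive datetimes rather than guess their time zone.
        raise ValueError("expected a timezone-aware datetime")
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


print(to_solr_datetime(datetime(2024, 1, 15, 10, 30, 0, tzinfo=timezone.utc)))
# prints 2024-01-15T10:30:00Z
```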

Decide on queries

  • You must have a valid Solr query to extract data:
    • Use q=*:* to fetch all docs.
    • You can use filters like fq=type:product.
    • Test your query in Solr Admin UI before using it in Fusion.
  • For field mapping, know which Solr fields to extract:
    • Unique ID field such as id
    • Optional timestamp field for incremental crawl
    • Any custom metadata fields
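One way to test a query before entering it in Fusion is to build the equivalent /select URL and open it against your Solr instance. The sketch below only assembles the URL from the q, fq, and fl parameters discussed above; the function name `build_select_url`, the host solr.example.com, and the collection name products are placeholders, not values from this connector.

```python
from typing import Sequence
from urllib.parse import urlencode


def build_select_url(
    base_url: str,
    collection: str,
    query: str = "*:*",
    filter_queries: Sequence[str] = (),
    fields: Sequence[str] = (),
) -> str:
    """Assemble a Solr /select URL for manually testing a query."""
    params = [("q", query)]
    # Each filter query is sent as its own fq parameter.
    params.extend(("fq", fq) for fq in filter_queries)
    if fields:
        # fl takes a comma-separated list of stored fields to return.
        params.append(("fl", ",".join(fields)))
    return f"{base_url.rstrip('/')}/{collection}/select?{urlencode(params)}"


# Placeholder host and collection, for illustration only.
url = build_select_url(
    "http://solr.example.com:8983/solr",
    "products",
    filter_queries=["type:product"],
    fields=["id", "last_modified"],
)
print(url)
```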

Authentication

The Solr V1 connector only works with Solr endpoints that are public or unencrypted, so no authentication is needed.

Configuration

When entering configuration values in the UI, use unescaped characters, such as \t for the tab character. When entering configuration values in the API, use escaped characters, such as \\t for the tab character.
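The difference between the two forms can be illustrated with JSON serialization. This sketch assumes the API request body is JSON and uses a hypothetical property name, delimiter. The value typed in the UI is the two characters backslash and t; serializing it for the API doubles the backslash.

```python
import json

# The value as typed in the UI: a backslash followed by the letter t.
ui_value = r"\t"

# Serializing for a JSON request body escapes the backslash.
# "delimiter" is a hypothetical property name used for illustration.
payload = json.dumps({"delimiter": ui_value})
print(payload)  # prints {"delimiter": "\\t"}
```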