  • Latest version: v1.0.0
  • Compatible with Fusion version: 5.9.0 and later
The Solr Pro connector reindexes content that is managed in an external Solr core or collection and makes it searchable within Fusion. Use this connector to migrate content from an external Solr instance to Fusion for unified search across multiple Solr-based apps, and to enable the use of Fusion’s query pipelines and analytics on Solr-managed data.
The Solr Pro connector does not support reindexing content in your existing Fusion environment.

What are Pro connectors?

Pro connectors are built on the same framework as V2 connectors but meet higher internal standards for stability, reliability, and production readiness. If you’re currently using other V2 connectors, the process for installing and upgrading a Pro connector remains the same.

Solr Pro connector features

The following Solr components and parameters can be configured:
  • collection/core (also allows default/empty core)
  • query (*:* by default)
  • filter queries
  • query parser
  • request handler (defaults to /select)
  • stored fields to retrieve
Because cursorMark deep paging should be used when possible, the following is also configurable:
  • sort spec (default: id asc)
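As a rough illustration of how the settings above correspond to standard Solr request parameters, the following Python sketch assembles a request URL. The function name, the host solr.example.com, and the rows value are assumptions made for the example. This is not the connector's internal implementation.

```python
from urllib.parse import urlencode


def build_select_url(base_url, collection="", query="*:*", filter_queries=(),
                     query_parser=None, handler="/select", fields=(),
                     sort="id asc", rows=100, cursor_mark="*"):
    """Assemble a Solr request URL from the configurable settings.

    Hypothetical helper for illustration only; the defaults mirror the
    defaults listed above (query *:*, handler /select, sort id asc).
    """
    params = [
        ("q", query),
        ("sort", sort),               # cursorMark requires a sort on the unique key
        ("rows", rows),
        ("cursorMark", cursor_mark),  # "*" starts a new cursor
        ("wt", "json"),
    ]
    params += [("fq", fq) for fq in filter_queries]
    if query_parser:
        params.append(("defType", query_parser))
    if fields:
        params.append(("fl", ",".join(fields)))
    # An empty collection/core targets the default core.
    core = f"/{collection}" if collection else ""
    return f"{base_url.rstrip('/')}{core}{handler}?{urlencode(params)}"


url = build_select_url(
    "http://solr.example.com:8983/solr",  # hypothetical host
    collection="products",
    filter_queries=["type:product"],
    fields=["id", "title"],
)
print(url)
```

Testing a URL like this in a browser or the Solr Admin UI is a quick way to confirm that the query, filters, and field list return the documents you expect before you configure the datasource.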
This connector can be configured to store information about datasources and the data ingested in a ConnectorDB crawldb instance.

Comparison to the Solr V1 connector

The Solr Pro connector retains the functionality from the Solr V1 connector. Two new fields have been added:
  • querytimeoutms: The individual query timeout, set in milliseconds. The default value is 30000, or 30 seconds.
  • useCursorMark: Enable or disable the cursor mark, which is used for pagination. This field is enabled by default. If this field is not enabled, then offset pagination is used.
The existing fields and values in the V1 connector carry over to the Pro connector. Additionally, some advanced connection settings and fetch settings that were hard-coded into the V1 connector are now exposed and configurable.
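To show the difference between the two pagination modes, here is a minimal Python sketch of both loops. The fetch argument is a hypothetical callable that sends the given parameters to Solr and returns the parsed JSON response. The connector's real paging code is not shown in this documentation, so treat this only as an outline of the standard Solr technique.

```python
def crawl(fetch, use_cursor_mark=True, rows=100):
    """Yield every matching document, one page at a time.

    fetch(params) -> dict is a hypothetical callable that queries Solr
    and returns the decoded JSON response.
    """
    params = {"q": "*:*", "sort": "id asc", "rows": rows}
    if use_cursor_mark:
        cursor = "*"  # "*" asks Solr to start a new cursor
        while True:
            data = fetch({**params, "cursorMark": cursor})
            yield from data["response"]["docs"]
            next_cursor = data["nextCursorMark"]
            # Solr returns the same mark again once no results remain.
            if next_cursor == cursor:
                break
            cursor = next_cursor
    else:
        start = 0  # offset pagination: the server skips `start` rows each time
        while True:
            docs = fetch({**params, "start": start})["response"]["docs"]
            if not docs:
                break
            yield from docs
            start += rows
```

With offset pagination, Solr has to collect and discard all rows before the offset on every request, so requests tend to get slower as the crawl moves deeper into a large result set. Cursor mark pagination avoids that cost, which is one reason to leave useCursorMark enabled.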
To transform data after it leaves the Solr Pro connector, use the Index Pipeline stages.

Migrate from the Solr V1 connector

We strongly encourage migrating from the Solr V1 connector to the Solr Pro connector. If you are currently using the Solr V1 connector and want to migrate your existing connector settings, consult the Solr migration guide in the Migration Guide series for instructions.

Prerequisites

Before using the Solr Pro connector, ensure that your Solr environment is running Solr 8 or later. Your Fusion environment must be running Fusion 5.9.0 or later.

Confirm Solr availability

You need an accessible external standalone Apache Solr or SolrCloud deployment, because Fusion must be able to reach Solr over HTTP/S. For SolrCloud, make sure the Solr node endpoint is reachable from Fusion, not just ZooKeeper. The connector uses Solr’s /select endpoint to pull documents. Ensure /select is enabled and not blocked by security rules or proxies. The Solr instance should return documents in the standard JSON response format, under response.docs.
  • For incremental crawling, enable delta indexing.
    • Solr documents must include a timestamp field such as last_modified with a format of ISO-8601 or Solr-readable datetime.
      • Example: "timestampField": "last_modified"
  • To enforce document-level security, your Solr docs must include user/group fields mapped to Fusion’s access control list (ACL) fields like _lw_acl_read.
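For the timestamp requirement above, the following Python sketch shows what a Solr range filter on a last_modified field could look like. The helper name and the idea of passing in the time of the previous crawl are assumptions for illustration. This documentation does not describe how the connector builds its delta query internally.

```python
from datetime import datetime, timezone


def delta_filter(timestamp_field, last_crawl):
    """Return a Solr range filter matching documents modified since last_crawl.

    Hypothetical helper: formats the time as an ISO-8601 UTC instant,
    which is the datetime syntax Solr date fields accept.
    """
    stamp = last_crawl.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"{timestamp_field}:[{stamp} TO *]"


previous_run = datetime(2024, 1, 15, 8, 30, tzinfo=timezone.utc)
print(delta_filter("last_modified", previous_run))
# prints: last_modified:[2024-01-15T08:30:00Z TO *]
```

Running a filter like this in the Solr Admin UI is a simple way to confirm that the timestamp field is populated and stored in a format Solr can compare.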

Write valid queries

You must have a valid Solr query to extract data. When writing a query or filter query, omit the q and fq prefixes, because the connector adds them automatically.
  • For queries, use *:* to fetch all documents.
  • You can use filters such as type:product.
  • Test your query in the Solr Admin UI before using it in your datasource settings.
To use field mapping, know which Solr fields to extract:
  • Unique ID field such as id
  • Optional timestamp field for incremental crawl
  • Any custom metadata fields
The Solr Pro connector validates queries before you start a job by executing the query against your Solr environment in a controlled way. If the query generates an error from the server in the validation process, the connector includes detailed and actionable error messages so you can identify and fix errors quickly.
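You can run a similar check yourself before configuring the datasource by sending the query with rows=0 and inspecting the response. The sketch below interprets the JSON body of such a probe. It relies on Solr's standard JSON layout, where a failed request carries an error object with msg and code, and a successful one carries response.numFound. The function name is hypothetical, and this is not the connector's own validation logic.

```python
import json


def check_probe_response(body):
    """Interpret the JSON body returned by a rows=0 probe request.

    Returns (is_valid, message). Hypothetical helper for illustration.
    """
    data = json.loads(body)
    if "error" in data:
        err = data["error"]
        return False, f"Solr rejected the query ({err.get('code')}): {err.get('msg')}"
    found = data["response"]["numFound"]
    return True, f"Query is valid; {found} documents match"


# Sample bodies written by hand for this example, not captured from a real server.
ok_body = '{"response": {"numFound": 42, "start": 0, "docs": []}}'
bad_body = '{"error": {"msg": "undefined field typ", "code": 400}}'

print(check_probe_response(ok_body))
print(check_probe_response(bad_body))
```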

Authentication

Currently, the Solr Pro connector supports connections to unencrypted, public Solr endpoints, so no authentication is needed.

Troubleshooting

This section outlines some common troubleshooting topics.

Query timeout exceeded

If you consistently see the “Query timeout exceeded” error in the Solr connector, your query timeout is likely too low. Increase the Query Timeout value or reduce the Batch Size so the connector can process a batch in a timely manner.

Cursor mark not supported

If the Solr connector returns the “Cursor mark not supported” error, your Solr environment is earlier than Solr 4.7, the first version to support cursor mark pagination. Set Use cursor mark to false, and the Solr connector will use offset pagination.
The Solr Pro connector has been tested on Solr 8 and 9, which support cursor mark pagination.

Slow ingestion

If the Solr connector is ingesting documents slowly, there are several fields to check. If you are indexing many small documents, increase the value of Batch Size to process more documents in one batch. Alternatively, if you are indexing large documents, decrease this value. You can also adjust Max connections per host to a value that suits your needs.

Configuration

When entering configuration values in the UI, use unescaped characters, such as \t for the tab character. When entering configuration values in the API, use escaped characters, such as \\t for the tab character.
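One way to read this rule is that the API accepts JSON, and JSON requires a backslash inside a string to be escaped. The Python sketch below serializes a value as it would be typed in the UI. The property name delimiter is hypothetical and used only for illustration.

```python
import json

# Value as typed in the UI: a backslash followed by the letter t (two characters).
ui_value = r"\t"

# "delimiter" is a hypothetical property name, not a documented connector field.
api_body = json.dumps({"delimiter": ui_value})
print(api_body)
# prints: {"delimiter": "\\t"}
```

Building the request body with a JSON library, rather than by hand, applies this escaping for you.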