
Solr Connector

A Solr connector pulls documents from an external standalone Solr instance or SolrCloud cluster using Solr’s javabin response type and streaming response parser.

For Solr 4.7 and later, the connector uses cursorMark deep paging. For earlier versions of Solr, it uses standard paging (start + rows).
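
The two paging strategies can be sketched as follows. This is an illustration of how cursorMark and start + rows paging generally work against Solr, not the connector's actual implementation. The `fetch` callable is a hypothetical stand-in for whatever sends the request and returns the parsed response as a dictionary, and the function names are invented for this example.

```python
from typing import Any, Callable, Dict, Iterator

# Hypothetical type: takes Solr request parameters, returns the parsed response.
Fetch = Callable[[Dict[str, Any]], Dict[str, Any]]


def iter_docs_cursor(fetch: Fetch, params: Dict[str, Any], rows: int = 100) -> Iterator[dict]:
    """Yield all matching documents using cursorMark deep paging (Solr 4.7+).

    The sort in `params` must include the uniqueKey field, e.g. "id asc".
    """
    cursor = "*"  # starting cursor value defined by Solr
    while True:
        page = fetch({**params, "rows": rows, "cursorMark": cursor})
        yield from page["response"]["docs"]
        next_cursor = page["nextCursorMark"]
        if next_cursor == cursor:  # an unchanged cursor means no more results
            return
        cursor = next_cursor


def iter_docs_start_rows(fetch: Fetch, params: Dict[str, Any], rows: int = 100) -> Iterator[dict]:
    """Yield all matching documents using standard start + rows paging."""
    start = 0
    while True:
        page = fetch({**params, "rows": rows, "start": start})
        docs = page["response"]["docs"]
        yield from docs
        start += len(docs)
        if not docs or start >= page["response"]["numFound"]:
            return
```

With start + rows paging, Solr has to collect and skip all preceding results for every page, which is why cursorMark is preferred for large result sets when the Solr version supports it.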

The following Solr components and parameters can be configured:

  • collection/core (also allows default/empty core)

  • query (*:* by default)

  • filter queries

  • query parser

  • request handler (defaults to /select)

  • stored fields to retrieve

Because cursorMark deep paging is used whenever possible, the sort spec can also be configured:

  • sort spec (default: id asc)
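
As a rough illustration, the settings listed above correspond to standard Solr request parameters (`q`, `fq`, `defType`, `fl`, `sort`) plus the collection and request handler in the URL path. The sketch below assumes that mapping; the function name, argument names, and the localhost URL are hypothetical and are not the connector's own property names.

```python
from urllib.parse import urlencode


def build_select_url(
    base_url,
    collection="",        # empty string means the default/empty core
    handler="/select",    # default request handler
    query="*:*",          # default query
    filter_queries=(),
    query_parser=None,
    fields=(),            # stored fields to retrieve
    sort="id asc",        # default sort spec
):
    """Build a Solr query URL from connector-style settings (illustrative only)."""
    path = base_url.rstrip("/")
    if collection:
        path += "/" + collection
    path += handler

    params = [("q", query)]
    params += [("fq", fq) for fq in filter_queries]  # one fq parameter per filter
    if query_parser:
        params.append(("defType", query_parser))
    if fields:
        params.append(("fl", ",".join(fields)))
    params.append(("sort", sort))
    return path + "?" + urlencode(params)


# Example with a hypothetical local Solr instance and collection name.
url = build_select_url(
    "http://localhost:8983/solr",
    collection="products",
    filter_queries=["inStock:true"],
    fields=["id", "name"],
)
```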

This connector can be configured to store information about datasources and the data ingested in a ConnectorDB crawldb instance.

Limitations

  • Cannot do incremental crawls. (This may become possible in the future by using the _version_ field of the source Solr documents.)

  • Cannot do manual filtered deep paging.

  • Does not check that both the sort spec and the field list contain the uniqueKey field.

  • Cannot handle encrypted connections to Solr.
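
Because the connector does not verify that the uniqueKey field appears in the sort spec and the field list, it can be worth checking a configuration by hand before running a crawl. The helper below is a hypothetical sketch of such a check, not part of the connector. It handles only simple field sorts such as "score desc, id asc" (not function sorts), and it assumes that an empty field list or "*" returns all stored fields, including the uniqueKey.

```python
def missing_unique_key(unique_key, sort_spec, fields):
    """Return which settings ("sort spec", "field list") lack the uniqueKey field."""
    # Take the field name from each "field direction" clause of the sort spec.
    sort_fields = [
        clause.split()[0] for clause in sort_spec.split(",") if clause.strip()
    ]

    problems = []
    if unique_key not in sort_fields:
        problems.append("sort spec")
    # Assumption: an empty list or "*" means all stored fields are returned.
    if fields and "*" not in fields and unique_key not in fields:
        problems.append("field list")
    return problems


# Example: "id" is assumed to be the uniqueKey of the source collection.
print(missing_unique_key("id", "score desc", ["name", "price"]))
```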