Datasources Overview

A collection includes one or more datasources. A datasource is a configuration that manages the import and indexing of data into the collection.

The Index Workbench provides a development environment for creating, configuring, and testing datasources. Every datasource configuration includes the following:

  • Connector configuration, specifying the source and format of the incoming data.

  • Parser configuration, describing a series of conditional parsing stages to transform the incoming data into PipelineDocument objects.

  • Index pipeline configuration, consisting of stages that transform PipelineDocument objects into Solr documents to be indexed.
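
The three parts listed above can be pictured as one object that carries data from source to indexable documents. The following is a minimal sketch of that flow in Python, not the product's actual configuration schema: every class, field, and function name here (ConnectorConfig, ParserStage, DatasourceConfig, parse_csv, add_id_field) is hypothetical and chosen only for illustration, and plain dictionaries stand in for PipelineDocument objects and Solr documents.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# All names below are hypothetical and for illustration only.
# A plain dict stands in for a PipelineDocument or a Solr document.
Document = Dict[str, str]


@dataclass
class ConnectorConfig:
    source: str  # where the incoming data comes from
    format: str  # format of the incoming data


@dataclass
class ParserStage:
    applies_to: Callable[[str], bool]        # condition: does this stage handle the format?
    parse: Callable[[str], List[Document]]   # raw text -> list of documents


@dataclass
class DatasourceConfig:
    connector: ConnectorConfig
    parser_stages: List[ParserStage]
    index_stages: List[Callable[[Document], Document]] = field(default_factory=list)

    def run(self, raw: str) -> List[Document]:
        # Pick the first parser stage whose condition matches the data format.
        stage = next(
            (s for s in self.parser_stages if s.applies_to(self.connector.format)),
            None,
        )
        if stage is None:
            raise ValueError(f"no parser stage for format {self.connector.format!r}")
        docs = stage.parse(raw)
        # Apply each index pipeline stage, in order, to every document.
        for transform in self.index_stages:
            docs = [transform(doc) for doc in docs]
        return docs


def parse_csv(raw: str) -> List[Document]:
    # Naive CSV parsing (no quoting support), enough for the sketch.
    lines = raw.strip().splitlines()
    header = lines[0].split(",")
    return [dict(zip(header, line.split(","))) for line in lines[1:]]


def add_id_field(doc: Document) -> Document:
    # Example index stage: derive an "id" field from the title.
    return {**doc, "id": doc["title"].lower().replace(" ", "-")}


config = DatasourceConfig(
    connector=ConnectorConfig(source="books.csv", format="csv"),
    parser_stages=[ParserStage(lambda fmt: fmt == "csv", parse_csv)],
    index_stages=[add_id_field],
)

docs = config.run("title,author\nMoby Dick,Herman Melville\n")
print(docs)
```

In this sketch the connector settings only describe the data, the parser stage is chosen by a condition on the format, and the index stages run in order on each parsed document. A real datasource is configured through the Index Workbench or the REST API rather than in code like this.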


Collections and datasources can also be managed through the REST API.

In some cases, it makes sense to bypass the connectors and ingest your data by other methods.