Integrating with existing Solr instances

If you have already implemented Solr, as a standalone instance or as a SolrCloud cluster, you can add Fusion to your existing Solr installation and import your Solr collections into Fusion. Each Fusion collection can import one Solr collection.

  • If your existing Solr instance is running in SolrCloud mode, you can use Fusion’s UI to modify configuration files (such as schema.xml or solrconfig.xml) and create Solr collections.

  • If your existing Solr instance is running in standalone mode, you can still connect it to Fusion, but you won’t be able to create Solr collections (Solr cores) or modify configuration files with Fusion’s UI. Fusion can send documents to a standalone Solr instance and query the instance.

Prerequisites

  • Your Solr installation must contain one or more collections (cores).

  • In SolrCloud mode, Solr must be configured to use ZooKeeper.

Integrating Fusion with an existing Solr installation

Configuring an existing Solr installation from the Fusion UI

  1. Create a Fusion search cluster:

    1. In the Fusion UI, navigate to System > Solr Clusters and click New Solr Cluster.

    2. Enter this information:

      • a cluster ID of your choice

      • whether SolrCloud is enabled

      • the connect string (to tell Fusion how to connect to the cluster or instance)

        • For SolrCloud, this is the ZooKeeper connect string.

        • For standalone Solr, this is the URL of the Solr instance.

    3. Verify that the connection is working by clicking Cores in the new cluster and inspecting the contents.

  2. Create a Fusion collection that points to your Solr cluster and collection:

    1. In the UI, navigate to Collections and click Add a Collection.

    2. Enter a name for the new collection.

    3. Click Advanced.

    4. Select your Solr cluster from the dropdown.

    5. Enter the name of the Solr collection to import.
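
The two connect string shapes described in step 1 are easy to mix up, so a quick check before you enter the value in the UI can help. The following is a minimal sketch, not part of Fusion: the function name `describe_connect_string` is hypothetical, and it assumes the usual ZooKeeper form of comma-separated `host:port` pairs with an optional chroot path, and an `http` or `https` URL for standalone Solr.

```python
from urllib.parse import urlparse


def describe_connect_string(connect_string: str, cloud: bool) -> str:
    """Describe a connect string, or raise ValueError if its shape looks wrong.

    Hypothetical helper for illustration; Fusion performs its own validation.
    """
    if cloud:
        # Assumed ZooKeeper form: host1:2181,host2:2181[/chroot]
        hosts, _, chroot = connect_string.partition("/")
        servers = [h.strip() for h in hosts.split(",") if h.strip()]
        if not servers:
            raise ValueError("no ZooKeeper servers listed")
        for server in servers:
            host, sep, port = server.rpartition(":")
            if not sep or not host or not port.isdigit():
                raise ValueError(f"expected host:port, got {server!r}")
        suffix = f" with chroot /{chroot}" if chroot else ""
        return f"ZooKeeper ensemble of {len(servers)} server(s){suffix}"

    # Standalone mode: the connect string is the URL of the Solr instance.
    parsed = urlparse(connect_string)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"expected an http(s) URL, got {connect_string!r}")
    return f"standalone Solr at {parsed.scheme}://{parsed.netloc}{parsed.path}"
```

Passing a Solr URL with `cloud=True` raises `ValueError`, which is the mismatch this check is meant to catch.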

Configuring an existing Solr installation using the Fusion REST API

Use the Search Cluster API to create a Solr cluster.

Then use the Collections API to create and configure a collection.
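
The two calls might look roughly like the sketch below, which only builds the requests and does not send them. The base URL, the endpoint paths (`/searchCluster`, `/collections`), and the JSON field names (`connectString`, `cloud`, `searchClusterId`, `solrParams`) are assumptions that vary by Fusion version, so check them against the API reference for your release. Authentication is omitted.

```python
import json
import urllib.request

# Assumed base URL for the Fusion API; adjust host, port, and path for your install.
FUSION_API = "http://localhost:8764/api/apollo"


def _json_post(url: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search_cluster_request(cluster_id: str, connect_string: str, cloud: bool):
    """Request that registers an existing Solr cluster (assumed field names)."""
    body = {"id": cluster_id, "connectString": connect_string, "cloud": cloud}
    return _json_post(f"{FUSION_API}/searchCluster", body)


def collection_request(collection_id: str, cluster_id: str, solr_collection: str):
    """Request that creates a Fusion collection pointing at a Solr collection."""
    body = {
        "id": collection_id,
        "searchClusterId": cluster_id,
        # Name of the existing Solr collection to import (assumed field name).
        "solrParams": {"name": solr_collection},
    }
    return _json_post(f"{FUSION_API}/collections", body)
```

To send a request, add credentials for your Fusion user and pass the object to `urllib.request.urlopen`. Create the search cluster first, because the collection refers to it by ID.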

Sending Documents to Solr through Fusion

You can use the Fusion connectors to crawl documents and index them to your existing Solr installation.

  1. Follow the steps above to create and configure a search cluster and a collection that points to Solr.

  2. Define an index pipeline that ends with a Solr Indexer stage that sends the documents to Solr.

  3. Use one of these methods to ingest your data:

    • In the collection that points to your Solr collection, define a datasource using the connector of choice.

    • Send prepared documents directly to the index pipeline for processing. See Pushing Documents to a Pipeline.

    • Use another indexing process besides a connector, such as a script that sends documents through the index pipeline.

When documents are sent to Solr, Fusion uses a buffering solrServer. Buffering the updates reduces the number of HTTP requests made from Fusion to Solr, which can significantly reduce processing time. For example, when processing simple documents, buffer as many documents as possible to increase throughput. When processing complex documents, use small batch sizes. Turn buffering off only if you want Fusion to catch and document indexing errors from Solr and you are using an older version of Solr.
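
The effect of batch size on request count can be illustrated with a small client-side sketch. This is not Fusion's internal buffering. It only shows the arithmetic behind the tradeoff, and the helper name `batched` and the sample documents are made up for the example.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def batched(documents: Iterable[dict], batch_size: int) -> Iterator[List[dict]]:
    """Yield lists of at most batch_size documents, preserving order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(documents)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Ten hypothetical documents; each batch would correspond to one HTTP request.
docs = [{"id": str(i)} for i in range(10)]
large_batches = list(batched(docs, 5))  # fewer, larger requests
small_batches = list(batched(docs, 2))  # more, smaller requests
```

With the same ten documents, a batch size of 5 needs two requests and a batch size of 2 needs five. Larger batches favor throughput for simple documents, and smaller batches keep each request light for complex ones.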

Querying Solr via Fusion Requests

Indexed documents are stored in Solr indexes. You can query for these documents by using query pipelines. Query pipelines let you define your query parameters, such as how many records to return, which fields to return, and how to structure facets. You can also add JavaScript to the response processing, and define landing pages or specific boost levels depending on the user’s query. See Query Pipelines.

If you prefer, you can also use the Solr API and SolrAdmin API to query Solr directly.
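
As an illustration of the query parameters mentioned above, the sketch below builds a query string from standard Solr parameters (`q`, `rows`, `fl`, `facet`, `facet.field`). The function name `build_query_string` is hypothetical, and the URL you append the result to depends on your setup: a Solr select handler such as `http://<host>:8983/solr/<collection>/select`, or the endpoint of your Fusion query pipeline.

```python
from typing import Optional, Sequence
from urllib.parse import urlencode


def build_query_string(
    q: str,
    rows: int = 10,
    fields: Optional[Sequence[str]] = None,
    facet_fields: Optional[Sequence[str]] = None,
) -> str:
    """Encode common Solr query parameters as a URL query string."""
    params = [("q", q), ("rows", rows)]
    if fields:
        # fl takes a comma-separated list of fields to return.
        params.append(("fl", ",".join(fields)))
    if facet_fields:
        params.append(("facet", "true"))
        # facet.field may be repeated, once per facet.
        params.extend(("facet.field", name) for name in facet_fields)
    return urlencode(params)
```

Parameters set in a query pipeline stage may override or add to those sent by the client, so the final request that reaches Solr can differ from this string.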