      Search Cluster API

      API Objective: Connect Fusion to other ZooKeeper-managed Solr clusters.

      The Search Cluster API lets you connect Fusion to existing Solr instances in a ZooKeeper-managed cluster.

      Cluster operations are only supported when connecting through ZooKeeper.

      Once the Solr cluster is registered with Fusion, requests can be proxied through Fusion to it. These include search requests as well as indexing requests, such as content crawled by a connector.

      Once the search cluster has been configured, you can create Fusion collections that refer to Solr collections that have already been defined.
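
As a minimal sketch of that last step, the Python below builds (but does not send) a request that would create a Fusion collection tied to a registered search cluster. The `searchClusterId` field and the `/api/v1/collections` path come from the collection examples on this page; the helper name `build_collection_request`, the base URL, and the idea that `searchClusterId` can be supplied at creation time are assumptions to check against your Fusion version.

```python
import json
import urllib.request


def build_collection_request(base_url, collection_id, search_cluster_id):
    """Build a POST request that would create a Fusion collection.

    Assumption: the create call accepts "searchClusterId", mirroring the
    field shown in this page's collection JSON.
    """
    payload = {"id": collection_id, "searchClusterId": search_cluster_id}
    return urllib.request.Request(
        url=f"{base_url}/api/v1/collections",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_collection_request("http://localhost:8765", "products", "mySolrCluster")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# Sending is left to the caller, for example: urllib.request.urlopen(req)
```

Keeping request construction separate from sending makes it easy to inspect the payload before it reaches Fusion.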

      Background

      Solr offers three ways to control when new documents become visible in search:

      • You can commit manually

      • You can rely on Solr’s autoCommit setting

      • You can specify commitWithin when adding documents

      Fusion uses commitWithin so that it does not depend on any particular Solr-side configuration. Fusion controls commitWithin on a per-collection basis, so you can have multiple collections with different commit frequencies (for example, product documents can be committed more often than signals).

      Global setting for commitWithin:

      curl http://localhost:8765/api/v1/configurations/com.lucidworks.apollo.solr.commitWithin
      10000

      com.lucidworks.apollo.solr.commitWithin is a global configuration property that defines the default commitWithin for all documents added through Fusion. Each time you create a new collection in Fusion, its per-collection commitWithin is initialized to the global default.
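
The defaulting rule described above can be sketched in a few lines of Python. This only illustrates the behaviour as stated here and is not Fusion's implementation; the function name `effective_commit_within` and the constant name are hypothetical, and 10000 is simply the value returned in the example above (assumed to be milliseconds, as with Solr's own commitWithin parameter).

```python
# Value of com.lucidworks.apollo.solr.commitWithin in the example above.
GLOBAL_COMMIT_WITHIN = 10000


def effective_commit_within(create_request, global_default=GLOBAL_COMMIT_WITHIN):
    """Return the commitWithin a new collection would start with.

    A value supplied in the create request wins; otherwise the collection
    is initialized from the global default.
    """
    value = create_request.get("commitWithin")
    return global_default if value is None else value


print(effective_commit_within({"id": "test"}))
print(effective_commit_within({"id": "test2", "commitWithin": 20000}))
```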

      Per-collection setting: You can either specify this property when creating the collection or update it later with a PUT request.

      # create collection without specifying commitWithin
      sh> curl -H 'Content-type: application/json' -X POST 'http://localhost:8765/api/v1/collections' -d '{"id" : "test"}'
      {
        "id" : "test",
        ...
        "commitWithin" : 10000,
        ...
      }
      
      # create collection and specify a non-default value
      sh> curl -H 'Content-type: application/json' -X POST 'http://localhost:8765/api/v1/collections' -d '{"id" : "test2", "commitWithin": 20000}'
      {
        "id" : "test2",
        ...
        "commitWithin" : 20000,
        ...
      }
      
      # update commitWithin at runtime
      sh> curl -H 'Content-type: application/json' -X PUT 'http://localhost:8765/api/v1/collections/test' -d '
      {
        "id" : "test",
        "createdAt" : "2015-01-07T17:44:47.396Z",
        "searchClusterId" : "default",
        "commitWithin" : 20000,
        "solrParams" : {
          "name" : "test",
          "numShards" : 1,
          "replicationFactor" : 1
        },
        "type" : "DATA",
        "metadata" : { }
      }'
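
Because the PUT example above sends the complete collection definition, a client will typically fetch the current definition, change one field, and send the whole object back. The sketch below shows that pattern in Python without making any network call. `with_commit_within` is a hypothetical helper, the GET path in the comment is an assumption, and whether Fusion also accepts partial bodies on PUT is not covered here.

```python
import copy
import json


def with_commit_within(collection, commit_within):
    """Return a copy of a collection definition with commitWithin replaced."""
    if not isinstance(commit_within, int) or commit_within < 0:
        raise ValueError("commitWithin must be a non-negative integer")
    updated = copy.deepcopy(collection)  # leave the caller's dict untouched
    updated["commitWithin"] = commit_within
    return updated


# Definition as it might be returned by GET /api/v1/collections/test (assumed path).
current = {
    "id": "test",
    "searchClusterId": "default",
    "commitWithin": 10000,
    "solrParams": {"name": "test", "numShards": 1, "replicationFactor": 1},
    "type": "DATA",
    "metadata": {},
}

body = json.dumps(with_commit_within(current, 20000), indent=2)
print(body)  # usable as the -d payload of the PUT request
```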

      Search Cluster Definition Properties

      Property (required or optional) and description

      id
      Required

      The ID of the search cluster. This is only required when creating a new cluster definition with a POST request.

      connectString
      Required

      The string to use to connect to the existing Solr cluster or standalone instance.

      If the existing Solr is running in SolrCloud mode, use the connect string for the ZooKeeper ensemble.

      If the existing Solr is running as a standalone instance, use the full URL for the Solr instance.

      cloud
      Required

      Specifies whether the "cluster" being defined is a SolrCloud cluster (true) or a standalone Solr instance (false).

      bufferFlushInterval
      Optional

      Defines how often to flush the buffer when writing to this cluster. If not defined, the system will default to 1000 milliseconds.

      bufferSize
      Optional

      Defines the size of the buffer. If not defined, the system will default to 100 items in the buffer.

      concurrency
      Optional

      Defines the maximum number of concurrent (parallel) requests to Solr servers when the Solr Indexer stage of a Fusion index pipeline has the property bufferDocsForSolr set to true.

      zkClientTimeout
      Optional

      The maximum amount of time to wait when communicating with the ZooKeeper ensemble for a SolrCloud instance.

      zkConnectTimeout
      Optional

      The maximum amount of time to wait when trying to connect to the ZooKeeper ensemble for a SolrCloud instance.
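
To make the required and optional properties concrete, here is a small client-side check written as a hedged sketch. It encodes only what the list above states: three required properties for a POST, and the documented defaults of 1000 milliseconds and 100 items for the buffer settings. The helper names are hypothetical, and Fusion's own validation may differ.

```python
REQUIRED = ("id", "connectString", "cloud")
DOCUMENTED_DEFAULTS = {"bufferFlushInterval": 1000, "bufferSize": 100}


def check_search_cluster(definition):
    """Return a list of problems; an empty list means the required fields look fine."""
    problems = [f"missing required property: {name}"
                for name in REQUIRED if name not in definition]
    if "cloud" in definition and not isinstance(definition["cloud"], bool):
        problems.append("cloud must be true or false")
    return problems


def with_documented_defaults(definition):
    """Show the values the list above says apply when buffer settings are omitted."""
    return {**DOCUMENTED_DEFAULTS, **definition}


good = {"id": "mySolrCluster", "connectString": "10.0.1.6:5001", "cloud": True}
bad = {"id": "broken", "cloud": "yes"}

print(check_search_cluster(good))
print(check_search_cluster(bad))
print(with_documented_defaults(good))
```

Running a check like this before the POST catches a missing connectString locally rather than in an error response.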

      Examples

      Register an existing SolrCloud cluster as a new search cluster:

      REQUEST

      curl -u user:pass -X POST -H 'Content-type: application/json' -d '{"id":"mySolrCluster", "connectString":"10.0.1.6:5001,10.0.1.6:5002,10.0.1.6:5003", "cloud":true}' http://fusion-host:6764/api/searchCluster

      RESPONSE

      {
        "id" : "mySolrCluster",
        "connectString" : "10.0.1.6:5001,10.0.1.6:5002,10.0.1.6:5003",
        "cloud" : true
      }

      Create a 'cluster' that is a standalone Solr instance:

      REQUEST

      curl -u user:pass -X POST -H 'Content-type: application/json' -d '{"id":"myOtherSolrCluster", "connectString":"http://fusion-host:8983/solr", "cloud":false}' http://fusion-host:6764/api/searchCluster

      RESPONSE

      {
        "id" : "myOtherSolrCluster",
        "connectString" : "http://fusion-host:8983/solr",
        "cloud" : false
      }
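
The two curl requests above differ only in the connect string and the cloud flag. A rough Python equivalent, assuming HTTP Basic authentication (which is what curl's `-u user:pass` sends by default), might look like the following. The function name is hypothetical and nothing is sent over the network.

```python
import base64
import json
import urllib.request


def build_search_cluster_request(base_url, user, password,
                                 cluster_id, connect_string, cloud):
    """Build the POST request that registers a search cluster definition."""
    payload = {"id": cluster_id, "connectString": connect_string, "cloud": cloud}
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url}/api/searchCluster",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


cloud_req = build_search_cluster_request(
    "http://fusion-host:6764", "user", "pass",
    "mySolrCluster", "10.0.1.6:5001,10.0.1.6:5002,10.0.1.6:5003", True)
standalone_req = build_search_cluster_request(
    "http://fusion-host:6764", "user", "pass",
    "myOtherSolrCluster", "http://fusion-host:8983/solr", False)

print(cloud_req.data.decode("utf-8"))
print(standalone_req.data.decode("utf-8"))
```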

      Show the status of each node of 'mySolrCluster':

      REQUEST

      curl http://fusion-host:6764/api/searchCluster/mySolrCluster/nodes

      RESPONSE

      [ {
        "name" : "10.0.1.11:7574_solr",
        "baseUrl" : "http://10.0.1.11:7574/solr",
        "state" : "active"
      }, {
        "name" : "10.0.1.8:7574_solr",
        "baseUrl" : "http://10.0.1.8:7574/solr",
        "state" : "active"
      } ]
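
A response like this is easy to turn into a simple health check. The sketch below parses a nodes response and lists any node whose state is not "active". The sample data reuses the node names above, but the "down" state is a made-up value for illustration; the set of states Fusion actually reports is not listed on this page.

```python
import json

# Sample response; the second node's "down" state is hypothetical.
nodes_json = """
[ {"name": "10.0.1.11:7574_solr", "baseUrl": "http://10.0.1.11:7574/solr", "state": "active"},
  {"name": "10.0.1.8:7574_solr", "baseUrl": "http://10.0.1.8:7574/solr", "state": "down"} ]
"""


def inactive_nodes(nodes):
    """Return the names of nodes whose state is anything other than "active"."""
    return [node["name"] for node in nodes if node.get("state") != "active"]


print(inactive_nodes(json.loads(nodes_json)))
```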

      Show the system information for one named node:

      REQUEST

      curl 'http://10.0.1.8:6764/api/searchCluster/mySolrCluster/systemInfo?nodeName=10.0.1.8:7574_solr'

      RESPONSE

      {
        "10.0.1.8:7574_solr" : {
          "mode" : "solrcloud",
          "lucene" : {
            "solr-spec-version" : "4.8.0",
            "lucene-spec-version" : "4.8.0"
          },
          "jvm" : {
            "version" : "1.8.0_121 25.121-b13",
            "name" : "Oracle Corporation Java HotSpot(TM) 64-Bit Server VM",
            "processors" : 4,
            "memory" : {
              "raw" : {
                "free" : 66736272,
                "total" : 204800000,
                "max" : 204800000,
                "used" : 138063728,
                "used%" : 67.4139296875
              }
            }
          },
          "system" : {
            "name" : "Mac OS X",
            "version" : "10.9.3",
            "arch" : "x86_64",
            "systemLoadAverage" : 2.130859375,
            "committedVirtualMemorySize" : 2963378176,
            "freePhysicalMemorySize" : 9321914368,
            "freeSwapSpaceSize" : 1073741824,
            "processCpuTime" : 313176000000,
            "totalPhysicalMemorySize" : 17179869184,
            "totalSwapSpaceSize" : 1073741824,
            "openFileDescriptorCount" : 208,
            "maxFileDescriptorCount" : 10240,
            "uname" : "Darwin MacMini.local 13.2.0 Darwin Kernel Version 13.2.0: Thu Apr 17 23:03:13 PDT 2014; root:xnu-2422.100.13~1/RELEASE_X86_64 x86_64\n",
            "uptime" : "15:48  up 3 days,  7:08, 7 users, load averages: 2.13 2.01 1.91\n"
          }
        }
      }
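
As a hedged sketch of how a client might work with this endpoint, the Python below builds the systemInfo URL with an encoded nodeName parameter and recomputes the JVM memory percentage from the raw figures. In the sample response, "used%" is consistent with used / total * 100, but that relationship is inferred from the numbers rather than documented. The function names are hypothetical.

```python
from urllib.parse import urlencode


def system_info_url(base_url, cluster_id, node_name):
    """Build the systemInfo URL, percent-encoding the node name."""
    query = urlencode({"nodeName": node_name})
    return f"{base_url}/api/searchCluster/{cluster_id}/systemInfo?{query}"


def memory_used_percent(node_info):
    """Recompute the used percentage from the raw JVM memory figures."""
    raw = node_info["jvm"]["memory"]["raw"]
    return raw["used"] / raw["total"] * 100


# Trimmed copy of the sample response for one node.
node_info = {
    "jvm": {"memory": {"raw": {
        "free": 66736272, "total": 204800000,
        "max": 204800000, "used": 138063728,
    }}}
}

print(system_info_url("http://10.0.1.8:6764", "mySolrCluster", "10.0.1.8:7574_solr"))
print(round(memory_used_percent(node_info), 2))
```

Encoding the query string in code avoids the shell-quoting issues that a raw URL containing `?` can cause on the command line.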