Remote Connectors

The Fusion connector architecture is designed to scale. Depending on whether the connector is a V1 or a V2 (SDK) connector, jobs can be scaled by adding new instances of just the connector. These connectors also support distributed fetching, so many instances can contribute to the same job.

At this time, Fusion 5.0.x does not support remote connectors.

SDK connectors can be hosted within Fusion Server, or can run remotely. In the hosted case, these connectors are cluster aware. This means that when a new instance of Fusion starts up, the connectors on other Fusion nodes become aware of the new connector, and vice versa. This makes scaling connector jobs simple.

In the remote case, a connector becomes a client of Fusion. This remote client runs a lightweight process and communicates with Fusion using an efficient messaging format. This option makes it possible to put the connector wherever the data lives, whether for performance, security, or access reasons.

The default SDK connector service is connectors-rpc. By default, connectors-rpc runs on port 8771. This service handles connector registration, configuration management, job management, and cluster coordination. Like other Fusion services, it also provides access to non-connector clients.
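Before starting a remote client, it can help to confirm that the connectors-rpc port is reachable from the remote node. The following is a minimal sketch, assuming that the remote client needs network access to connectors-rpc on its default port and that an nc (netcat) build supporting the -z and -w options is installed. The function names and the DEFAULT_RPC_PORT variable are hypothetical and not part of Fusion.

```shell
#!/bin/sh
# Sketch only: the names below are hypothetical helpers, not Fusion tooling.
# DEFAULT_RPC_PORT mirrors the default connectors-rpc port stated above.
DEFAULT_RPC_PORT=8771

# Print "host:port", falling back to the default port when none is given.
rpc_endpoint() {
  printf '%s:%s\n' "$1" "${2:-$DEFAULT_RPC_PORT}"
}

# Return 0 when a TCP connection to the endpoint succeeds within 5 seconds.
# Assumes an nc (netcat) build that supports -z and -w.
rpc_port_open() {
  nc -z -w 5 "$1" "${2:-$DEFAULT_RPC_PORT}"
}
```

For example, `rpc_port_open fusion.example.com && echo "reachable: $(rpc_endpoint fusion.example.com)"` reports the endpoint only if the port answers. The host name here is a placeholder.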

The connector client

Fusion comes with a connector client that remote connectors can use to communicate with Fusion. It is located at FUSION_HOME/apps/connectors/connectors-rpc/client/connector-plugin-client-{fusionVersion}.x-uberjar.jar.

To run the connector client, you must have a .zip file containing exactly one connector plugin. Visit our Connector Downloads page to obtain a copy of the available V2 connectors.

Basic connector client usage

To start a connector client on the remote node (for example, the datasource), do the following:

  1. Copy the connector uberjar from Fusion Server onto the remote node. The connector uberjar is located at FUSION_HOME/apps/connectors/connectors-rpc/client/connector-plugin-client-{fusionVersion}.x-uberjar.jar.

  2. On the remote node, run:

    java -jar path/to/uberjar/connector-plugin-client-{fusionVersion}-uberjar.jar path/to/connector/
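The launch step above can be wrapped in a small pre-flight check so that a missing file is reported before the JVM starts. This is a sketch under stated assumptions, not a supported Fusion script: the function names are hypothetical, and the only behavior taken from the steps above is the argument order of the java -jar command.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers, not part of Fusion.

# Print an error and return non-zero unless both paths exist.
check_client_inputs() {
  jar_path=$1
  plugin_path=$2
  if [ ! -f "$jar_path" ]; then
    echo "client uberjar not found: $jar_path" >&2
    return 1
  fi
  if [ ! -e "$plugin_path" ]; then
    echo "connector plugin not found: $plugin_path" >&2
    return 1
  fi
  return 0
}

# Launch the client with the same argument order as the command above:
# first the uberjar, then the path to the connector plugin.
start_connector_client() {
  check_client_inputs "$1" "$2" || return 1
  java -jar "$1" "$2"
}
```

After sourcing the file, call `start_connector_client` with the uberjar path and the connector path used in step 2.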

Known issues

  • Registering a plugin instance during a crawl can result in errors. Only connect plugins when no jobs are running.

  • To connect a plugin from a remote instance, you must manually set the default.address value in Fusion. This host value is used with the property com.lucidworks.fusion.plugin.hosts. For example, supply the host value from the FUSION_HOME/conf/fusion.cors file after the equals sign:

    java -Dcom.lucidworks.fusion.plugin.hosts= -jar path/to/uberjar/connector-plugin-client-{version}-uberjar.jar path/to/connector/
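Because the command above is easy to run with an empty host value, a wrapper can refuse to start until one is supplied. The following is a minimal sketch: the function names are hypothetical, and the only detail taken from the text is the com.lucidworks.fusion.plugin.hosts property name and the order of the remaining arguments.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers, not part of Fusion.

# Print the JVM option for the plugin hosts property, or fail if no host is given.
plugin_hosts_option() {
  if [ -z "$1" ]; then
    echo "a host value is required" >&2
    return 1
  fi
  printf '%s\n' "-Dcom.lucidworks.fusion.plugin.hosts=$1"
}

# Start the client with the hosts property set.
# Arguments: host value, path to the uberjar, path to the connector plugin.
start_client_with_hosts() {
  hosts_option=$(plugin_hosts_option "$1") || return 1
  java "$hosts_option" -jar "$2" "$3"
}
```

The host argument should be the same value that was set as default.address in Fusion.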