Connectors Configuration Reference

Connectors are the conduit between Fusion and your external data sources. Connectors retrieve your data and import it into Fusion Server.

Initially, Fusion offered classic, or V1, connectors. V1 connectors were developed with a general-purpose crawler framework called Anda, created by Lucidworks. Anda simplifies and streamlines crawler development, reducing the task of developing a new crawler to providing access to your data.

As of version 4.1.0, Fusion began offering V2 connectors, which utilize a Java SDK framework.

The V2 platform version is included by default for every connector for which it is available. Currently, three connectors are offered in the V2 platform version: Local Filesystem, OneDrive, and Sitecore. The Local Filesystem connector is also offered in V1 upon request. The OneDrive and Sitecore connectors are offered only in V2, having been developed after the V2 platform version became available.

In addition to the features and benefits provided by V1 connectors, V2 connectors offer:

  • Security access control lists (ACLs) that are separate from content

  • Support for SSL/TLS security

  • Improved scalability, depending on the connector

    • Jobs can be scaled by simply adding instances of the connector

    • The fetching process supports distributed fetching, allowing many instances to contribute to the same job

  • Connectors can be hosted within Fusion, or can run remotely

    • Hosted connectors are cluster-aware, allowing connectors on separate nodes to become aware of new connectors

    • Remote connectors become clients of Fusion; they run a lightweight process and communicate with Fusion using an efficient messaging format

    • Remote connectors can be located wherever the data is located, which might be required for performance or security and access

  • Google’s fast and efficient framework gRPC is used as the underlying client/server technology

    • Increased flexibility in the way services and their methods are defined

    • HTTP/2 based transport

    • Efficient serialization format for data handling (protocol buffers)

    • Allows bi-directional/multiplexed streams

See Connectors for information about built-in connectors and how to install and configure connectors.

Visit our Connector Resources page to obtain a copy of any of the available connectors. Please note that the JDBC, File Upload, Local Filesystem and Web connectors are already installed in the Fusion package.

Database connectors

| Name | Description | Source Compatibility | Platform Version | Security Trimming |
| --- | --- | --- | --- | --- |
| Couchbase | The Couchbase connector uses the Cross-Datacenter Replication (XDCR) feature of Couchbase to retrieve data stored in Couchbase continuously in real-time. | Couchbase Server 2.5.1 Enterprise Edition | V1 | No |
| JDBC | The JDBC connector fetches documents from a relational database via SQL queries. Under the hood, this connector implements the Solr DataImportHandler (DIH) plugin. | Any JDBC compliant Database | V1 | No |
| MongoDB | Retrieve data from a MongoDB instance. | MongoDB 2.6, 3.0, 3.2, 3.4 | V1 | No |

Filesystem connectors

| Name | Description | Source Compatibility | Platform Version | Security Trimming |
| --- | --- | --- | --- | --- |
| Box.com | The Box connector retrieves data from a Box.com cloud-based data repository. To fetch content from multiple Box users, you must create a Box app that uses OAuth 2.0 with JWT server authentication. For limited testing using a single user account, you can create a Box app that uses Standard OAuth 2.0 authentication. | N/A | V1 | Yes |
| File Upload | The File Upload connector provides a convenient way to quickly ingest data from your local filesystem. It’s available in the Quickstart wizard in addition to the Index Workbench and the Datasources page. | N/A | V1 | No |
| FTP | Retrieve documents using the File Transfer Protocol (FTP). | N/A | V1 | No |
| Google Drive | The Google Drive connector is used to index the documents in a Google Drive account. | Google V3 API | V1 | Yes |
| HDFS | The HDFS connector retrieves data from the Hadoop Distributed File System (HDFS). It traverses the Hadoop file system as it would a regular Unix filesystem. | 2.7.1 | V1 | No |
| Local Filesystem V2; Local Filesystem V1 (deprecated) | This connector traverses a network file system (NFS), where a shared drive is mounted to the same location on all hosts in the cluster that are running this connector. | All | V1, V2 | No |
| OneDrive | OneDrive is a file hosting service that is part of the Microsoft Office Online services. The Fusion OneDrive connector crawls a OneDrive for Business instance and retrieves data from it for indexing within Fusion. | All | V2 | Yes |
| S3 | The S3 connector can access AWS S3 buckets in native format. | All | V1 | No |
| SolrXML | The SolrXML connector indexes XML files formatted according to Solr’s XML structure. It is not a generic XML file crawler; it can only index SolrXML-formatted documents. | All | V1 | No |
| Windows Share (deprecated) | The Windows Share connector can access content in a Windows Share or Server Message Block (SMB)/Common Internet File System (CIFS) filesystem. Note: This connector is deprecated as of Fusion Server 4.0.2. | SMB 1 protocol | V1 | Yes |
| Windows Share SMB2/3 | The Windows Share connector can access content in a Windows Share or Server Message Block (SMB 2 and 3 protocols)/Common Internet File System (CIFS) filesystem. Available for Fusion Server version 4.0.2 and later. | SMB 2, 3 protocols | V1 | Yes |

Hadoop cluster connectors

| Name | Description | Source Compatibility | Platform Version | Security Trimming |
| --- | --- | --- | --- | --- |
| Apache Hadoop 2 | The Apache Hadoop 2 Connector is a MapReduce-enabled crawler that is compatible with Apache Hadoop v2.x. | 2.x | V1 | No |
| Cloudera | The Cloudera Connector is a MapReduce-enabled crawler that is compatible with Cloudera CDH v4.x and v5.x. | 4.x, 5.x | V1 | No |
| Hortonworks | The Hortonworks Connector is a MapReduce-enabled crawler that is compatible with Hortonworks Data Platform v2.x. | 2.x | V1 | No |
| MapR | The MapR Connector is a MapReduce-enabled crawler that is compatible with MapR v4.x, v5.x. | 4.x, 5.x | V1 | No |

Push content connectors

| Name | Description | Source Compatibility | Platform Version | Security Trimming |
| --- | --- | --- | --- | --- |
| Solr Push Endpoint | The Solr Push Endpoint accepts documents and pushes them to Solr using the Fusion index pipelines. You might use this, for example, if you are indexing Solr XML documents from a content management system that natively integrates with Solr, for example using SolrJ. | All | V1 | No |

Repository connectors

| Name | Description | Source Compatibility | Platform Version | Security Trimming |
| --- | --- | --- | --- | --- |
| Active Directory Connector for ACLs | The Active Directory Connector for ACLs indexes Access Control List (ACL) information into a configured "sidecar" Solr collection, so that it can be used by other connectors. | N/A | V1 | Yes |
| Alfresco | The Alfresco Connector is a crawler for the Alfresco server, which adheres to the Content Management Interoperability Services (CMIS) standard. | CMIS 1.1 compliant versions | V1 | Yes |
| Azure | The Azure connector is used to crawl an Azure instance. Its connector type is "lucid.azure" and its plugin type is "azure". | Blob and Table storage | V1 | No |
| Confluence | Retrieve data from the Atlassian Confluence Wiki CMS. You can configure this datasource to crawl pages, spaces, blog posts, comments, and attachments. | Confluence Server 5.5 and later; Confluence Cloud | V1 | Yes |
| Drupal | This connector uses Drupal’s Services 7.x-3.11 module REST API. | Drupal 7.x | V1 | No |
| GitHub | The GitHub connector retrieves data from GitHub repositories using the GitHub REST API. | N/A | V1 | No |
| JIRA | The JIRA connector retrieves data from an instance of Atlassian’s JIRA issue tracking system. | 6.x, 7.x | V1 | Yes |
| Salesforce | The Salesforce connector uses the Salesforce REST API to extract data from a Salesforce repository via a Salesforce Connected App. | N/A | V1 | Yes |
| ServiceNow | The ServiceNow datasource retrieves data from a ServiceNow repository via the ServiceNow REST API. ServiceNow records are stored in named tables. | N/A | V1 | Yes |
| SharePoint V1 Optimized | | 2010, 2013, 2016, Online | V1 | Yes |
| SharePoint (Deprecated after 4.2.3) | The SharePoint connector retrieves content and metadata from an on-premises SharePoint repository. | 2010, 2013, 2016, Online | V1 | Yes |
| SharePoint Online V1 Optimized | | N/A | V1 | Yes |
| SharePoint Online (Deprecated after 4.2.3) | The SharePoint Online connector retrieves data from cloud-based SharePoint repositories. Authentication requires a SharePoint user who has permissions to access SharePoint via the SOAP API. This user must be registered with the SharePoint Online authentication server; it is not necessarily the same as the user in Active Directory or LDAP. | N/A | V1 | Yes |
| Solr Index | A Solr connector pulls documents from an external standalone Solr instance or SolrCloud cluster using Solr’s javabin response type and streaming response parser. | All | V1 | No |
| Subversion | This connector requires a Subversion client that is compatible with JavaHL. | 1.8 and below | V1 | No |
| Zendesk | The Zendesk connector uses the Zendesk REST API to retrieve tickets and their associated comments and attachments from a Zendesk repository. | N/A | V1 | Yes |

Script connectors

| Name | Description | Source Compatibility | Platform Version | Security Trimming |
| --- | --- | --- | --- | --- |
| Javascript | The Javascript connector allows users to write ad-hoc document retrieval routines to fetch content from filesystems and websites. | All | V1 | No |

Social media connectors

| Name | Description | Source Compatibility | Platform Version | Security Trimming |
| --- | --- | --- | --- | --- |
| Jive | Retrieve content from a Jive instance. | REST API +V3.12 | V1 | Yes |
| Slack | The Slack connector is used to retrieve data from a Slack service. The connector sends requests to the Slack REST API. | All | V1 | Yes |
| Twitter Search | The Twitter Search connector uses Twitter’s search API to query Twitter for tweets that match specific parameters. It allows querying for any keyword, location or other query terms. | All | V1 | No |
| Twitter Stream | The Twitter Stream connector uses Twitter’s streaming API to continually index Twitter. The datasource can be configured to limit tweets or it can be run indefinitely, until Twitter cuts off your access or you stop the datasource. This connector only retrieves tweets created after the datasource has been started. | All | V1 | No |

Web connectors

| Name | Description | Source Compatibility | Platform Version | Security Trimming |
| --- | --- | --- | --- | --- |
| Web | The Web connector retrieves data from a Web site using HTTP and starting from a specified URL. | N/A | V1 | No |

Installing a connector

Connectors are installed by uploading them to the blob store. You can install connectors:

  • By installing connectors as "bootstrap plugins", that is, by putting them in the bootstrap-plugins directory during initial installation or an upgrade

  • By using the Fusion UI after installation or an upgrade

  • By using the Blob Store API after installation or an upgrade

Note
During upgrades, the migrator handles some aspects of installing connectors. Depending on the target version and the presence or absence of an Internet connection, there might be manual steps. Installing connectors during upgrades is explained where needed in the upgrade procedures.

Installing a connector as a bootstrap plugin

Fusion can install connectors as "bootstrap plugins." All this means is that you put the connector zip files in a specific directory named bootstrap-plugins, and Fusion installs the connectors the first time it starts during initial installation or an upgrade.

How to install a connector as a bootstrap plugin
  1. Download the connector zip file from http://lucidworks.com/connectors/.

    Don’t expand the archive; Fusion consumes it as-is. Also, don’t start Fusion until instructed to do so by the installation or upgrade instructions.

  2. Under the version-numbered Fusion directory, place the connector in the directory apps/connectors/bootstrap-plugins/ (on Unix) or \apps\connectors\bootstrap-plugins\ (on Windows).

  3. Start Fusion when instructed to do so in the installation or upgrade procedure.
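
Steps 1 and 2 above can be sketched as a small shell function. This is an illustrative sketch, not part of Fusion: the function name `stage_bootstrap_plugin` and its arguments are hypothetical, and it only assumes the `apps/connectors/bootstrap-plugins` directory layout described in step 2.

```shell
#!/bin/sh
# Sketch: copy a connector zip, unexpanded, into the bootstrap-plugins directory.
# Usage: stage_bootstrap_plugin <connector.zip> <version-numbered Fusion directory>
# The function name and arguments are hypothetical, chosen for illustration.
stage_bootstrap_plugin() {
  zip_file=$1
  fusion_dir=$2
  target_dir="$fusion_dir/apps/connectors/bootstrap-plugins"

  # Fusion consumes the archive as-is, so accept only an existing .zip file.
  case "$zip_file" in
    *.zip) ;;
    *) echo "error: expected a .zip file: $zip_file" >&2; return 1 ;;
  esac
  [ -f "$zip_file" ] || { echo "error: no such file: $zip_file" >&2; return 1; }

  # Do not guess the layout: fail if the directory is not where step 2 says it is.
  [ -d "$target_dir" ] || { echo "error: missing directory: $target_dir" >&2; return 1; }

  cp "$zip_file" "$target_dir/" && echo "staged $(basename "$zip_file") in $target_dir"
}
```

For example, `stage_bootstrap_plugin ./myplugin.zip /opt/fusion/4.1.0` would copy the archive into place, where both paths are placeholders for your own download location and Fusion directory. Fusion itself should still be started only when the installation or upgrade procedure says so.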

Installing a connector using the Fusion UI

  1. Download the connector zip file from http://lucidworks.com/connectors/.

    Do not expand the archive; Fusion consumes it as-is.

  2. In the Fusion workspace, navigate to System > Blobs.

  3. Click Add.

  4. Select Connector Plugin.

    The "New Connector Plugin Upload" panel appears.

  5. Click Choose File and select the downloaded zip file from your file system.

  6. Click Upload.

    The new connector’s blob manifest appears.

    From this screen you can also delete or replace the connector.

Installing a connector using the Blob Store API

  1. Download the connector zip file from http://lucidworks.com/connectors/.

    Do not expand the archive; Fusion consumes it as-is.

  2. Upload the connector zip file to Fusion’s blob store.

    Specify an arbitrary blob ID, and a resourceType value of plugin:connector, as in this example:

    curl -u user:pass -H 'content-type:application/zip' -X PUT 'http://localhost:8764/api/blobs/myplugin?resourceType=plugin:connector' --data-binary @myplugin.zip

    Fusion automatically publishes the event to the cluster, and the listeners perform the connector installation process on each node.

    Tip
    If the blob ID is identical to an existing one, the old connector will be uninstalled and the new connector will be installed in its place. To get the list of existing blob IDs, run: curl -u user:pass http://localhost:8764/api/blobs
  3. Look in fusion/4.1.x/apps/connectors/plugins/ to verify that the new connector is installed.
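
The upload in step 2 can be wrapped in a reusable function, sketched below. The names `upload_connector`, `FUSION_URL`, `FUSION_AUTH`, and `DRY_RUN` are hypothetical and not part of Fusion; the request is the same PUT to `/api/blobs/<id>?resourceType=plugin:connector`, with `-u` credentials added in the style of the other curl examples on this page.

```shell
#!/bin/sh
# Sketch: upload a connector zip to the blob store.
# FUSION_URL, FUSION_AUTH and DRY_RUN are hypothetical variable names.
FUSION_URL=${FUSION_URL:-http://localhost:8764}
FUSION_AUTH=${FUSION_AUTH:-user:pass}

# Usage: upload_connector <blob-id> <connector.zip>
upload_connector() {
  blob_id=$1
  zip_file=$2
  url="$FUSION_URL/api/blobs/$blob_id?resourceType=plugin:connector"

  if [ -n "$DRY_RUN" ]; then
    # Print the request instead of sending it, so it can be reviewed first.
    printf '%s\n' "curl -u $FUSION_AUTH -H 'content-type:application/zip' -X PUT '$url' --data-binary @$zip_file"
    return 0
  fi

  # --fail makes curl exit non-zero when the server returns an HTTP error status.
  curl --fail -u "$FUSION_AUTH" -H 'content-type:application/zip' \
       -X PUT "$url" --data-binary "@$zip_file"
}
```

Setting `DRY_RUN=1` before calling `upload_connector myplugin myplugin.zip` prints the curl command without contacting Fusion. Because reusing an existing blob ID replaces the installed connector, checking the ID against the current blob list before a real upload is a reasonable precaution.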

Updating a connector

On Unix, you can update a connector by simply uploading the new one. Fusion overwrites the old one, and no restart is needed.

On Windows, a different procedure is needed:

How to update a Fusion connector on Windows
  1. Delete the old connector, as explained below.

  2. Restart Fusion.

  3. Upload the new connector.

Deleting a connector

You can delete a connector using the Fusion UI or the Blob Store API.

Deleting a connector using the Fusion UI

  1. In the Fusion UI, navigate to System > Blobs.

  2. Under Connector Plugin, select the connector to delete.

  3. Click Delete Blob.

    Fusion prompts you to confirm that you want to delete the blob.

  4. Click Yes, Delete.

    The connector disappears from the blob list.

Deleting a connector using the Blob Store API

  1. Get the list of blobs of the connector plugin type:

    curl -u user:pass 'http://localhost:8764/api/blobs?resourceType=plugin:connector'
  2. Locate the connector you want to delete, and copy its ID.

    For example, the Jive connector ID is lucid.jive:

    {
      "name" : "lucid.jive",
      "contentType" : "application/zip",
      "size" : 125302,
      "modifiedTime" : "2017-06-13T17:49:20.171Z",
      "version" : 1570112704530612224,
      "md5" : "7032bf2c038bb2d1e27aee82c056c0fb",
      "metadata" : {
        "connectorBootstrapPluginName" : "lucid.jive",
        "resourceType" : "plugin:connector"
      }
    }
  3. Delete the connector as follows:

    curl -u user:pass -X DELETE http://localhost:8764/api/blobs/<id>

    For example:

    curl -u user:pass -X DELETE http://localhost:8764/api/blobs/lucid.jive

    A null response indicates success. You can verify that the connector is deleted like this:

    curl -u user:pass http://localhost:8764/api/blobs | grep lucid.jive
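
To make step 2 less manual, the blob listing can be reduced to just the connector IDs. The sketch below is illustrative only: `blob_names` is a hypothetical helper, and it assumes that each blob object in the listing carries a `"name"` field formatted like the one in the Jive example above.

```shell
#!/bin/sh
# Sketch: print the value of every "name" field in a blob listing read from stdin.
# Assumes each blob object has a "name" field as in the lucid.jive example;
# "connectorBootstrapPluginName" is not matched because the pattern requires
# the exact, case-sensitive key "name".
blob_names() {
  grep -o '"name"[[:space:]]*:[[:space:]]*"[^"]*"' |
    sed 's/.*:[[:space:]]*"\(.*\)"/\1/'
}

# Intended use against a running Fusion instance (not executed here):
#   curl -u user:pass 'http://localhost:8764/api/blobs?resourceType=plugin:connector' | blob_names
```

Each printed line is a blob ID that can be passed to the DELETE request in step 3. A JSON-aware tool would be more robust than `grep` and `sed` if one is available in your environment.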