Prerequisites
- Access to a Fusion instance with the appropriate permissions to configure a data source.
- Supported SharePoint deployment (2010, 2013, 2016, 2019, or Online).
- A service account with site collection administrator rights for on-premises SharePoint.
- An app registration with the `Sites.FullControl.All` permission for SharePoint Online.
- Azure Active Directory app registration with required permissions (for SharePoint Online).
Verify your connector version
This connector depends on specific Fusion versions. See the following table for the required versions:
Fusion version | Connector version | Notes |
---|---|---|
Fusion 5.9.1 and later | v2.0.0 and later | Supports LDAP ACLs integrations and security trimming. |
Fusion 5.9.0 | v1.6.0 and later. Lucidworks recommends using the latest supported connector version. | Fusion 5.9.0 supports the v2.0.0 connector, but does not support LDAP ACLs integrations or security trimming. |
Fusion 5.6.1 and later | v1.1.0 through v1.6.0 | - |
Business context
SharePoint is widely adopted for internal content collaboration, knowledge sharing, and structured document management. The SharePoint Optimized V2 connector supports use cases across knowledge management and business environments by making SharePoint content discoverable in Fusion.
Knowledge management
Organizations use SharePoint as a central platform for storing and organizing internal knowledge, such as policies, training materials, and procedural documentation. With structured libraries, metadata tagging, and version control, teams can maintain accurate, searchable content. The connector brings this content into Fusion to support enterprise search, self-service portals, and role-based access to knowledge resources.
B2B
In business-to-business contexts, SharePoint supports cross-functional collaboration, secure partner portals, and project documentation workflows. Companies use it to share content with clients, vendors, and internal stakeholders while maintaining strict access controls. The connector indexes this content for a unified search experience across departments or applications, enabling faster access to contracts, technical documents, or onboarding materials.
B2C
SharePoint is not typically used for direct customer-facing experiences. However, B2C organizations often use it internally to support customer service, compliance, or product support operations. With the connector, teams can expose curated SharePoint content such as FAQs, internal product specifications, or support documentation through public or authenticated search interfaces managed in Fusion.
How it works
The SharePoint Optimized V2 connector crawls and indexes structured content from SharePoint using its object model. SharePoint organizes data into a hierarchical structure that begins with a web application and includes site collections, sites, lists, folders, and list items. For an overview of this structure, see the SharePoint object model.
- A site collection can contain multiple subsites, each with its own permissions and content.
- A list can store structured data like announcements or documents, with fields such as title, status, and date.
- A list item or document may contain file content and metadata, such as the file type, size, author, and version.
Set up
To index SharePoint content with Fusion, you must configure both the SharePoint environment and the connector. SharePoint must be accessible and properly permissioned, and the connector must be configured to authenticate, crawl, and index the desired content.
Required SharePoint permissions
The SharePoint Optimized V2 connector requires specific permissions to index content across SharePoint sites effectively. These permissions must be granted before the connector can perform operations such as reading site data, accessing files, or collecting audit logs. The following table lists the required permissions, along with detailed descriptions and use cases for each. Full indexing functionality, including support for all site collections and audit data, depends on granting the appropriate elevated permissions.
Permission | Description | Use case |
---|---|---|
Sites.Read.All | Grants read-only access to all SharePoint site collections and their content. | Used by applications that need to enumerate sites, retrieve metadata, list items, or download files across all sites. |
Sites.Manage.All | Provides full control over all site collections, excluding permission management. | Enables applications to create, edit, and delete content such as list items and documents across all sites. |
Files.Read.All | Allows read-only access to all files stored in SharePoint and OneDrive document libraries. | Used to access and download documents, attachments, and other file content across the organization. |
AuditLog.Read.All | Enables access to SharePoint audit logs for monitoring user and system activities. | Used to analyze actions such as file edits, deletions, and permission changes for compliance and security auditing. |
Sites.Selected | Allows access only to specific site collections explicitly granted by an administrator. | Used to restrict application access to approved SharePoint sites, providing fine-grained control over data exposure. |
User.Read | Permits access to basic profile information of the signed-in user, including name and email. | Used to personalize the user experience or perform operations on behalf of the authenticated user. |
Directory.Read.All | Grants read access to the full directory of users, groups, and other directory objects. | Used to look up user and group information for features like permissions mapping, people pickers, or organizational insights. |
FAQ
Why are Full Control permissions required?
Full Control allows the connector to discover all site collections and content in SharePoint. Without it, the connector can only access content where it already has permission and may miss sites or documents.
Granting Full Control does not allow the SharePoint Optimized V2 connector to take destructive actions such as deleting or modifying content. The permission is used strictly for discovery and indexing.
For environments with data transfer security concerns, the SharePoint Optimized V2 connector can be deployed as a remote connector. This enables Fusion to index content stored behind firewalls without opening firewall ports or exposing internal systems.
Prepare SharePoint
Before you configure the SharePoint Optimized V2 connector in Fusion, you must prepare your SharePoint environment. This section explains how to select an authentication method, assign the required permissions, and ensure access to the SharePoint content you intend to index.
Configure the connector in Fusion
In Fusion, configure the SharePoint Optimized V2 connector to define the crawl scope, select an authentication method, and apply indexing settings. This setup enables Fusion to connect to SharePoint and index content securely and efficiently.
This section applies to the latest version of the SharePoint Optimized V2 connector. If you are using an earlier version, some settings may have different names, appear in different sections, or may not be available.
Core configurations
Use the following parameters to configure what the connector crawls, how it authenticates, and how it handles content updates. Parameters are grouped by category for clarity. For full configuration options, see Configuration specifications.
Scope
These settings define what SharePoint content the connector includes in the crawl. You can configure it to crawl all site collections or only specific sites, lists, folders, or items. To begin, specify the Web Application URL. This is the base URL of your SharePoint web application. All paths to site collections or items must be relative to this URL.
In earlier versions of the SharePoint Optimized V2 connector, you had to enable Fetch all site collections to crawl all site collections. This setting was removed in v2.0.0 and later. If you are using an older version and see this option in your configuration, enable it to ensure all site collections are crawled.
- In the Site Collection List, provide a single site collection path.
- In Restrict to specific SharePoint items, enter one or more SharePoint URLs. You can copy these URLs directly from your browser.
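As a rough illustration of the relative-path rule above, the following Python sketch converts a full URL copied from a browser into a path relative to the Web Application URL. The function name `relative_site_path` and the example URLs are hypothetical. Fusion does not provide this helper, and the exact path format the connector expects (for example, with or without a leading slash) should be checked against Configuration specifications.

```python
from urllib.parse import urlsplit


def relative_site_path(web_app_url: str, full_url: str) -> str:
    """Return the part of full_url that follows web_app_url.

    Hypothetical helper for illustration only; not part of Fusion.
    """
    base = urlsplit(web_app_url)
    target = urlsplit(full_url)
    # The item must live under the same scheme and host as the web application.
    if (base.scheme.lower(), base.netloc.lower()) != (target.scheme.lower(), target.netloc.lower()):
        raise ValueError(f"{full_url} is not under {web_app_url}")
    base_path = base.path.rstrip("/")
    # Require a path-segment boundary so /sites does not match /sites2.
    if target.path != base_path and not target.path.startswith(base_path + "/"):
        raise ValueError(f"{full_url} is not under {web_app_url}")
    # Strip surrounding slashes; this sketch does not decode percent-encoding.
    return target.path[len(base_path):].strip("/")


if __name__ == "__main__":
    # Example values are placeholders, not real endpoints.
    print(relative_site_path("https://sharepoint.example.com", "https://sharepoint.example.com/sites/hr/"))
```

Rejecting URLs outside the web application up front catches copy-and-paste mistakes before they reach the datasource configuration.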
Authentication
These settings define how the connector authenticates with SharePoint. For on-premises SharePoint, use NTLM authentication.
Crawl behavior
These settings control how the connector detects content changes and whether it performs incremental or full crawls. After the first successful full crawl, the connector uses incremental crawls by default. It relies on the SharePoint Changes API to detect and index added, updated, and deleted content. The connector tracks changes using a change token and removes deleted site collections. To support incremental crawling, required fields prefixed with `lw` must remain in the indexed documents.
To force a full crawl, enable the Force Full Crawl setting. This disables incremental crawling and reindexes all content from scratch. It also clears any previous crawl state. This option is useful when resetting the crawl due to major changes.
Incremental crawling requires Force Full Crawl to remain disabled.
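The change-token pattern described above can be illustrated with a small, self-contained Python simulation. It does not call SharePoint or Fusion: `Change`, `ChangeLog`, and `incremental_crawl` are hypothetical names, and the in-memory list stands in for the SharePoint Changes API. The sketch only shows the general idea that a crawl asks for changes recorded after a stored token, applies adds, updates, and deletes, and then saves the new token.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    token: int      # position in the change log (stand-in for a change token)
    kind: str       # "add", "update", or "delete"
    item_id: str


class ChangeLog:
    """Hypothetical in-memory stand-in for a changes feed."""

    def __init__(self) -> None:
        self._changes: list[Change] = []

    def record(self, kind: str, item_id: str) -> None:
        self._changes.append(Change(len(self._changes) + 1, kind, item_id))

    def since(self, token: int) -> list[Change]:
        # Return only the changes recorded after the given token.
        return [c for c in self._changes if c.token > token]


def incremental_crawl(log: ChangeLog, index: dict[str, int], token: int) -> int:
    """Apply changes newer than token to index and return the new token."""
    for change in log.since(token):
        if change.kind == "delete":
            index.pop(change.item_id, None)
        else:
            # Store the token of the last change that touched the item.
            index[change.item_id] = change.token
        token = change.token
    return token


if __name__ == "__main__":
    log, index = ChangeLog(), {}
    log.record("add", "doc-1")
    log.record("add", "doc-2")
    token = incremental_crawl(log, index, 0)      # first pass sees both adds
    log.record("delete", "doc-1")
    token = incremental_crawl(log, index, token)  # second pass sees only the delete
    print(sorted(index), token)
```

In this model, a forced full crawl corresponds to discarding the stored token and the index and starting again from token `0`.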
Use cases
The SharePoint Optimized V2 connector supports a range of use cases for securely indexing and searching SharePoint content. This section highlights common deployment scenarios and configurations that help organizations meet security, compliance, and infrastructure requirements.
Protect documents with security trimming
Security trimming ensures that users only see the content they are authorized to access when querying your SharePoint Optimized V2 datasource. It uses user roles and permissions to control document visibility. This feature requires the SharePoint Optimized V2 datasource to be used alongside an LDAP ACLs V2 datasource and a Graph Security Trimming query stage within the same app and collection. Benefits of using security trimming include:
- Ensuring users only see documents they are authorized to access, maintaining data confidentiality.
- Reducing irrelevant search results by filtering content based on user permissions.
- Enhancing compliance with organizational security policies through integrated role-based access control.
Configure security trimming for SharePoint Optimized V2
Index data behind firewalls
You can configure the SharePoint Optimized V2 Connector v2.0.0 and later to run remotely. This setup lets Fusion index content stored behind firewalls without exposing internal systems or opening firewall ports. It helps protect sensitive data, supports compliance, and enables unified search across cloud and on-premises sources. Fusion uses gRPC over HTTP/2 to connect on-premises remote connectors to Fusion clusters. For more information, see Remote V2 connectors.
Configure remote V2 connectors
If you need to index data from behind a firewall, you can configure a V2 connector to run remotely on-premises using TLS-enabled gRPC.
The gRPC connector backend is not supported in Fusion environments deployed on AWS.
Prerequisites
Before you can set up an on-prem V2 connector, you must configure the egress from your network to allow HTTP/2 communication into the Fusion cloud. You can use a forward proxy server to act as an intermediary between the connector and Fusion.
The following is required to run V2 connectors remotely:
- The plugin zip file and the connector-plugin-standalone JAR.
- A configured connector backend gRPC endpoint.
- Username and password of a user with a `remote-connectors` or `admin` role.
- If the host where the remote connector is running is not configured to trust the server’s TLS certificate, you must configure the file path of the trust certificate collection.
If your version of Fusion doesn’t have the `remote-connectors` role by default, you can create one. No API or UI permissions are required for the role.
Connector compatibility
Only V2 connectors are able to run remotely on-premises. You also need the remote connector client JAR file that matches your Fusion version. You can download the latest files at V2 Connectors Downloads.
Whenever you upgrade Fusion, you must also update your remote connectors to match the new version of Fusion.
System requirements
The following is required for the on-prem host of the remote connector:
- (Fusion 5.9.0-5.9.10) JVM version 11
- (Fusion 5.9.11) JVM version 17
- Minimum of 2 CPUs
- 4GB Memory
Enable backend ingress
In your `values.yaml` file, configure this section as needed:
- Set `enabled` to `true` to enable the backend ingress.
- Set `pathtype` to `Prefix` or `Exact`.
- Set `path` to the path where the backend will be available.
- Set `host` to the host where the backend will be available.
- In Fusion 5.9.6 only, you can set `ingressClassName` to one of the following:
  - `nginx` for Nginx Ingress Controller
  - `alb` for AWS Application Load Balancer (ALB)
- Configure TLS and certificates according to your CA’s procedures and policies.
TLS must be enabled in order to use AWS ALB for ingress.
Connector configuration example
Minimal example
Logback XML configuration file example
Run the remote connector
The `logging.config` property is optional. If not set, logging messages are sent to the console.
Test communication
You can run the connector in communication testing mode. This mode tests the communication with the backend without running the plugin, reports the result, and exits.
Encryption
In a deployment, communication to the connector’s backend server is encrypted using TLS. You should only run this configuration without TLS in a testing scenario. To disable TLS, set `plain-text` to `true`.
Egress and proxy server configuration
One of the methods you can use to allow outbound communication from behind a firewall is a proxy server. You can configure a proxy server to allow certain communication traffic while blocking unauthorized communication. If you use a proxy server at the site where the connector is running, you must configure the following properties:
- Host. The host where the proxy server is running.
- Port. The port the proxy server is listening to for communication requests.
- Credentials. Optional proxy server user and password.
Password encryption
If you use a login name and password in your configuration, run the following utility to encrypt the password:
- Enter a user name and password in the connector configuration YAML.
- Run the standalone JAR with this property:
- Retrieve the encrypted passwords from the log that is created.
- Replace the clear password in the configuration YAML with the encrypted password.
Connector restart (5.7 and earlier)
The connector will shut down automatically whenever the connection to the server is disrupted, to prevent it from getting into a bad state. Communication disruption can happen, for example, when the server running in the `connectors-backend` pod shuts down and is replaced by a new pod. Once the connector shuts down, connector configuration and job execution are disabled. To restore them, restart the connector as soon as possible.
You can use Linux scripts and utilities to restart the connector automatically, such as Monit.
Recoverable bridge (5.8 and later)
If communication to the remote connector is disrupted, the connector will try to recover communication and gRPC calls. By default, six attempts will be made to recover each gRPC call. The number of attempts can be configured with the `max-grpc-retries` bridge parameter.
Job expiration duration (5.9.5 only)
The timeout value for unresponsive backend jobs can be configured with the `job-expiration-duration-seconds` parameter. The default value is `120` seconds.
Use the remote connector
Once the connector is running, it is available in the Datasources dropdown. If the standalone connector terminates, it disappears from the list of available connectors. Once it is re-run, it is available again, and configured connector instances are not lost.
Enable asynchronous parsing (5.9 and later)
To separate document crawling from document parsing, enable Tika Asynchronous Parsing on remote V2 connectors.
API operations
This section provides a simple example of how to use the Connectors API to list available connector plugins, demonstrating how to interact with the API to discover which datasources are supported. For more detailed examples, including full request and response payloads and the configuration specification used with the SharePoint Optimized V2 connector, see the Connector APIs documentation.
Get all available connectors
Request
Troubleshooting
This section describes known limitations and configuration requirements for the SharePoint Optimized V2 Connector. Each issue includes the observed behavior, the expected behavior, and the impact to users.
Connector runs on multiple pods
The SharePoint Optimized V2 Connector does not support running on more than one pod. If multiple instances run at the same time, they may try to index the same content, which can cause duplication, crawl errors, or inconsistent results. The connector is designed to run as a single instance. To ensure reliable indexing, deploy the connector on only one pod.
Connector version compatibility
If you use the SharePoint Optimized V2 Connector with an ACL connector, make sure the versions are compatible. Incompatible versions can prevent document-level access controls from being applied correctly. This can result in users seeing content they shouldn’t or missing content they should be able to access. To avoid access issues, use only supported combinations of the SharePoint and ACL connectors. Check version compatibility in Prerequisites. For details about crawls and incremental crawls, see Crawl using the SharePoint Optimized V2 connector.
Avoid throttling and rate limiting in SharePoint Online
SharePoint Online enforces rate limits to protect its APIs. When the connector sends too many requests in parallel, SharePoint may respond with `429 Too Many Requests` or `503 Server Too Busy` errors, which tell the connector to slow down. These errors indicate that the service is temporarily rejecting traffic due to overload.
To avoid these errors, reduce the number of concurrent requests. In the connector configuration, go to Core Properties > Fetch Settings and lower the Fetch Threads value. Also consider reducing the number of connector jobs running at the same time.
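For readers who script their own calls against a throttled service, the following Python sketch shows one common way to handle `429` and `503` responses: honor a `Retry-After` header when it is present, and otherwise back off exponentially. This is a generic illustration, not the connector's internal retry logic. `retry_delay`, `call_with_backoff`, and their parameters are hypothetical names.

```python
import time
from typing import Callable, Mapping, Tuple

THROTTLE_STATUSES = {429, 503}


def retry_delay(attempt: int, headers: Mapping[str, str], base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    # This sketch only handles Retry-After given as whole seconds,
    # and looks the header up with this exact capitalization.
    retry_after = headers.get("Retry-After")
    if retry_after is not None and retry_after.isdigit():
        # The server stated how long to wait; respect it, up to the cap.
        return min(float(retry_after), cap)
    # Otherwise double the wait on each attempt: base, 2*base, 4*base, ...
    return min(base * (2 ** attempt), cap)


def call_with_backoff(
    send: Callable[[], Tuple[int, Mapping[str, str]]],
    max_retries: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() until it returns a non-throttled status or retries run out."""
    for attempt in range(max_retries + 1):
        status, headers = send()
        if status not in THROTTLE_STATUSES:
            return status
        if attempt < max_retries:
            sleep(retry_delay(attempt, headers))
    # Still throttled after the final attempt; return the last status seen.
    return status
```

Passing `sleep` as a parameter keeps the sketch testable without real delays, and the cap keeps a single wait from growing without bound.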


Retries help with occasional limits, but persistent `429` or `503` errors mean you’re sending too much traffic. Reduce request volume first. Only use retries to improve resilience, not to bypass throttling.
More resources
For more information on how to plan, install, and configure the SharePoint Optimized V2 connector:
- Overview of SharePoint and SharePoint Online connectors. Learn about the available SharePoint connectors and how they compare.
- Download Connectors. Get the latest version of the SharePoint Optimized V2 connector package.
- Install or update a connector. Follow step-by-step instructions to install a connector into Fusion.
- Crawl using the SharePoint Optimized V2 connector. Configure and run crawls using scoped collections, inclusion filters, and other crawl settings.
Install or update a connector
When you create a new datasource that requires an uninstalled connector, Fusion releases 5.2 and later automatically download and install the connector using the Datasources dropdown. You can also update the connector using the Blob Store UI or via the Connector API.
Install a connector using the Datasources dropdown
- In your Fusion app, navigate to Indexing > Datasources.
- Click Add.
- In the list of connectors, scroll down to the connectors marked Not Installed and select the one you want to install. Fusion automatically downloads it and moves it to the list of installed connectors.
You can view and download all current and previous V2 connector releases at Download Connectors.
Install or update a connector using the Blob Store UI
- Download the connector zip file from Download V2 connectors.
Do not expand the archive; Fusion consumes it as-is.
- In your Fusion app, navigate to System > Blobs.
- Click Add.
- Select Connector Plugin.
The “New Connector Plugin Upload” panel appears.
- Click Choose File and select the downloaded zip file from your file system.
- Click Upload.
The new connector’s blob manifest appears.
From this screen you can also delete or replace the connector.
Wait several minutes for the connector to finish uploading to the blob store before installing the connector using the Datasources dropdown.
Install or update a connector using the Connector API
- Download the connector zip file from Download V2 connectors.
Do not expand the archive; Fusion consumes it as-is.
- Upload the connector zip file to Fusion’s plugins. Specify a `pluginId` as in this example:
Fusion automatically publishes the event to the cluster, and the listeners perform the connector installation process on each node.
If the `pluginId` is identical to an existing one, the old connector will be uninstalled and the new connector will be installed in its place. To get the list of existing plugin IDs, run:
`curl -u USERNAME:PASSWORD https://FUSION_HOST:FUSION_PORT/api/connectors/plugins`
- Look in `https://FUSION_HOST:FUSION_PORT/apps/connectors/plugins/` to verify the new connector is installed.
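The plugin listing call above can also be scripted. The following Python sketch builds the same request as the curl command using only the standard library. The endpoint path comes from the curl example; the host, port, and credentials are placeholders, and the shape of the JSON response is not described here, so the sketch returns the decoded body without assuming any fields. `build_plugins_request` and `list_plugins` are hypothetical helper names.

```python
import base64
import json
import urllib.request


def build_plugins_request(host: str, port: int, username: str, password: str) -> urllib.request.Request:
    """Build a GET request for /api/connectors/plugins with HTTP basic auth."""
    url = f"https://{host}:{port}/api/connectors/plugins"
    credentials = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", f"Basic {credentials}")
    request.add_header("Accept", "application/json")
    return request


def list_plugins(host: str, port: int, username: str, password: str, timeout: float = 30.0):
    """Send the request and return the decoded JSON body as-is."""
    request = build_plugins_request(host, port, username, password)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Separating request construction from sending makes it possible to inspect the URL and headers before any network call is made.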
Reinstall a connector
To reinstall a connector for any reason, first delete the connector, then use the preceding steps to install it again. This may take a few minutes to complete depending on how quickly the pods are deleted and recreated.
Crawl using the SharePoint Optimized V2 connector
LucidAcademy
Lucidworks offers free training to help you get started.
The Microlearning for Connectors 101 focuses on how connectors work to get data into Fusion.
Visit the LucidAcademy to see the full training catalog.
Configuration
To change the number of items to retrieve per page, set the value of `apiQueryRowLimit`. The default value is `5000`.
To change the number of change events to retrieve per page, set the value of `changeApiQueryRowLimit`. The default value is `2000`.
When entering configuration values in the UI, use unescaped characters, such as `\t` for the tab character. When entering configuration values in the API, use escaped characters, such as `\\t` for the tab character.
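One way to read the rule above is that API payloads are JSON, and JSON escapes the backslash itself. The following Python sketch serializes the two-character text `\t`, as it would be typed in the UI, and shows that it appears as `\\t` in the JSON body. The field name `delimiter` is a hypothetical placeholder, not a documented connector property.

```python
import json

# The value as typed in the UI: a backslash followed by the letter t (two characters).
ui_value = r"\t"

# Hypothetical field name, used only for illustration.
payload = json.dumps({"delimiter": ui_value})

# The backslash is doubled in the serialized JSON text.
print(payload)

# Decoding the JSON restores the original two-character value.
print(json.loads(payload)["delimiter"] == ui_value)
```

Building the payload with a JSON library, rather than by hand, applies this escaping automatically.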