Improve SharePoint Performance

The SharePoint connector retrieves content and metadata from an on-premises SharePoint repository.

When a user performs a search query with SharePoint security trimming enabled, the security trimming process starts by fetching the user's groups. SharePoint document ACLs reference two types of groups:

  • SharePoint groups

  • Active Directory LDAP security groups

Fusion creates a security filter from the user’s loginName and the groups that the user belongs to. The security filter is a Solr fq (filter query). Once the security filter is created, the user’s queries return only the documents that the user is permitted to see.
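
As a rough illustration of the idea, the sketch below assembles an fq string from a login name and a list of groups. The field name acl_ss and the helper names are hypothetical and chosen only for this example; the filter that Fusion actually generates may look different.

```python
def escape_term(value: str) -> str:
    """Quote a value as a Solr phrase, escaping backslashes and double quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_security_filter(login_name: str, groups: list[str],
                          acl_field: str = "acl_ss") -> str:
    """Build an fq that matches documents whose ACL field lists the user
    or any of the user's groups. acl_field is a hypothetical field name."""
    principals = [login_name] + sorted(set(groups))
    clause = " OR ".join(escape_term(p) for p in principals)
    return f"{acl_field}:({clause})"


if __name__ == "__main__":
    print(build_security_filter("CONTOSO\\jdoe", ["Site Members", "Domain Users"]))
```

Every group lookup adds principals to this filter, which is why the cost of fetching groups shows up directly in query time.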

By default, Fusion looks up a user’s SharePoint groups and LDAP groups every time a search query is performed. But fetching groups for a user is expensive and can hurt query time.

Note, too, that each SharePoint site collection that is part of a datasource has its own unique SharePoint groups. This means that if Fusion has crawled multiple SharePoint site collections, it must look up a user’s groups from each site collection. This is done in parallel for speed, but it can still dominate query time if there are many site collections to query.

Another consideration is unplanned SharePoint or LDAP downtime. If Fusion performs real-time group lookups while one of these services is down, the security filter cannot be built, and the user’s query results are missing documents.
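
The fan-out described above can be sketched as follows. This is not Fusion's implementation: fetch_groups is a hypothetical stand-in for a remote SharePoint call, and the URLs and group names are made up. It only shows why total latency tracks the slowest site collection even when the lookups run in parallel.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_groups(site_collection: str, login_name: str) -> set[str]:
    """Hypothetical stand-in for a remote group lookup against one site collection."""
    fake_directory = {
        "https://sp.example.com/sites/hr": {"HR Members"},
        "https://sp.example.com/sites/eng": {"Eng Owners", "Eng Members"},
    }
    return fake_directory.get(site_collection, set())


def lookup_all_groups(site_collections: list[str], login_name: str) -> set[str]:
    """Query every site collection concurrently and merge the results."""
    if not site_collections:
        return set()
    workers = min(8, len(site_collections))  # arbitrary cap for this sketch
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda sc: fetch_groups(sc, login_name), site_collections)
        return set().union(*results)
```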

To help alleviate these issues, Fusion offers a few different caching options. Consider using these caching options if you have many site collections, need extremely fast search, or cannot tolerate SharePoint or LDAP outages.

Security Filter Cache

With a security filter cache, once the query filter for a user has been generated, Fusion reuses the filter for this user for subsequent queries. The cache_expiration_time parameter dictates how long Fusion reuses the filter until generating it again. The cache_max_size parameter dictates the maximum number of items to hold in the security filter cache.
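
To make the two parameters concrete, here is a minimal sketch of a cache governed by an expiration time and a maximum size. The class name, the use of seconds as the time unit, and the choice to evict the oldest entry when the cache is full are all assumptions made for this example; the source does not describe how Fusion evicts entries.

```python
import time
from collections import OrderedDict
from typing import Callable, Optional


class SecurityFilterCache:
    """Illustrative cache keyed by login name (not Fusion's implementation)."""

    def __init__(self, cache_expiration_time: float, cache_max_size: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = cache_expiration_time      # assumed to be in seconds
        self._max_size = cache_max_size
        self._clock = clock
        self._entries: "OrderedDict[str, tuple[float, str]]" = OrderedDict()

    def get(self, login_name: str) -> Optional[str]:
        """Return the cached filter, or None if it is absent or expired."""
        entry = self._entries.get(login_name)
        if entry is None:
            return None
        stored_at, security_filter = entry
        if self._clock() - stored_at >= self._ttl:
            del self._entries[login_name]      # expired: force regeneration
            return None
        return security_filter

    def put(self, login_name: str, security_filter: str) -> None:
        """Store a filter, evicting the oldest entries beyond the size limit."""
        self._entries.pop(login_name, None)
        self._entries[login_name] = (self._clock(), security_filter)
        while len(self._entries) > self._max_size:
            self._entries.popitem(last=False)  # assumption: drop oldest entry
```

A longer expiration time means fewer group lookups but a longer window in which a changed group membership is not reflected in search results.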

There are two flavors of security filter caches.

  • Local - The security filter is cached for this datasource only. Other SharePoint datasources in your SharePoint security trimming query pipeline do not have access to it.

  • Global - The security filter is shared across multiple SharePoint datasources. If groups have already been looked up for LDAP or for a SharePoint site collection by another datasource, they do not have to be looked up again.

To enable security filter caches:
  1. In the Fusion UI, navigate to SharePoint datasource configuration.

  2. Check the boxes labeled Enable local security filter cache and Enable global security filter cache.

User Group Cache

To make queries significantly faster and to prevent security trimming from failing when SharePoint or LDAP is down, you can enable user group caching.

The biggest bottleneck in security trimming for the SharePoint connector is looking up each user’s groups. Caching user groups means that when a security trimming query is performed, a single query to a Solr collection looks up the user’s LDAP and SharePoint groups, instead of going to the LDAP and SharePoint services to get them.
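
The single lookup might resemble the request built below. This is only a sketch: the field names login_name_s and groups_ss and the helper name are hypothetical, and the schema that Fusion uses for the user group collection is not documented here. The q, fl, rows, and wt parameters are standard Solr query parameters.

```python
from urllib.parse import urlencode


def group_cache_query_url(solr_base: str, collection: str, login_name: str) -> str:
    """Build a Solr select URL that fetches one user's cached groups.

    login_name_s and groups_ss are hypothetical field names used for illustration.
    """
    escaped = login_name.replace("\\", "\\\\").replace('"', '\\"')
    params = {
        "q": f'login_name_s:"{escaped}"',  # match the user's cache document
        "fl": "groups_ss",                 # return only the group list
        "rows": 1,
        "wt": "json",
    }
    base = solr_base.rstrip("/")
    return f"{base}/{collection}/select?{urlencode(params)}"


if __name__ == "__main__":
    print(group_cache_query_url("http://localhost:8983/solr", "sp_usr_grp", "jdoe"))
```

One request of this kind replaces a round trip to LDAP plus one round trip per SharePoint site collection.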

Every time Fusion crawls the SharePoint datasource, it updates the user group cache.

There are some costs of user group caching:

  • Increased indexing time - Fusion needs to build a user group cache while indexing.

  • Stale user groups - If Fusion does not recrawl the SharePoint datasource often enough, the user group cache can get out of date. The more often Fusion recrawls, the closer the cache is to a real-time user group lookup.

To enable user group caching:
  1. In the Fusion UI, navigate to SharePoint datasource configuration.

  2. Check the box labeled Enable User Group Caching in Solr. At crawl time, each user’s LDAP and SharePoint groups are fetched and stored in a Solr collection.

  3. (Optional) Set User Group Cache Solr Collection Name to the same name (for example, sp_usr_grp) for all of your SharePoint datasources. The default is sp_usr_grp_<datasource>, where <datasource> is the ID of your datasource. Because several SharePoint datasources can share the same Solr collection, this step prevents a separate collection from being created for each datasource.
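
The naming rule in step 3 can be expressed as a small sketch. The function name and the example datasource IDs are hypothetical; only the sp_usr_grp_<datasource> default and the sp_usr_grp example come from the text above.

```python
from typing import Optional


def user_group_cache_collection(datasource_id: str,
                                configured_name: Optional[str] = None) -> str:
    """Return the configured collection name, or the per-datasource default."""
    return configured_name or f"sp_usr_grp_{datasource_id}"


if __name__ == "__main__":
    # Defaults: one collection per datasource.
    print(user_group_cache_collection("hr_sites"))
    print(user_group_cache_collection("eng_sites"))
    # Shared name: both datasources use the same collection.
    print(user_group_cache_collection("hr_sites", "sp_usr_grp"))
    print(user_group_cache_collection("eng_sites", "sp_usr_grp"))
```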