$FUSION_HOME/var/log/connectors/connectors-classic/connectors-classic.log
and $FUSION_HOME/var/log/connectors/connectors-classic/sharepoint-exporter-DSID.log
where DSID is the SharePoint optimized datasource ID.

View the SharePoint export database file
lw fields in place. These fields are required for successful incremental crawling.

429. Too many requests
This is by far the most common rate-limiting error you will see in the logs. It is SharePoint Online’s main mechanism for protecting itself from service interruptions caused by denial-of-service (DoS) attacks.

503. Server too busy
This error is less common, but the result is the same: the request is throttled.

Avoid SharePoint throttling
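
As an illustration of how a client can react to 429 and 503 responses, the sketch below waits before retrying. It uses the server's `Retry-After` header when that header holds a number of seconds, and otherwise falls back to exponential backoff. This is a minimal sketch and not the connector's actual implementation. The names `retry_delay`, `call_with_backoff`, and `send` are hypothetical, and `send` is assumed to return a `(status, headers, body)` tuple whose headers use the exact key `Retry-After`.

```python
import time
from typing import Callable, Mapping, Tuple

# Status codes treated as throttling responses in this sketch.
THROTTLE_STATUSES = {429, 503}

Response = Tuple[int, Mapping[str, str], str]


def retry_delay(headers: Mapping[str, str], attempt: int,
                base: float = 1.0, cap: float = 120.0) -> float:
    """Return the number of seconds to wait before the next attempt."""
    value = headers.get("Retry-After")
    if value is not None:
        try:
            # Honor the server's hint when it is given in seconds.
            return max(0.0, float(value))
        except ValueError:
            # The HTTP-date form of Retry-After is not handled here.
            pass
    # Fallback: exponential backoff (base, 2*base, 4*base, ...) up to cap.
    return min(base * (2 ** attempt), cap)


def call_with_backoff(send: Callable[[], Response],
                      max_attempts: int = 5,
                      sleep: Callable[[float], None] = time.sleep) -> Tuple[int, str]:
    """Call send() until it is not throttled or attempts run out."""
    status, body = 0, ""
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status not in THROTTLE_STATUSES:
            return status, body
        if attempt < max_attempts - 1:
            sleep(retry_delay(headers, attempt))
    # Still throttled after the last attempt: return that response.
    return status, body
```

Passing `sleep` as a parameter keeps the retry logic testable without real delays. In practice, `send` would wrap whatever HTTP call the client makes.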
Follow the principle of least privilege: users should have only the permission levels or individual permissions they need to perform their assigned tasks.
Account type | Account config | Description |
---|---|---|
Active Directory Service Account | Account is set up as a Site Collection Auditor | Allows you to list all site collections. |
Active Directory Service Account | Account is set up with limited permissions | Does not allow you to list site collections in your SharePoint web application. You must list each site collection you want to crawl manually. Additionally, noindex tags are ignored. Sites will always be indexed regardless of their noindex settings. |
Configure a SharePoint V2 Datasource
Configure a SharePoint V1 Optimized Datasource
Account type | Account config | Description |
---|---|---|
Full Admin | Azure App Only | Allows you to list all site collections in tenant. |
Full Admin | OAuth App Only | Does not allow you to list site collections in your SharePoint web application. You must list each site collection you want to crawl manually. |
ADFS Account | Account is set up as a Site Collection Auditor | Allows you to list all site collections if the user is a tenant administrator. |
ADFS Account | Account is set up with limited permissions | Does not allow you to list site collections in your SharePoint web application. You must list each site collection you want to crawl manually. Use this option if your deployment requires the Lucidworks crawl account to have the fewest privileges possible. |