Fusion 5.12

    Configure a SharePoint V2 Datasource

    1. Decide what to crawl

    Determine what to crawl and select one of the following:

    How to crawl an entire SharePoint Web application

    1. Verify the Limit Documents > Fetch all site collections option is selected (default).

    2. Specify the Web application URL as a site.

      For example: https://lucidworks.sharepoint.local/

    Administrative access to SharePoint is required to crawl an entire SharePoint Web application.

    How to crawl a subset of SharePoint site collections

    1. Uncheck the Limit Documents > Fetch all site collections option.

    2. Specify a "Start Link" for each site collection to crawl.

      Examples include:

      • https://lucidworks.sharepoint.local/sites/site1

      • https://lucidworks.sharepoint.local/sites/site2

      • https://lucidworks.sharepoint.local/sites/site3

    How to crawl a specific sub-site, list, or list item

    1. Uncheck the Limit Documents > Fetch all site collections option.

    2. Specify a "Start Link" for each site collection that contains the item to fetch.

    3. Specify a non-wildcard Inclusive Regular Expression for each parent.

      For example, to crawl https://lucidworks.sharepoint.local/sites/mysitecol/myparentsite/somesite, you must add an inclusive regex for every parent, plus one that matches everything under the target:

      https\:\/\/lucidworks\.sharepoint\.local\/sites\/mysitecol
      https\:\/\/lucidworks\.sharepoint\.local\/sites\/mysitecol\/myparentsite
      https\:\/\/lucidworks\.sharepoint\.local\/sites\/mysitecol\/myparentsite\/somesite
      https\:\/\/lucidworks\.sharepoint\.local\/sites\/mysitecol\/myparentsite\/somesite\/.*

      If you exclude a parent of the site, the connector does not crawl the site, because it never spiders down to it during the crawl.
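
The parent patterns can also be generated instead of typed by hand. The following is a minimal PowerShell sketch, not part of Fusion: the function name Get-ParentInclusiveRegex and its -StartDepth parameter are hypothetical, and the default assumes the site collection sits at the second path segment (/sites/<name>). It uses [regex]::Escape, which leaves ":" and "/" unescaped; those patterns match the same URLs as the backslash-escaped forms.

```powershell
# Hypothetical helper: build one inclusive regex per parent of a target URL,
# plus a final pattern that matches everything below the target.
function Get-ParentInclusiveRegex {
    param(
        [Parameter(Mandatory = $true)] [string] $TargetUrl,
        # Assumption: the site collection is the 2nd path segment (/sites/<name>).
        [int] $StartDepth = 2
    )
    $uri      = [System.Uri] $TargetUrl
    $root     = $uri.GetLeftPart([System.UriPartial]::Authority)   # scheme + host
    $segments = $uri.AbsolutePath.Trim('/').Split('/')
    $patterns = @()
    for ($i = $StartDepth; $i -le $segments.Count; $i++) {
        $prefix    = $root + '/' + ($segments[0..($i - 1)] -join '/')
        $patterns += [regex]::Escape($prefix)    # escapes "." but not ":" or "/"
    }
    $patterns += $patterns[-1] + '/.*'           # descendants of the target
    return $patterns
}

Get-ParentInclusiveRegex -TargetUrl 'https://lucidworks.sharepoint.local/sites/mysitecol/myparentsite/somesite'
# https://lucidworks\.sharepoint\.local/sites/mysitecol
# https://lucidworks\.sharepoint\.local/sites/mysitecol/myparentsite
# https://lucidworks\.sharepoint\.local/sites/mysitecol/myparentsite/somesite
# https://lucidworks\.sharepoint\.local/sites/mysitecol/myparentsite/somesite/.*
```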

    2. Create permission and user policy for the crawl

    The options are:

    • Set up an on-prem crawl account with only as much permission as it needs.

      This approach has the security advantage of providing minimal access to Fusion. However, the crawl account cannot retrieve the list of site collections behind a Web application URL.

    • Set up an online crawl account with only as much permission as it needs.

      This approach has the security advantage of providing minimal access to Fusion. However, the crawl account cannot retrieve the list of site collections behind a Web application URL.

    • Provide administrative access to crawl

    How to set up an on-prem crawl account

    Create a permission policy level

    1. Navigate to Central Administration > Manage web application > Permission Policy.

    2. Select Add permission policy level. In this example, the permission level is named fusion_crawl_policy.

    3. If you need to list all site collections in a SharePoint web application, select the Site Collection Auditor option.

    4. Grant the following permissions:

      • View Items - View items in lists and documents in document libraries.

      • Open Items - View the source of documents with server-side file handlers.

      • View Versions - View past versions of a list item or document.

      • View Application Pages - View forms, views, and application pages. Enumerate lists.

      Site Permissions

      • Browse Directories - Enumerate files and folders in a Web site using SharePoint Designer and Web DAV interfaces.

      • View Pages - View pages in a Web site.

      • Enumerate Permissions - Enumerate permissions on the Web site, list, folder, document, or list item.

      • Browse User Information - View information about users of the Web site.

      • Use Remote Interfaces - Use SOAP, Web DAV, the Client Object Model or SharePoint Designer interfaces to access the Web site.

      • Open - Allows users to open a Web site, list, or folder in order to access items inside that container.
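
When comparing an existing permission level against this list from a script, it can help to have the permissions as data. The sketch below maps each display name above to the corresponding member name of the Microsoft.SharePoint.SPBasePermissions enumeration. The variable $crawlPermissions and the function Get-MissingCrawlPermission are hypothetical names introduced for illustration; the sketch only compares strings and does not call SharePoint.

```powershell
# Display names from the list above, mapped to SPBasePermissions member names.
$crawlPermissions = [ordered]@{
    'View Items'              = 'ViewListItems'
    'Open Items'              = 'OpenItems'
    'View Versions'           = 'ViewVersions'
    'View Application Pages'  = 'ViewFormPages'
    'Browse Directories'      = 'BrowseDirectories'
    'View Pages'              = 'ViewPages'
    'Enumerate Permissions'   = 'EnumeratePermissions'
    'Browse User Information' = 'BrowseUserInfo'
    'Use Remote Interfaces'   = 'UseRemoteAPIs'
    'Open'                    = 'Open'
}

# Hypothetical helper: given the SPBasePermissions names already granted,
# return the display names of the required permissions that are missing.
function Get-MissingCrawlPermission {
    param([string[]] $Granted = @())
    foreach ($entry in $crawlPermissions.GetEnumerator()) {
        if ($Granted -notcontains $entry.Value) { $entry.Key }
    }
}

Get-MissingCrawlPermission -Granted 'ViewListItems', 'OpenItems', 'ViewVersions', 'ViewFormPages', 'ViewPages', 'Open'
# Browse Directories
# Enumerate Permissions
# Browse User Information
# Use Remote Interfaces
```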

    Grant user permission to the user policy

    1. Navigate to Central Administration > Manage web application > User Policy > Add Users.

    2. Create a new user with the new fusion_crawl_policy permission level selected.

    How to set up an online crawl account

    Create a permission policy level

    1. Navigate to Site settings > Site permissions > Advanced Permission Settings.

    2. Select New permission level. In this example, the permission level is named fusion_crawl_policy.

    3. Grant the following permissions:

      • View Items - View items in lists and documents in document libraries.

      • Open Items - View the source of documents with server-side file handlers.

      • View Versions - View past versions of a list item or document.

      • View Application Pages - View forms, views, and application pages. Enumerate lists.

      Site Permissions

      • Browse Directories - Enumerate files and folders in a Web site using SharePoint Designer and Web DAV interfaces.

      • View Pages - View pages in a Web site.

      • Enumerate Permissions - Enumerate permissions on the Web site, list, folder, document, or list item.

      • Browse User Information - View information about users of the Web site.

      • Use Remote Interfaces - Use SOAP, Web DAV, the Client Object Model or SharePoint Designer interfaces to access the Web site.

      • Open - Allows users to open a Web site, list, or folder in order to access items inside that container.

    Grant user permission

    1. Navigate to Site settings > Site permissions > Advanced Permission Settings.

    2. Select Grant permissions.

    3. Enter the new user name and add the user.

    4. Select a value in the Select a permission level field.

    5. Select Share.

    6. In the Edit Permissions > Choose Permissions section, select the following check boxes:

      • Read. Can view pages and list items and download documents.

      • LW Fusion.

    7. Select OK to save the information.

    If you grant the service account the Site Collection Auditor permission, the Lucidworks Fusion SharePoint connector has write-level permission and can list:

    • Sites in Site Collections

    • SharePoint Site Collection site metadata

    3. Test user permissions

    The following PowerShell script verifies permissions on the user account created to crawl SharePoint from Fusion.

    The script must be run by the user account to which the permissions were granted:

    • If the rights were granted to your account, run the script yourself to verify that they are set correctly.

    • If the rights were granted to a different user account, the owner of that account must run the script.

    1. Save the script with the following file name: test-sharepoint-permissions.ps1.

    2. Set the $site_col_url variable at the top of the script to the first of the site collection URLs to crawl.

    3. Save the changes.

    Permission verification script

    $site_col_url="https://your.sharepoint.local/sites/mysitecollection"
    
    $cred = (Get-Credential)
    
    if (-not ([System.Management.Automation.PSTypeName]'ServerCertificateValidationCallback').Type)
    {
    $certCallback = @"
        using System;
        using System.Net;
        using System.Net.Security;
        using System.Security.Cryptography.X509Certificates;
        public class ServerCertificateValidationCallback
        {
            public static void Ignore()
            {
                if(ServicePointManager.ServerCertificateValidationCallback ==null)
                {
                    ServicePointManager.ServerCertificateValidationCallback +=
                        delegate
                        (
                            Object obj,
                            X509Certificate certificate,
                            X509Chain chain,
                            SslPolicyErrors errors
                        )
                        {
                            return true;
                        };
                }
            }
        }
    "@
        Add-Type $certCallback
    }
    
    [System.Net.ServicePointManager]::SecurityProtocol = [System.Net.SecurityProtocolType]::Tls12;
    [ServerCertificateValidationCallback]::Ignore()
    
    $headers = New-Object "System.Collections.Generic.Dictionary[[String],[String]]"
    $headers.Add("Content-Type", "text/xml")
    $headers.Add("SOAPAction", "http://schemas.microsoft.com/sharepoint/soap/GetUpdatedFormDigestInformation")
    $headers.Add("X-RequestForceAuthentication", "true")
    $headers.Add("X-FORMS_BASED_AUTH_ACCEPTED", "f")
    
    $body = "<?xml version=`"1.0`" encoding=`"utf-8`"?>`n<soap:Envelope xmlns:xsi=`"http://www.w3.org/2001/XMLSchema-instance`" xmlns:xsd=`"http://www.w3.org/2001/XMLSchema`" xmlns:soap=`"http://schemas.xmlsoap.org/soap/envelope/`">`n  <soap:Body>`n    <GetUpdatedFormDigestInformation xmlns=`"http://schemas.microsoft.com/sharepoint/soap/`" />`n  </soap:Body>`n</soap:Envelope>"
    
    $response = Invoke-RestMethod "${site_col_url}/_vti_bin/sites.asmx" -Method 'POST' -Headers $headers -Body $body -Credential $cred
    
    $digest_value = $response.Envelope.Body.GetUpdatedFormDigestInformationResponse.FirstChild.DigestValue
    
    
    $headers = New-Object "System.Collections.Generic.Dictionary[[String],[String]]"
    $headers.Add("Content-Type", "text/xml")
    $headers.Add("X-RequestForceAuthentication", "true")
    $headers.Add("X-RequestDigest", $digest_value)
    $headers.Add("Accept", "application/json")
    $headers.Add("X-FORMS_BASED_AUTH_ACCEPTED", "f")
    
    $body = @'
    <Request AddExpandoFieldTypeSuffix="true" SchemaVersion="14.0.0.0" LibraryVersion="16.0.0.0"
             ApplicationName=".NET Library" xmlns="http://schemas.microsoft.com/sharepoint/clientquery/2009">
        <Actions>
            <ObjectPath Id="2" ObjectPathId="1"/>
            <ObjectPath Id="4" ObjectPathId="3"/>
            <Query Id="5" ObjectPathId="3">
                <Query SelectAllProperties="false">
                    <Properties>
                        <Property Name="Webs" SelectAll="true">
                            <Query SelectAllProperties="false">
                                <Properties/>
                            </Query>
                        </Property>
                        <Property Name="Title" ScalarProperty="true"/>
                        <Property Name="ServerRelativeUrl" ScalarProperty="true"/>
                        <Property Name="RoleDefinitions" SelectAll="true">
                            <Query SelectAllProperties="false">
                                <Properties/>
                            </Query>
                        </Property>
                        <Property Name="RoleAssignments" SelectAll="true">
                            <Query SelectAllProperties="false">
                                <Properties/>
                            </Query>
                        </Property>
                        <Property Name="HasUniqueRoleAssignments" ScalarProperty="true"/>
                        <Property Name="Description" ScalarProperty="true"/>
                        <Property Name="Id" ScalarProperty="true"/>
                        <Property Name="LastItemModifiedDate" ScalarProperty="true"/>
                    </Properties>
                </Query>
            </Query>
        </Actions>
        <ObjectPaths>
            <StaticProperty Id="1" TypeId="{3747adcd-a3c3-41b9-bfab-4a64dd2f1e0a}" Name="Current"/>
            <Property Id="3" ParentId="1" Name="Web"/>
        </ObjectPaths>
    </Request>
    '@
    
    $response = Invoke-RestMethod "${site_col_url}/_vti_bin/client.svc/ProcessQuery" -Method 'POST' -Headers $headers -Body $body -Credential $cred
    $response | ConvertTo-Json -Depth 100

    Successful query response

    If the test script executes successfully, metadata is returned. The following is a sample of a successful response:

    test-sharepoint-permissions.ps1
    cmdlet Get-Credential at command pipeline position 1
    Supply values for the following parameters:
    [
        {
            "SchemaVersion":  "14.0.0.0",
            "LibraryVersion":  "16.0.10337.12109",
            "ErrorInfo":  null,
            "TraceCorrelationId":  "c419a69f-1c06-b07f-b69b-4d7720fd7756"
        },
        2,
        {
            "IsNull":  false
        },
        4,
        {
            "IsNull":  false
        },
        5,
        {
            "_ObjectType_":  "SP.Web",
            "_ObjectIdentity_":  "c419a69f-1c06-b07f-b69b-4d7720fd7756|740c6a0b-85e2-48a0-a494-e0f1759d4aa7:site:8992a373-cdf0-4262-b240-9527c7174682:web:2080d74c-e181-43df-829f-ad5bee97b6f8",
            "Webs":  {
                         "_ObjectType_":  "SP.WebCollection",
                         "_Child_Items_":  [
                                               {
                                                   "_ObjectType_":  "SP.Web",
           ... truncated for brevity ...
    
            "LastItemModifiedDate":  "\/Date(1603731388000)\/"
        }
    ]
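
To check the result in a script instead of reading the JSON by eye, a small test along the following lines could be applied to the $response value from the second request. This is a sketch under assumptions: Test-ProcessQueryResponse is a hypothetical name, and it treats a response as successful when the first element's ErrorInfo is null and an SP.Web object is present, as in the sample above. It relies on the default (non-strict) PowerShell mode.

```powershell
# Hypothetical helper: returns $true when a ProcessQuery response looks like
# the successful sample (no ErrorInfo, and an SP.Web object was returned).
function Test-ProcessQueryResponse {
    param([Parameter(Mandatory = $true)] [object[]] $Response)

    # The first element carries SchemaVersion, LibraryVersion, and ErrorInfo.
    $header = $Response[0]
    if ($null -ne $header.ErrorInfo) { return $false }

    # Numeric entries (2, 4, 5) have no _ObjectType_ property and are skipped.
    $web = $Response | Where-Object { $_.'_ObjectType_' -eq 'SP.Web' }
    return [bool] $web
}

# Example with a shortened, made-up response shaped like the sample above.
$sample = '[{"SchemaVersion":"14.0.0.0","ErrorInfo":null},5,{"_ObjectType_":"SP.Web","Title":"Example"}]' | ConvertFrom-Json
Test-ProcessQueryResponse -Response $sample   # True
```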

    Failed query response

    If the test script fails, either:

    • An error code is generated, for example 401 (Unauthorized).

    • An error message with explanatory information is returned. The following is a sample of a failed response:

    Credential
    Invoke-RestMethod : The remote server returned an error: (401) Unauthorized.
    At C:\Users\nicho\Documents\test-sharepoint-permissions.ps1:47 char:13
    + $response = Invoke-RestMethod "${site_col_url}/_vti_bin/sites.asmx" - ...
    +             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        + CategoryInfo          : InvalidOperation: (System.Net.HttpWebRequest:HttpWebRequest) [Invoke-RestMethod], WebExc
       eption
        + FullyQualifiedErrorId : WebCmdletWebResponseException,Microsoft.PowerShell.Commands.InvokeRestMethodCommand
    
    Invoke-RestMethod : The remote server returned an error: (401) Unauthorized.
    At C:\Users\nicho\Documents\test-sharepoint-permissions.ps1:100 char:13
    + $response = Invoke-RestMethod "${site_col_url}/_vti_bin/client.svc/Pr ...
    +             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        + CategoryInfo          : InvalidOperation: (System.Net.HttpWebRequest:HttpWebRequest) [Invoke-RestMethod], WebExc
       eption
        + FullyQualifiedErrorId : WebCmdletWebResponseException,Microsoft.PowerShell.Commands.InvokeRestMethodCommand
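
If the raw error output is hard to read, each request can be wrapped so that the HTTP status code is reported on its own. The following is a minimal sketch, assuming the failure surfaces as a terminating error whose exception exposes a Response with a StatusCode, as the Invoke-RestMethod errors above do. Invoke-CheckedRequest is a hypothetical name, not part of the verification script.

```powershell
# Hypothetical wrapper: run a request and report success or the HTTP status code.
function Invoke-CheckedRequest {
    param([Parameter(Mandatory = $true)] [scriptblock] $Request)
    try {
        $result = & $Request
        return [pscustomobject]@{ Ok = $true; StatusCode = $null; Result = $result; Message = $null }
    }
    catch {
        # Web errors carry a Response object; other errors leave StatusCode empty.
        $status   = $null
        $response = $_.Exception.Response
        if ($null -ne $response) { $status = [int] $response.StatusCode }
        return [pscustomobject]@{ Ok = $false; StatusCode = $status; Result = $null; Message = $_.Exception.Message }
    }
}

# Example use with the first request from the script ($headers, $body, and $cred
# are the variables defined there):
# $check = Invoke-CheckedRequest { Invoke-RestMethod "${site_col_url}/_vti_bin/sites.asmx" -Method 'POST' -Headers $headers -Body $body -Credential $cred }
# if (-not $check.Ok) { Write-Host "Request failed (HTTP $($check.StatusCode)): $($check.Message)" }
```

A StatusCode of 401 here corresponds to the Unauthorized error shown in the sample output.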