Product Selector

Fusion 5.12
    Fusion 5.12

    Sitecore V2Connector Configuration Reference

    Table of Contents

    This connector provides full crawl and incremental crawl support for Sitecore versions 8.x and 9.x. It indexes document content and all metadata.

    To configure this connector, you need:

    • Your Sitecore URL

    • One or more content paths

    • A Sitecore administrator username and password

    Remote connectors

    V2 connectors support running remotely in Fusion versions 5.7.1 and later. Refer to Configure Remote V2 Connectors.

    Below is an example configuration showing how to specify the file system to index under the connector-plugins entry in your values.yaml file:

    additionalVolumes:
    - name: fusion-data1-pvc
        persistentVolumeClaim:
        claimName: fusion-data1-pvc
    - name: fusion-data2-pvc
        persistentVolumeClaim:
        claimName: fusion-data2-pvc
    additionalVolumeMounts:
    - name: fusion-data1-pvc
        mountPath: "/connector/data1"
    - name: fusion-data2-pvc
        mountPath: "/connector/data2"

    You may also need to specify the user that is authorized to access the file system, as in this example:

    securityContext:
        fsGroup: 1002100000
        runAsUser: 1002100000

    Configuration

    When entering configuration values in the UI, use unescaped characters, such as \t for the tab character. When entering configuration values in the API, use escaped characters, such as \\t for the tab character.

    Connector for SiteCore

    description - string

    Optional description

    <= 125 characters

    pipeline - stringrequired

    Name of the IndexPipeline used for processing output.

    >= 1 characters

    Match pattern: ^[a-zA-Z0-9_-]+$

    diagnosticLogging - boolean

    Enable diagnostic logging; disabled by default

    Default: false

    parserId - stringrequired

    The Parser to use in the associated IndexPipeline.

    coreProperties - Core Properties

    Common behavior and performance settings.

    fetchSettings - Fetch Settings

    System level settings for controlling fetch behavior and performance.

    indexMetadata - boolean

    When enabled the metadata of skipped items will be indexed to the content collection.

    Default: false

    numFetchThreads - number

    Maximum number of fetch threads; defaults to 5.This setting controls the number of threads that call the Connectors fetch method.Higher values can, but not always, help with overall fetch performance.

    >= 1

    <= 500

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 5

    Multiple of: 1

    indexingThreads - number

    Maximum number of indexing threads; defaults to 4.This setting controls the number of threads in the indexing service used for processing content documents emitted by this datasource.Higher values can sometimes help with overall fetch performance.

    >= 1

    <= 10

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 4

    Multiple of: 1

    pluginInstances - number

    Maximum number of plugin instances for distributed fetching. Only specified number of plugin instanceswill do fetching. This is useful for distributing load between different instances.

    <= 500

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 0

    Multiple of: 1

    fetchResponseScheduledTimeout - number

    The maximum amount of time for a response to be scheduled. The task will be canceled if this setting is exceeded.

    >= 1000

    <= 500000

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 300000

    Multiple of: 1

    indexingInactivityTimeout - number

    The maximum amount of time to wait for indexing results (in seconds). If exceeded, the job will fail with an indexing inactivity timeout.

    >= 60

    <= 691200

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 86400

    Multiple of: 1

    pluginInactivityTimeout - number

    The maximum amount of time to wait for plugin activity (in seconds). If exceeded, the job will fail with a plugin inactivity timeout.

    >= 60

    <= 691200

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 600

    Multiple of: 1

    indexContentFields - boolean

    When enabled, content fields will be indexed to the crawl-db collection.

    Default: false

    asyncParsing - boolean

    When enabled, content will be indexed asynchronously.

    Default: false

    id - stringrequired

    A unique identifier for this Configuration.

    >= 1 characters

    Match pattern: ^[a-zA-Z0-9_-]+$

    properties - Plugin Configuration

    Plugin specific properties.

    sitecoreServerURL - string

    SiteCore server root URL (https://<server>)

    sitecoreUsername - string

    An administrator SiteCore username

    sitecorePassword - string

    The password of the administrator user

    itemPaths - array[string]

    A list of SiteCore Item paths to be indexed ex. '/sitecore/content/Home'

    Default: "/sitecore/content/Home"

    sitecoreDatabase - string

    SiteCore context database to use when accessing SiteCore. For example: master, web, core. If you leave this blank, the connector will pick the default database associated with your current user.

    indexMetadata - boolean

    If enabled, the item metadata will be added to the indexed document

    Default: true

    fetchRetryProperties - Retry Options

    A set of options for configuring retry behavior.

    maxDelayTimeMs - number

    The maximum time wait time between successive retries.

    >= 1

    <= 600000

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 300000

    Multiple of: 1

    maxTimeLimitMs - number

    This setting is used to limit the maximum amount of time spent on retries. Note: this will be ignored if "Maximum Retries" is specified.

    >= 1

    <= 28800000

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 600000

    Multiple of: 1

    errorExclusions - array[string]

    Optional regex list that will be matched against failed attempts exception class and message. If any regex matches, do not retry this request. This is needed to prevent the retryer from retrying non-recoverable errors that were not already ignored by the connector implementation.

    maxRetries - number

    The retryer will retry failed operations in the case that they might succeed if attempted again. This parameter states the number of attempts to retry until giving up. This parameter, if specified, will override the "Stop retrying after time (milliseconds)" parameter.

    <= 100

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 3

    Multiple of: 1

    delayFactor - number

    The retryer will retry failed operations in the case that they might succeed if attempted again. The retryer will sleep an exponential amount of time after the first failed attempt and retry in exponentially incrementing amounts after each failed attempt up to the maximumTime. nextWaitTime = exponentialIncrement * multiplier.

    >= 1

    <= 9999

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 2

    Multiple of: 1

    delayMs - number

    Sets the delay between retries, exponentially backing off to the maxDelayTimeMs and multiplying successive delays by the delayFactor

    >= 1

    <= 9223372036854776000

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 1000

    Multiple of: 1

    connections - Http client connection options

    A set of options for configuring the http client behavior.

    maxConnections - number

    The maximum number of connections

    >= 1

    <= 2147483647

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 10

    Multiple of: 1

    ignoreSSLValidationExceptions - boolean

    Do not attempt to do an SSL Handshake and do not verify the hostname of SSL certificates. Use this when accessing an https url with a self-signed or enterprise certificate authority that you do not want to put in the Java keystore.

    Default: false

    timeouts - Http timeout options

    A set of options for configuring the http client timeouts.

    readTimeoutMs - number

    <= 600000

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 300000

    Multiple of: 1

    connectTimeoutMs - number

    <= 300000

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 60000

    Multiple of: 1

    proxyProperties - Proxy options

    A set of options for configuring the proxy.

    url - string

    The proxy URL

    >= 1 characters

    username - string

    Proxy username

    >= 1 characters

    password - string

    Proxy password

    >= 1 characters

    sizeLimitProperties - Item Size Limits

    Options for including or excluding items based on size, in bytes.

    maxSizeBytes - number

    Used for excluding items when the item size is larger than the configured value.

    >= -2147483648

    <= 2147483647

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: -1

    Multiple of: 1

    minSizeBytes - number

    Used for excluding items when the item size is smaller than the configured value.

    >= -2147483648

    <= 2147483647

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 1

    Multiple of: 1

    extensionConfig - File Extension rules

    includedFileExtensions - array[string]

    Set of file extensions to be fetched. If specified, all non-matching files will be skipped.

    Default:

    excludedFileExtensions - array[string]

    A set of all file extensions to be skipped from the fetch.

    Default:

    regexCacheSize - number

    The number of regex matches to cache when evaluating regular expressions. For example if you exclude files by filename, each filename's regex result will be cached so that if this same filename came up again, the regex matches would be remembered.

    >= -2147483648

    <= 2147483647

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 10000

    Multiple of: 1

    regexConfig - Regular expression rules

    inclusiveRegexes - array[string]

    Regular expressions for URI patterns to include. This will limit this datasource to only URIs that match the regular expression.

    Default:

    exclusiveRegexes - array[string]

    Regular expressions for URI patterns to exclude. This will limit this datasource to only URIs that do not match the regular expression.

    Default:

    regexCacheSize - number

    The number of regex matches to cache when evaluating regular expressions. For example if you exclude files by filename, each filename's regex result will be cached so that if this same filename came up again, the regex matches would be remembered.

    >= -2147483648

    <= 2147483647

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 10000

    Multiple of: 1

    depthLimitConfig - Item Depth Limits

    maxDepth - number

    Maximum depth level for fetch items. If an item has a depth greater than the configured value, it will not be fetched. The default is "no limit" (-1).

    >= -2147483648

    <= 2147483647

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: -1

    Multiple of: 1

    mimeTypeProperties - MIME Types

    Options for including or excluding items, based on MIME types.

    includedMimeTypes - array[string]

    A list of the Mime types to include in this crawl. Note: If you specify includes, the exclude mime types property will be ignored.

    excludedMimeTypes - array[string]

    A list of the Mime types to exclude from this crawl. NOTE: This is only used if the "Mime Type Includes" field is empty.

    databaseProperties - Database properties

    A set of options for configuring the connection to the database used for incremental crawls.

    sitecoreDatabaseHost - string

    The SiteCore SQL Server host. Example: 192.168.1.59 Note: This is only needed for faster incremental updates. If not specified, updates will be done with a recrawl strategy.

    sitecoreDatabasePort - number

    The SiteCore SQL Server port. Typically this is the sql server default of 1433.

    >= -2147483648

    <= 2147483647

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 1433

    Multiple of: 1

    sitecoreDatabaseName - string

    The SiteCore master database's name. Example: sc902_Master

    sitecoreDatabaseUsername - string

    The SiteCore database username. Only SELECT access is required for this user on the sc*_Master database for these tables: [Items, ArchivedItems, PublishQueue].

    sitecoreDatabasePassword - string

    The SiteCore database user password.

    sitecoreDatabaseMaxQuerySize - number

    The maximum number of query results that can be returned from SiteCore database at once.

    >= -2147483648

    <= 2147483647

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 10000

    Multiple of: 1