Fusion 5.12

    Sitecore V2 Connector Configuration Reference

    This connector provides full crawl and incremental crawl support for Sitecore versions 8.x and 9.x. It indexes document content and all metadata.

    To configure this connector, you need:

    • Your Sitecore URL

    • One or more content paths

    • A Sitecore administrator username and password

    Configuration

    When entering configuration values in the UI, use unescaped characters, such as \t for the tab character. When entering configuration values in the API, use escaped characters, such as \\t for the tab character.
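
    The difference between the two forms comes down to JSON string escaping. The sketch below assumes the API accepts a JSON body, so a backslash typed in the UI must itself be escaped; the field name "delimiter" is purely illustrative and is not a property of this connector.

```python
import json

def escape_for_api(ui_value: str) -> str:
    """Return ui_value as it would appear inside a JSON string literal."""
    # json.dumps adds surrounding quotes; strip them to show only the escaped body.
    return json.dumps(ui_value)[1:-1]

# The two characters backslash + t, exactly as typed into a UI field.
ui_value = r"\t"
api_value = escape_for_api(ui_value)
print(api_value)  # prints \\t

# Hypothetical payload fragment; "delimiter" is an illustrative field name only.
payload = json.dumps({"delimiter": ui_value})
```

    Decoding the payload again with json.loads returns the original two-character UI value.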

    Connector for Sitecore

    coreProperties - Core Properties

    Common behavior and performance settings.

    fetchSettings - Fetch Settings

    System level settings for controlling fetch behavior and performance.

    fetchItemQueueSize - number

    Size of the fetch item queue. Larger values increase memory usage but can improve performance. Default is 10,000.

    >= 1

    <= 500000

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 10000

    Multiple of: 1

    fetchRequestCheckInterval - number

    The amount of time to wait before checking whether a request is done.

    >= 1000

    <= 500000

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 15000

    Multiple of: 1

    fetchResponseCompletedTimeout - number

    The maximum amount of time for a response to be completed. If this time is exceeded, the task is retried, provided the job is still running.

    >= 1

    <= 600000

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 300000

    Multiple of: 1

    fetchResponseScheduledTimeout - number

    The maximum amount of time for a response to be scheduled. The task will be canceled if this setting is exceeded.

    >= 1000

    <= 500000

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 300000

    Multiple of: 1

    indexContentFields - boolean

    When enabled, content fields are indexed to the crawl-db collection.

    Default: false

    indexMetadata - boolean

    When enabled, the metadata of skipped items is indexed to the content collection.

    Default: false

    numFetchThreads - number

    Maximum number of fetch threads; defaults to 20. This setting controls the number of threads that call the connector's fetch method. Higher values can, but do not always, improve overall fetch performance.

    >= 1

    <= 500

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 20

    Multiple of: 1

    pluginParsing - boolean

    When enabled, raw content is parsed by the plugin to produce documents. When disabled, raw content is streamed from the plugin then parsed by the configured index-pipeline.

    Default: true

    pipelineSettings - Pipeline Settings

    System level settings for IndexPipeline API calls.

    retryOptions - Retry Options

    A set of options for configuring retry behavior.

    delayFactor - number

    The retryer retries failed operations that might succeed if attempted again. It sleeps after the first failed attempt and then waits exponentially longer after each subsequent failed attempt, up to the maximum time. nextWaitTime = exponentialIncrement * multiplier, where this setting is the multiplier.

    >= 1

    <= 100

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 2

    Multiple of: 1

    delayMs - number

    Sets the delay between retries. Successive delays are multiplied by the delayFactor, backing off exponentially up to maxDelayTimeMs.

    >= 1

    <= 9223372036854776000

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 1000

    Multiple of: 1

    errorExclusions - array[string]

    Optional list of regular expressions that are matched against a failed attempt's exception class and message. If any regex matches, the request is not retried. This prevents the retryer from retrying non-recoverable errors that the connector implementation has not already ignored.
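
    The exclusion check can be pictured as follows. This is a minimal sketch, not the connector's implementation: it assumes each pattern is searched separately in the exception class name and in the message, and it uses Python exceptions as stand-ins for the Java exceptions the connector would actually see. The function name and sample patterns are hypothetical.

```python
import re
from typing import Iterable

def should_retry(error: Exception, error_exclusions: Iterable[str]) -> bool:
    """Return False when any exclusion regex matches the error's class or message."""
    class_name = type(error).__name__
    message = str(error)
    for pattern in error_exclusions:
        # Assumption: a partial match (search) on either string is enough to exclude.
        if re.search(pattern, class_name) or re.search(pattern, message):
            return False
    return True

# Hypothetical exclusion list: one class-name pattern, one message pattern.
exclusions = [r"PermissionError", r"(?i)not found"]

print(should_retry(TimeoutError("read timed out"), exclusions))    # True
print(should_retry(PermissionError("access denied"), exclusions))  # False
print(should_retry(RuntimeError("Item Not Found"), exclusions))    # False
```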

    maxDelayTimeMs - number

    The maximum wait time between successive retries.

    >= 1

    <= 600000

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 300000

    Multiple of: 1
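
    To make the interaction of delayMs, delayFactor, and maxDelayTimeMs concrete, the sketch below computes a possible wait schedule. It assumes the first wait equals delayMs, each later wait is the previous one multiplied by delayFactor, and every wait is capped at maxDelayTimeMs; the actual retryer may differ in detail (for example, by adding jitter). The function name is hypothetical.

```python
from typing import List

def backoff_delays(attempts: int,
                   delay_ms: int = 1000,
                   delay_factor: int = 2,
                   max_delay_ms: int = 300000) -> List[int]:
    """Return the assumed wait (in ms) before each of `attempts` retries."""
    delays = []
    current = delay_ms
    for _ in range(attempts):
        # Cap each wait at the configured maximum delay.
        delays.append(min(current, max_delay_ms))
        current *= delay_factor
    return delays

print(backoff_delays(3))       # [1000, 2000, 4000]
print(backoff_delays(10)[-1])  # 300000 (512000 capped at the maximum)
```

    With the documented defaults, the cap only comes into play from the tenth retry onward under these assumptions.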

    maxInMemoryBuffer - number

    The maximum in-memory buffer size (bytes) before content is temporarily spilled over to disk. This allows the indexing process to retry content without forcing the connector to provide it again. Content smaller than this setting is buffered in memory.

    >= 1000

    <= 2147483647

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 4000000

    Multiple of: 1

    maxRetries - number

    The retryer retries failed operations that might succeed if attempted again. This parameter sets the number of retry attempts before giving up. If specified, it overrides the "Stop retrying after time (milliseconds)" parameter (maxTimeLimitMs).

    <= 100

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 3

    Multiple of: 1

    maxTimeLimitMs - number

    Limits the maximum amount of time spent on retries. Note: this setting is ignored if "Maximum Retries" (maxRetries) is specified.

    >= 1

    <= 28800000

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 600000

    Multiple of: 1
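
    The precedence between maxRetries and maxTimeLimitMs can be sketched as a single stop check. This is an illustration only: representing "not specified" as None is an assumption, and the function name is hypothetical.

```python
from typing import Optional

def should_stop(attempts_made: int,
                elapsed_ms: int,
                max_retries: Optional[int] = 3,
                max_time_limit_ms: int = 600000) -> bool:
    """Decide whether to give up retrying."""
    if max_retries is not None:
        # A specified retry count overrides the time limit entirely.
        return attempts_made >= max_retries
    # Only when no retry count is given does the time limit apply.
    return elapsed_ms >= max_time_limit_ms

print(should_stop(2, 10_000_000))                # False: time limit is ignored
print(should_stop(3, 10))                        # True: retry count reached
print(should_stop(50, 600000, max_retries=None)) # True: time limit reached
```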

    description - string

    Optional description

    <= 125 characters

    diagnosticLogging - boolean

    Enable diagnostic logging; disabled by default

    Default: false

    id - string, required

    A unique identifier for this Configuration.

    >= 1 characters

    Match pattern: ^[a-zA-Z0-9_-]+$

    parserId - string, required

    The Parser to use in the associated IndexPipeline.

    pipeline - string, required

    Name of the IndexPipeline used for processing output.

    >= 1 characters

    Match pattern: ^[a-zA-Z0-9_-]+$
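
    The match pattern shared by id and pipeline can be checked locally before a configuration is submitted. A minimal sketch follows; the helper name and sample values are hypothetical.

```python
import re

# Pattern copied from the id and pipeline entries above.
NAME_PATTERN = re.compile(r"^[a-zA-Z0-9_-]+$")

def is_valid_name(value: str) -> bool:
    """Return True if value contains only letters, digits, underscores, or hyphens."""
    # fullmatch also rejects a trailing newline, which a bare "$" would tolerate.
    return NAME_PATTERN.fullmatch(value) is not None

print(is_valid_name("sitecore-v2_prod"))  # True
print(is_valid_name("my datasource"))     # False: contains a space
print(is_valid_name(""))                  # False: at least 1 character is required
```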

    properties - Plugin Configuration

    Plugin specific properties.

    connections - HTTP client connection options

    A set of options for configuring the HTTP client behavior.

    ignoreSSLValidationExceptions - boolean

    Do not attempt an SSL handshake and do not verify the hostname of SSL certificates. Use this when accessing an HTTPS URL that uses a self-signed certificate or an enterprise certificate authority that you do not want to add to the Java keystore.

    Default: false

    maxConnections - number

    The maximum number of connections

    >= 1

    <= 2147483647

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 10

    Multiple of: 1

    databaseProperties - Database properties

    A set of options for configuring the connection to the database used for incremental crawls.

    sitecoreDatabaseHost - string

    The Sitecore SQL Server host. Example: 192.168.1.59. Note: This is only needed for faster incremental updates. If not specified, updates are done with a recrawl strategy.

    sitecoreDatabaseMaxQuerySize - number

    The maximum number of query results that can be returned from the Sitecore database at once.

    >= -2147483648

    <= 2147483647

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 10000

    Multiple of: 1

    sitecoreDatabaseName - string

    The Sitecore master database's name. Example: sc902_Master

    sitecoreDatabasePassword - string

    The Sitecore database user password.

    sitecoreDatabasePort - number

    The Sitecore SQL Server port. Typically this is the SQL Server default of 1433.

    >= -2147483648

    <= 2147483647

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 1433

    Multiple of: 1

    sitecoreDatabaseUsername - string

    The Sitecore database username. This user requires only SELECT access on the sc*_Master database for these tables: [Items, ArchivedItems, PublishQueue].

    depthLimitConfig - Item Depth Limits

    maxDepth - number

    Maximum depth level for fetch items. If an item has a depth greater than the configured value, it will not be fetched. The default is "no limit" (-1).

    >= -2147483648

    <= 2147483647

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: -1

    Multiple of: 1

    extensionConfig - File Extension rules

    excludedFileExtensions - array[string]

    The set of file extensions to skip during the fetch.

    Default:

    includedFileExtensions - array[string]

    Set of file extensions to be fetched. If specified, all non-matching files will be skipped.

    Default:

    regexCacheSize - number

    The number of regex matches to cache when evaluating regular expressions. For example, if you exclude files by filename, each filename's regex result is cached so that if the same filename comes up again, the cached result is reused.

    >= -2147483648

    <= 2147483647

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 10000

    Multiple of: 1

    fetchRetryProperties - Retry Options

    A set of options for configuring retry behavior.

    delayFactor - number

    The retryer retries failed operations that might succeed if attempted again. It sleeps after the first failed attempt and then waits exponentially longer after each subsequent failed attempt, up to the maximum time. nextWaitTime = exponentialIncrement * multiplier, where this setting is the multiplier.

    >= 1

    <= 100

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 2

    Multiple of: 1

    delayMs - number

    Sets the delay between retries. Successive delays are multiplied by the delayFactor, backing off exponentially up to maxDelayTimeMs.

    >= 1

    <= 9223372036854776000

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 1000

    Multiple of: 1

    errorExclusions - array[string]

    Optional list of regular expressions that are matched against a failed attempt's exception class and message. If any regex matches, the request is not retried. This prevents the retryer from retrying non-recoverable errors that the connector implementation has not already ignored.

    maxDelayTimeMs - number

    The maximum wait time between successive retries.

    >= 1

    <= 600000

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 300000

    Multiple of: 1

    maxRetries - number

    The retryer retries failed operations that might succeed if attempted again. This parameter sets the number of retry attempts before giving up. If specified, it overrides the "Stop retrying after time (milliseconds)" parameter (maxTimeLimitMs).

    <= 100

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 3

    Multiple of: 1

    maxTimeLimitMs - number

    Limits the maximum amount of time spent on retries. Note: this setting is ignored if "Maximum Retries" (maxRetries) is specified.

    >= 1

    <= 28800000

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 600000

    Multiple of: 1

    indexMetadata - boolean

    If enabled, the item metadata is added to the indexed document.

    Default: true

    itemPaths - array[string]

    A list of Sitecore item paths to be indexed, for example '/sitecore/content/Home'.

    Default: "/sitecore/content/Home"

    mimeTypeProperties - MIME Types

    Options for including or excluding items, based on MIME types.

    excludedMimeTypes - array[string]

    A list of the MIME types to exclude from this crawl. Note: This is only used if the "Mime Type Includes" field (includedMimeTypes) is empty.

    includedMimeTypes - array[string]

    A list of the MIME types to include in this crawl. Note: If you specify includes, the excluded MIME types property is ignored.
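
    The precedence between the two MIME type lists can be sketched as below. It assumes exact string comparison of MIME types, which the reference does not confirm; the function name and sample values are hypothetical.

```python
from typing import Sequence

def mime_type_allowed(mime_type: str,
                      included: Sequence[str],
                      excluded: Sequence[str]) -> bool:
    """Apply the include list if present; otherwise fall back to the exclude list."""
    if included:
        # When includes are specified, the exclude list is ignored.
        return mime_type in included
    return mime_type not in excluded

print(mime_type_allowed("text/html", ["text/html"], ["text/html"]))  # True
print(mime_type_allowed("image/png", ["text/html"], []))             # False
print(mime_type_allowed("image/png", [], ["image/png"]))             # False
print(mime_type_allowed("text/html", [], ["image/png"]))             # True
```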

    proxyProperties - Proxy options

    A set of options for configuring the proxy.

    password - string

    Proxy password

    >= 1 characters

    url - string

    The proxy URL

    >= 1 characters

    username - string

    Proxy username

    >= 1 characters

    regexConfig - Regular expression rules

    exclusiveRegexes - array[string]

    Regular expressions for URI patterns to exclude. This will limit this datasource to only URIs that do not match the regular expression.

    Default:

    inclusiveRegexes - array[string]

    Regular expressions for URI patterns to include. This will limit this datasource to only URIs that match the regular expression.

    Default:

    regexCacheSize - number

    The number of regex matches to cache when evaluating regular expressions. For example, if you exclude files by filename, each filename's regex result is cached so that if the same filename comes up again, the cached result is reused.

    >= -2147483648

    <= 2147483647

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 10000

    Multiple of: 1
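
    The combined effect of the include and exclude patterns and the regex cache can be illustrated with a small sketch. It assumes partial matching (search) and that a URI must pass both lists; the connector's actual matching semantics and URI format are not specified here, so the patterns and URIs below are hypothetical. functools.lru_cache stands in for the cache that regexCacheSize configures.

```python
import re
from functools import lru_cache

# Hypothetical rule sets for illustration.
INCLUSIVE = [r"^/sitecore/content/Home/"]
EXCLUSIVE = [r"/Drafts/"]

@lru_cache(maxsize=10000)  # mirrors the default regexCacheSize
def uri_allowed(uri: str) -> bool:
    """Return True if uri matches an include pattern and no exclude pattern."""
    if INCLUSIVE and not any(re.search(p, uri) for p in INCLUSIVE):
        return False
    return not any(re.search(p, uri) for p in EXCLUSIVE)

first = uri_allowed("/sitecore/content/Home/News/article-1")   # evaluated
second = uri_allowed("/sitecore/content/Home/News/article-1")  # served from cache
print(first, second)                  # True True
print(uri_allowed.cache_info().hits)  # 1
```

    A larger cache trades memory for fewer repeated regex evaluations when the same URIs or filenames recur.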

    sitecoreDatabase - string

    Sitecore context database to use when accessing Sitecore. For example: master, web, core. If you leave this blank, the connector will pick the default database associated with your current user.

    sitecorePassword - string

    The password of the administrator user

    sitecoreServerURL - string

    Sitecore server root URL (https://<server>)

    sitecoreUsername - string

    The username of a Sitecore administrator

    sizeLimitProperties - Item Size Limits

    Options for including or excluding items based on size, in bytes.

    maxSizeBytes - number

    Used for excluding items when the item size is larger than the configured value.

    >= -2147483648

    <= 2147483647

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: -1

    Multiple of: 1

    minSizeBytes - number

    Used for excluding items when the item size is smaller than the configured value.

    >= -2147483648

    <= 2147483647

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 1

    Multiple of: 1
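
    The two size limits can be read as a simple range check. The sketch below assumes that a negative maxSizeBytes (the default of -1) means "no upper limit", by analogy with maxDepth; the reference does not state this explicitly for size limits. The function name is hypothetical.

```python
def size_allowed(size_bytes: int,
                 min_size_bytes: int = 1,
                 max_size_bytes: int = -1) -> bool:
    """Return True if an item of size_bytes passes both size limits."""
    # Items smaller than the minimum are excluded.
    if size_bytes < min_size_bytes:
        return False
    # Assumption: a negative maximum disables the upper limit.
    if max_size_bytes >= 0 and size_bytes > max_size_bytes:
        return False
    return True

print(size_allowed(0))                          # False: below the default minimum of 1
print(size_allowed(10**9))                      # True: no upper limit by default (assumed)
print(size_allowed(2048, max_size_bytes=1024))  # False: larger than the maximum
```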

    timeouts - HTTP timeout options

    A set of options for configuring the HTTP client timeouts.

    connectTimeoutMs - number

    <= 300000

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 60000

    Multiple of: 1

    readTimeoutMs - number

    <= 600000

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 300000

    Multiple of: 1