Product Selector

Fusion 5.12
    Fusion 5.12

    JDBC V2Connector Configuration Reference

    Table of Contents

    Connect to any JDBC database.

    This connector requires a JDBC driver to be loaded before creating the datasource.

    Remote connectors

    V2 connectors support running remotely in Fusion versions 5.7.1 and later. Refer to Configure Remote V2 Connectors.

    Below is an example configuration showing how to specify the file system to index under the connector-plugins entry in your values.yaml file:

    additionalVolumes:
    - name: fusion-data1-pvc
        persistentVolumeClaim:
        claimName: fusion-data1-pvc
    - name: fusion-data2-pvc
        persistentVolumeClaim:
        claimName: fusion-data2-pvc
    additionalVolumeMounts:
    - name: fusion-data1-pvc
        mountPath: "/connector/data1"
    - name: fusion-data2-pvc
        mountPath: "/connector/data2"

    You may also need to specify the user that is authorized to access the file system, as in this example:

    securityContext:
        fsGroup: 1002100000
        runAsUser: 1002100000

    Configuration

    When entering configuration values in the UI, use unescaped characters, such as \t for the tab character. When entering configuration values in the API, use escaped characters, such as \\t for the tab character.

    Connector to index content in a database.

    description - string

    Optional description

    <= 125 characters

    pipeline - stringrequired

    Name of the IndexPipeline used for processing output.

    >= 1 characters

    Match pattern: ^[a-zA-Z0-9_-]+$

    diagnosticLogging - boolean

    Enable diagnostic logging; disabled by default

    Default: false

    parserId - stringrequired

    The Parser to use in the associated IndexPipeline.

    coreProperties - Core Properties

    Common behavior and performance settings.

    fetchSettings - Fetch Settings

    System level settings for controlling fetch behavior and performance.

    numFetchThreads - number

    Maximum number of fetch threads; defaults to 5.This setting controls the number of threads that call the Connectors fetch method.Higher values can, but not always, help with overall fetch performance.

    >= 1

    <= 500

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 5

    Multiple of: 1

    indexingThreads - number

    Maximum number of indexing threads; defaults to 4.This setting controls the number of threads in the indexing service used for processing content documents emitted by this datasource.Higher values can sometimes help with overall fetch performance.

    >= 1

    <= 10

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 4

    Multiple of: 1

    pluginInstances - number

    Maximum number of plugin instances for distributed fetching. Only specified number of plugin instanceswill do fetching. This is useful for distributing load between different instances.

    <= 500

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 0

    Multiple of: 1

    fetchResponseScheduledTimeout - number

    The maximum amount of time for a response to be scheduled. The task will be canceled if this setting is exceeded.

    >= 1000

    <= 500000

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 300000

    Multiple of: 1

    indexingInactivityTimeout - number

    The maximum amount of time to wait for indexing results (in seconds). If exceeded, the job will fail with an indexing inactivity timeout.

    >= 60

    <= 691200

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 86400

    Multiple of: 1

    pluginInactivityTimeout - number

    The maximum amount of time to wait for plugin activity (in seconds). If exceeded, the job will fail with a plugin inactivity timeout.

    >= 60

    <= 691200

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 600

    Multiple of: 1

    indexMetadata - boolean

    When enabled the metadata of skipped items will be indexed to the content collection.

    Default: false

    indexContentFields - boolean

    When enabled, content fields will be indexed to the crawl-db collection.

    Default: false

    asyncParsing - boolean

    When enabled, content will be indexed asynchronously.

    Default: false

    id - stringrequired

    A unique identifier for this Configuration.

    >= 1 characters

    Match pattern: ^[a-zA-Z0-9_-]+$

    properties - JDBC properties

    Plugin specific properties.

    url - string

    A URL to the database, e.g., 'jdbc:mysql://localhost/test'

    driver - string

    The class name of the JDBC driver to use to connect to the database.

    query - string

    A SQL SELECT statement to choose the records to be retrieved. For paginated queries, use the special variables ${limit} and ${offset}. The specific syntax will be driver dependent. Examples: Mysql - SELECT * FROM test_table LIMIT ${limit} OFFSET ${offset}, Microsoft SQLServer - SELECT * FROM test_table ORDER BY primary_key OFFSET ${offset} FETCH NEXT ${limit} ROWS ONLY

    limit - number

    Limiter to paginate the results returned. Max amount of records to return at a time

    >= -2147483648

    <= 2147483647

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 10000

    Multiple of: 1

    offset - number

    Starting number for records fetch. Such as starting from the 10th record

    >= -2147483648

    <= 2147483647

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 0

    Multiple of: 1

    disableAutomaticPagination - boolean

    Disable automatic pagination such that limit and offset fields will be ignored

    Default: false

    deltaQuery - string

    An optional second query that will be used on re-crawls. It is intended to be used to save performance by returning only new/modified items. When used with Stray Content Deletion, any item not returned by this query will be deleted from the content collection after each re-crawl.

    nestedQueries - array[string]

    A nested query to join data from multiple tables. The nested query will be used with the SQL Statement and must include the primary key with the ${id} template. For example, 'SELECT column FROM table_fk WHERE pk_id=${id}'.

    strayContentDeletionEnabled - boolean

    When this feature is enabled, items not returned by the SQL query used for the crawl will be deleted from the content collection after each crawl. If it is disabled, no checks are performed for data freshness.

    Default: true

    primaryKey - string

    The column name of the primary key for the table.

    Default: id

    username - string

    The username of a database account used for authentication and data access.

    password - string

    The password of the account used for authentication and data access.

    batchSize - number

    Size of the batch during the statement execution

    >= -2147483648

    <= 2147483647

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 1000

    Multiple of: 1

    maxConnections - number

    Maximum of connections in parallel to be used during statements execution

    >= -2147483648

    <= 2147483647

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 20

    Multiple of: 1

    nestedQueryTimeOut - number

    Timeout in minutes while executing a nested query statement

    >= -2147483648

    <= 2147483647

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: 1

    Multiple of: 1

    skipValidation - boolean

    Sometimes the validation can take long time, it is better to skip it.

    Default: false

    maximumItemLimitConfig - Item Count Limits

    maxItems - number

    Limits the number of items emitted to the configured IndexPipeline. The default is no limit (-1).

    >= -2147483648

    <= 2147483647

    exclusiveMinimum: false

    exclusiveMaximum: false

    Default: -1

    Multiple of: 1