Connector Configuration Reference
Specify buckets to crawl
Use the Bucket list setting to specify the buckets to crawl. First add the names of the buckets you would like to crawl, then download the bucket objects and metadata.
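The bucket selection described here, together with the Bucket list and Bucket prefix properties in the table at the end of this reference, can be sketched as follows. This is only an illustration under stated assumptions: `select_buckets` and its arguments are hypothetical names, not part of the connector, and the rule that the prefix is ignored whenever a bucket list is given is inferred from the property descriptions rather than confirmed behavior.

```python
def select_buckets(available, buckets=None, bucket_prefix=None):
    """Pick the bucket names to crawl.

    available     -- names the service account can list (hypothetical input)
    buckets       -- explicit names, mirroring the `buckets` property
    bucket_prefix -- mirrors the `bucketPrefix` property
    """
    if buckets:
        # Assumption: an explicit list wins and the prefix is ignored,
        # since the prefix is described as useful only when the list is empty.
        return list(buckets)
    if bucket_prefix:
        # No explicit list: keep only buckets whose names start with the prefix.
        return [name for name in available if name.startswith(bucket_prefix)]
    # Neither is set: crawl every available bucket.
    return list(available)
```

Calling `select_buckets(names)` with no other arguments corresponds to leaving the bucket list blank, which crawls all available buckets.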
Configure remote V2 connectors
Configuring remote V2 connectors requires a user with the remote-connectors or admin role.
If the remote-connectors role does not exist by default, you can create one. No API or UI permissions are required for the role.
In your values.yaml file, configure this section as needed:
Set enabled to true to enable the backend ingress.
Set pathtype to Prefix or Exact.
Set path to the path where the backend will be available.
Set host to the host where the backend will be available.
Set ingressClassName to one of the following:
nginx for the Nginx Ingress Controller
alb for the AWS Application Load Balancer (ALB)
The logging.config property is optional. If it is not set, logging messages are sent to the console.
Set plain-text to true.
The connector shuts down when the connectors-backend pod shuts down and is replaced by a new pod. Once the connector shuts down, connector configuration and job execution are disabled. To prevent that from happening, you should restart the connector as soon as possible. You can use Linux scripts and utilities, such as Monit, to restart the connector automatically.
max-grpc-retries bridge parameters.
job-expiration-duration-seconds parameter. The default value is 120 seconds.
connector-plugins entry in your values.yaml file:
Name | Title | Description |
---|---|---|
authenticationProperties | Authentication settings | Connect to the bucket store using a service account. The service account requires the following permissions: storage.buckets.list to crawl all the available buckets; storage.objects.list and storage.objects.get to access the objects in the buckets. |
applicationProperties | Limit documents | Bucket and Object filtering options. |
jsonKey | Service account Json key | JSON key contents from an authorized service account. |
buckets | Bucket list | Add the bucket names to crawl. Leave blank to crawl all the available buckets. |
includedFileExtensions | Included file extensions | Set of file extensions to be fetched. If specified, all non-matching files are skipped. |
excludedFileExtensions | Excluded file extensions | A set of all file extensions to be skipped from the fetch. |
inclusiveRegexes | Inclusive regexes | Regular expressions for bucket or object name patterns to include. This will limit this datasource to only items that match the regular expression. |
exclusiveRegexes | Exclusive regexes | Regular expressions for bucket or object name patterns to exclude. This will limit this datasource to only items that do not match the regular expression. |
maxSizeBytes | Maximum File Size | Excludes objects whose size (in bytes) is larger than the configured value. |
minSizeBytes | Minimum File Size | Excludes objects whose size (in bytes) is smaller than the configured value. |
bucketPrefix | Bucket prefix | Filter results to buckets whose names begin with this prefix. Useful only when the Bucket list property is empty. |
blobsPrefix | Object prefix | Filter results to objects whose names begin with this prefix. |
pageSize | Buckets and Objects page size | Maximum number of buckets or objects returned per page. |
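
To make the interaction of the filtering properties above more concrete, here is a minimal sketch of how they might combine into a single fetch decision for one object. It is not the connector's implementation: `should_fetch` is a hypothetical helper, the order of the checks is an assumption, regular expressions are assumed to be searched anywhere in the object name, and only object names (not bucket names) are tested.

```python
import posixpath
import re


def should_fetch(name, size, props):
    """Return True if an object passes every configured filter.

    name  -- object name, for example "docs/report.pdf"
    size  -- object size in bytes
    props -- dict keyed by the property names from the table above
    """
    # File extension without the leading dot, lower-cased ("" if none).
    ext = posixpath.splitext(name)[1].lstrip(".").lower()

    included = [e.lstrip(".").lower() for e in props.get("includedFileExtensions", [])]
    if included and ext not in included:
        return False  # an include list is set and this extension is not on it

    excluded = [e.lstrip(".").lower() for e in props.get("excludedFileExtensions", [])]
    if ext in excluded:
        return False

    inclusive = props.get("inclusiveRegexes", [])
    if inclusive and not any(re.search(p, name) for p in inclusive):
        return False  # must match at least one inclusive pattern

    if any(re.search(p, name) for p in props.get("exclusiveRegexes", [])):
        return False  # must not match any exclusive pattern

    max_size = props.get("maxSizeBytes")
    if max_size is not None and size > max_size:
        return False

    min_size = props.get("minSizeBytes")
    if min_size is not None and size < min_size:
        return False

    prefix = props.get("blobsPrefix")
    if prefix and not name.startswith(prefix):
        return False

    return True
```

With `{"includedFileExtensions": ["pdf"], "maxSizeBytes": 1000}`, for instance, a 500-byte `docs/report.pdf` passes, while a PNG file or a 5000-byte PDF is skipped. An empty dict applies no filters.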