f.addFileMetadata
Add file metadata
|
Set to true to add information about documents found in the filesystem to the document, such as document owner, group, or ACL permissions.
type: boolean
default value: 'true '
|
f.fs.apiKey
API Key
|
The Box API Key.
type: string
|
f.fs.apiSecret
API Secret
|
The Box API Secret.
type: string
|
f.fs.appUserId
JWT App User ID
|
(JWT only) The JWT App User ID with access to crawl.
type: string
|
f.fs.batchSize
Requests batch size
|
The number of requests to wrap into a single Box batch request.
type: integer
default value: '10 '
exclusiveMaximum: false
exclusiveMinimum: false
maximum: 10
minimum: 1
|
f.fs.childrenPageSize
Box.com children responses per page
|
The number of results to request per page from the Box.com API's children() methods. Defaults to the maximum of 1000; the valid range is 1-1000.
type: integer
default value: '1000 '
|
f.fs.connectTimeoutMs
API Connection Timeout (ms)
|
The Box API connection timeout, in milliseconds.
type: integer
default value: '240000 '
|
f.fs.distributedCrawlCollectionName
Distributed crawl collection name
|
The name of the Distributed Crawl Collection. If you do not specify one, 'system_box_distributed_crawl' is used.
type: string
|
f.fs.distributedCrawlDatasourceIndex
Distributed crawl datasource index
|
Distributed job index. Zero-based index of the distributed job that this data source represents. Must be in the range [0, numDatasources - 1]. For example, if a distributed crawl has 3 jobs, the index can be 0, 1, or 2. Each data source must have a unique distributedJobIndex. Once the pre-fetch index is created, this index identifies the chunk of file IDs from the Distributed Crawl Collection that this node is responsible for indexing.
type: integer
default value: '0 '
|
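The index rules for a distributed crawl (zero-based, unique, one per datasource) can be checked before the jobs are started. The following is a minimal sketch only; `validate_distributed_indexes` is a hypothetical helper and not part of the connector, and it assumes you have collected the f.fs.distributedCrawlDatasourceIndex value of every datasource along with the shared f.fs.numDistributedDatasources value.

```python
from typing import List


def validate_distributed_indexes(indexes: List[int], num_datasources: int) -> bool:
    """Hypothetical pre-flight check for a distributed crawl configuration.

    indexes: the datasource index configured on each datasource.
    num_datasources: the number of datasources taking part in the crawl.
    Raises ValueError on the first rule that is violated.
    """
    if num_datasources < 1:
        raise ValueError("the number of datasources must be at least 1")
    if len(indexes) != num_datasources:
        raise ValueError("expected exactly one index per datasource")
    for index in indexes:
        # Zero-based: valid values are 0 .. num_datasources - 1.
        if not 0 <= index < num_datasources:
            raise ValueError(f"index {index} is out of range")
    if len(set(indexes)) != len(indexes):
        raise ValueError("each datasource must have a unique index")
    return True
```

With 3 datasources, the indexes 0, 1, and 2 pass the check, while a repeated or out-of-range index raises an error.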
f.fs.excludedExtensions
Exclude extensions
|
Comma-separated list of extensions. Box files or folders whose names end with any of these extensions are not crawled. Matching is case-insensitive. Example: .txt,.xls,.DS_Store
type: string
|
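To preview which names an exclusion list would skip, the matching rule described for f.fs.excludedExtensions (name ends with an extension, case ignored) can be approximated as below. This is an illustrative sketch, not the connector's code; `is_excluded` is a hypothetical name, and trimming whitespace around each list entry is an assumption.

```python
def is_excluded(filename: str, excluded_extensions: str) -> bool:
    """Return True if filename ends with any extension in the comma-separated list.

    Comparison is case-insensitive. Whitespace around entries is trimmed,
    which is an assumption made for this sketch.
    """
    suffixes = tuple(
        entry.strip().lower()
        for entry in excluded_extensions.split(",")
        if entry.strip()
    )
    # An empty list excludes nothing.
    return bool(suffixes) and filename.lower().endswith(suffixes)
```

For the example value .txt,.xls,.DS_Store, a file named Report.TXT would be skipped and notes.md would not.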
f.fs.generatedSharedLinksAccess
Generated Shared Link Access
|
Only applicable when Generate Shared Link When Absent is selected. Sets the access level of the generated shared link. Can be left blank (the default) or set to open, company, or collaborators.
type: string
|
f.fs.generatedSharedLinksExpireDays
Generated Shared Link Expires After Days
|
Only applicable when Generate Shared Link When Absent is selected. Controls how many days the generated shared links stay valid. Use 0 for unlimited.
type: integer
default value: '0 '
|
f.fs.isGenerateSharedLinkPermissionCanDownload
Generated Shared Link Has Download Permission
|
Only applicable when Generate Shared Link When Absent is selected. Whether the "can download" permission is granted on the generated Box shared link.
type: boolean
|
f.fs.isGenerateSharedLinkPermissionCanPreview
Generated Shared Link Has Preview Permission
|
Only applicable when Generate Shared Link When Absent is selected. Whether the "can preview" permission is granted on the generated Box shared link.
type: boolean
|
f.fs.isGenerateSharedLinkWhenAbsent
Generate Shared Link When Absent
|
If this is selected, the crawler will automatically create a shared link for any non-shared documents it finds while crawling. Note: This will change all documents to 'Shared' in your Box view. Use with caution.
type: boolean
|
f.fs.max_request_attempts
Box Max Request Retries
|
The number of times to retry before giving up when the Box API returns an error while getting a file.
type: integer
default value: '10 '
|
f.fs.nestedFolderDepth
Nested folder depth limit
|
Maximum depth of nested folders that will be crawled. Range: [1, int-max]. Default is int-max.
type: integer
default value: '2147483647 '
|
f.fs.numDistributedDatasources
Number distributed crawl datasources
|
Number of separate datasource jobs that will be running in this distributed crawl. In other words, how many datasources are part of this crawl? This value is needed in order to distribute work evenly amongst multiple jobs.
type: integer
default value: '1 '
|
f.fs.numPreFetchIndexCreationThreads
Number of pre-fetch index creator threads
|
The number of concurrent threads that will create the Distributed Pre-fetch Index.
type: integer
default value: '5 '
|
f.fs.numSolrEmitterThreads
Number of Solr emitter threads
|
The number of Solr emitter threads. Default: 4
type: integer
default value: '4 '
|
f.fs.partitionBucketCount
Number of partition buckets
|
Number of partition buckets to be used during the full crawl. Default is 5000.
type: integer
default value: '5000 '
|
f.fs.privateKeyBase64
JWT Private Key (Base64)
|
(JWT only) The content of the private key. To get this value, open your key file and convert its entire content (including the first and last lines) to a Base64 string.
type: string
|
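One way to produce the value for f.fs.privateKeyBase64 is to Base64-encode the key file with a short script. The sketch below uses only the Python standard library; the function name `private_key_to_base64` and the file name in the usage comment are placeholders, not names defined by the connector.

```python
import base64
from pathlib import Path


def private_key_to_base64(key_path: str) -> str:
    """Base64-encode the entire key file, including its first and last lines."""
    raw = Path(key_path).read_bytes()
    return base64.b64encode(raw).decode("ascii")


# Usage (the path is a placeholder for your own key file):
#   print(private_key_to_base64("private_key.pem"))
```

Reading the file as bytes keeps the header line, footer line, and line breaks intact, so decoding the string yields the original file content.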
f.fs.privateKeyPassword
JWT Private Key Password
|
(JWT only) The password you entered for the private key file.
type: string
|
f.fs.proxyHost
Proxy host
|
The address to use when connecting through the proxy.
type: string
|
f.fs.proxyPort
Proxy port
|
The port to use when connecting through the proxy.
type: integer
|
f.fs.proxyType
Proxy type
|
Type of proxy to use, if any. Allowed values are 'HTTP' and 'SOCKS'. Leave empty for no proxy.
type: string
|
f.fs.publicKeyId
JWT Public Key Id
|
(JWT only) The public key prefix from the box.com public keys.
type: string
|
f.fs.readTimeoutMs
API Read Timeout (ms)
|
The Box API read timeout, in milliseconds.
type: integer
default value: '240000 '
|
f.fs.refreshToken
OAuth Refresh Token
|
The OAuth refresh token (not needed for JWT).
type: string
|
f.fs.refreshTokenFile
OAuth Refresh Token File
|
File that stores the refresh token for the next session.
type: string
default value: 'refresh_token.txt '
|
f.fs.retrievalTimeoutMs
Retrieval Timeout (ms)
|
Timeout before taking items on producer/consumer queues in milliseconds. Default is 1000
type: integer
default value: '1000 '
|
f.fs.user_excludes
User exclude regexes
|
In addition to the user filter, you can optionally specify regexes that match the names of users who should not be crawled.
type: array of string
|
f.fs.user_filter_term
User Filter Term
|
If you specify a user filter term, a user's files are crawled only if their login starts with that term. The value can be a comma-separated list of multiple filter terms. Example: a,b,c,v matches all Box users whose login starts with a, b, c, or v. Leave this value empty to return all results.
type: string
|
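The prefix rule described for f.fs.user_filter_term can be tried out against a list of logins with a few lines of code. This is a rough sketch under stated assumptions: `login_matches` is a hypothetical helper, matching is treated as case-sensitive, and whitespace around terms is trimmed; the connector's actual behavior on those two points is not documented here.

```python
def login_matches(login: str, user_filter_term: str) -> bool:
    """Return True if login starts with any term in the comma-separated filter.

    An empty filter matches every login. Case-sensitive matching and
    whitespace trimming are assumptions made for this sketch.
    """
    terms = tuple(
        term.strip() for term in user_filter_term.split(",") if term.strip()
    )
    if not terms:
        return True
    return login.startswith(terms)
```

With the filter a,b,c,v, the login alice@example.com matches and zed@example.com does not.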
f.index_items_discarded
Index discarded document metadata
|
Enable to index the metadata of discarded documents.
type: boolean
default value: 'false '
|
f.maxSizeBytes
Maximum file size (bytes)
|
Maximum size (in bytes) of documents to fetch or -1 for unlimited file size.
type: integer
default value: '4194304 '
|
f.minSizeBytes
Minimum file size (bytes)
|
Minimum size, in bytes, of documents to fetch.
type: integer
default value: '0 '
|
startLinks
Start Links
|
The IDs of the folders or files to crawl. For example, if the URL to your folder is https://app.box.com/folder/12345, then enter 12345. To crawl the entire Box account, enter 0.
type: array of string
|
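When start links are collected from browser URLs, the ID can be pulled out of the URL rather than copied by hand. The sketch below handles the folder URL form shown in the Start Links description; `start_link_from_url` is a hypothetical helper, and treating file URLs as having the same trailing-ID shape is an assumption.

```python
from urllib.parse import urlparse


def start_link_from_url(url: str) -> str:
    """Extract the trailing numeric ID from a Box web URL.

    Assumes the ID is the last path segment, as in
    https://app.box.com/folder/12345.
    """
    path = urlparse(url).path.rstrip("/")
    item_id = path.rsplit("/", 1)[-1]
    if not item_id.isdigit():
        raise ValueError(f"no numeric ID found at the end of {url!r}")
    return item_id
```

For the example URL https://app.box.com/folder/12345, the function returns the string 12345, which is the value to enter as a start link.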