Configure a Box.com Datasource
private_key.pem and public_key.pem, respectively.
public_key.pem file (generated from Step 4: Configure Your App to Use a Box Service Account) into the text box.
Setting | Notes |
Start Links | Each start link defined for the datasource must be a numeric Box file ID or directory (folder) ID. The root directory of any Box account has ID 0 (zero). To crawl your entire Box repository, enter 0. To find an ID, select a folder or file at Box.com. Example folder ID: 34192617287. Example file ID: 204871656422. |
API Key | In the Box Developers Console, select the app. On the Configuration tab under OAuth 2.0 Credentials, use the Client ID. |
API Secret | In the Box Developers Console, select the app. On the Configuration tab under OAuth 2.0 Credentials, use the Client Secret. |
JWT App User ID | Email address that you use to sign in to your Box co-admin account. Use the co-admin account you created in Step 4. |
JWT Public Key Id | In the Box Developers Console, select the app. On the Configuration tab, under Add and Manage Public Keys, use the ID for a public key. |
JWT Private Key | Base64-encoded contents of the private-key file that matches your JWT Public Key Id. Base64 encode the entire contents of the file, including the first and last lines. (Fusion 5.0+ only.) |
JWT Private Key File | Full path to the private-key file you created that matches your JWT Public Key Id. (Prior to Fusion 5.0 only.) |
JWT Private Key Password | Passphrase for the private key (from the private-key file you created in Step 4). |
Distributed crawl collection name | Collection that contains the pre-fetch index. |
Box.com children responses per page | Use the default value of 1000. |
Nested folder depth limit | Generally, you want a number that will crawl all documents, so keep the default value. For testing, you could reduce the number substantially to speed up the crawl. |
Number of partition buckets | Divide the number of files by 5000. Use that number or 10000, whichever is smaller. |
Number distributed crawl datasources | Use 1 to 27. |
Number of pre-fetch index creator threads | A number between 2 and 5. Use 2 for small datasources and 5 for huge datasources (over 10 million files). |
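
The rule for "Number of partition buckets" can be sketched as a small calculation. This is only an illustration: the function name `partition_buckets` is hypothetical, and rounding up with a minimum of 1 is an assumption, because the table does not say how to round.

```python
import math

FILES_PER_BUCKET = 5000  # divisor given in the Notes column
MAX_BUCKETS = 10000      # upper bound given in the Notes column


def partition_buckets(file_count: int) -> int:
    """Suggest a value for "Number of partition buckets" (hypothetical helper)."""
    if file_count < 0:
        raise ValueError("file_count must be non-negative")
    # Assumption: round up and never return less than 1.
    buckets = max(1, math.ceil(file_count / FILES_PER_BUCKET))
    # Use the computed number or 10000, whichever is smaller.
    return min(buckets, MAX_BUCKETS)


if __name__ == "__main__":
    for count in (1200, 2_000_000, 80_000_000):
        print(count, "files ->", partition_buckets(count), "buckets")
```

Under these assumptions, a repository of 2,000,000 files yields 400 buckets, and anything above 50,000,000 files is capped at 10000.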
Configure Box.com Tokens
UI Label, API Name | Description |
JWT App User ID, f.fs.appUserId | The Developer Edition API App User ID that you want to crawl as. |
JWT Public Key ID, f.fs.publicKeyId | The public key prefix registered in Box Auth that you want to authenticate with. |
JWT Private Key, f.fs.privateKeyBase64 | Base64-encoded JWT private key for the app user you want to authenticate as. (Fusion 5.0+ only.) |
JWT Private Key File Path, f.fs.privateKeyFile | Path to the JWT private key file for the app user you want to authenticate as. (Prior to Fusion 5.0 only.) |
JWT Private Key File Password, f.fs.privateKeyPassword | The password that secures the private key. |
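
One way to produce the value for `f.fs.privateKeyBase64` is to Base64-encode the whole private-key file, including its first and last lines. The sketch below uses only the Python standard library; the function name `encode_private_key` is hypothetical, and the path you pass in is whatever location you saved your own key to.

```python
import base64
import sys
from pathlib import Path


def encode_private_key(pem_path: str) -> str:
    """Base64-encode the entire key file, header and footer lines included."""
    raw = Path(pem_path).read_bytes()
    # b64encode returns bytes; decode to text so it can be pasted into a field.
    return base64.b64encode(raw).decode("ascii")


if __name__ == "__main__":
    if len(sys.argv) == 2:
        print(encode_private_key(sys.argv[1]))
    else:
        print("usage: python encode_key.py PATH_TO_PRIVATE_KEY")
```

The output is a single line with no wrapping, which is convenient for pasting into a single-line text field.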
http://localhost or http://0.0.0.0. This address is not used by Fusion, but cannot be left blank.
system_box_distributed_crawl, which is shared by all Box.com datasources.
The pre-fetch index lets the Box connector crawl files in random order, file by file, instead of user by user. This works around Box rate limits.
system_box_distributed_crawl collection manually:
system_box_distributed_crawl, and then click the Configure
f.fs.distributedCrawlCollectionName) in your datasource configuration.
startLinks defined for the datasource must include the numeric Box file and directory IDs. The root directory of any Box account has an ID of 0 (zero). To crawl your entire Box repository, enter 0.
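
Because each start link must be a numeric ID, it can help to check values before saving the datasource. The following is a minimal sketch, not part of Fusion: the helper names are hypothetical, and the check only confirms that a value is made of digits. It cannot tell whether the ID exists in Box or whether it refers to a file or a folder.

```python
ROOT_FOLDER_ID = "0"  # root directory of any Box account


def is_valid_start_link(value: str) -> bool:
    """Return True when the value looks like a numeric Box file or folder ID."""
    text = value.strip()
    return text.isascii() and text.isdigit()


def check_start_links(start_links):
    """Return the trimmed IDs, raising ValueError on the first non-numeric entry."""
    cleaned = []
    for link in start_links:
        if not is_valid_start_link(link):
            raise ValueError(f"not a numeric Box ID: {link!r}")
        cleaned.append(link.strip())
    return cleaned


if __name__ == "__main__":
    # Crawl the whole repository, or list specific folder and file IDs.
    print(check_start_links([ROOT_FOLDER_ID]))
    print(check_start_links(["34192617287", "204871656422"]))
```

A full URL copied from the browser would be rejected by this check; only the numeric ID belongs in the start link.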