Box.com Connector and Datasource Configuration

The Box connector retrieves data from a Box.com cloud-based data repository. To fetch content from Box.com, you must register an application in order to obtain the proper authorization tokens.

The startLinks defined for the datasource must include the Box numeric file and directory IDs. The root directory of any Box account has ID 0 (zero). To crawl your entire Box repository, enter 0.
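As a minimal sketch of the rule above, the following Python helper checks that every start link is a numeric Box ID before it is placed in a datasource definition. The helper name `build_start_links` and the sample IDs are hypothetical and used only for illustration; they are not part of Fusion.

```python
ROOT_FOLDER_ID = "0"  # the root directory of any Box account


def build_start_links(ids):
    """Return Box IDs as strings, rejecting anything that is not numeric."""
    links = []
    for raw in ids:
        text = str(raw).strip()
        # Box file and directory IDs are numeric, so paths or names are rejected.
        if not (text.isascii() and text.isdigit()):
            raise ValueError(f"Box IDs must be numeric, got {raw!r}")
        links.append(text)
    return links


# Crawl the entire repository.
whole_repo = build_start_links([ROOT_FOLDER_ID])

# Crawl two specific items (made-up IDs).
selected = build_start_links([1234567890, "9876543210"])
```

Passing a human-readable path such as /FolderName raises a ValueError, which catches the common mistake of using a path where an ID is required.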

During the crawl, the human-readable path (e.g., /FolderName/FileName.txt) will be added to each document as the title field.


Box Authorization, Access, and Refresh Tokens

Fusion supports two methods of authentication with the Box API:

  • OAuth2

  • JSON Web Token (JWT)

Auth Option 1: Box App Auth Using OAuth2

In this option, Fusion uses OAuth 2.0 to authenticate as a regular Box.com user when accessing the Box API.

Fusion needs the inputs below to crawl your Box data.

Required options are highlighted.

UI Label,
API Name
Description

OAuth Refresh Token
f.fs.refreshToken

To obtain the refresh token, you must authenticate as the Box.com account that will perform the crawl. To speed up the process, we created a downloadable utility. Use it like this:

  1. Launch the executable JAR:

    java -jar /path/to/box-oauth-generator-1.0.jar
  2. Enter the requested data.

  3. Click Get Redirect Token.

    The utility obtains the refresh_token from Box.com using the specified credentials; it’s valid for about 60 days.

Box OAuth Refresh Token File
f.fs.refreshTokenFile

The filename in which to save the refresh token.

Tip
If you crawl your Box files at intervals of less than 60 days, Fusion will automatically update the refresh_token and store it in $FUSION/data/connectors/container/lucid.anda/<datasourceID>, where <datasourceID> is the ID of the datasource.
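The 60-day window can be checked with a few lines of Python. This is a rough sketch, assuming you record when the refresh token was last issued; the function name and the exact 60-day constant are illustrative, since the lifetime above is only approximate.

```python
from datetime import datetime, timedelta, timezone

# Approximate lifetime of a Box refresh token, per the description above.
REFRESH_TOKEN_LIFETIME = timedelta(days=60)


def refresh_token_still_valid(issued_at, now=None, lifetime=REFRESH_TOKEN_LIFETIME):
    """Return True if a token issued at `issued_at` is younger than `lifetime`."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - issued_at < lifetime


issued = datetime(2024, 1, 1, tzinfo=timezone.utc)

# A crawl 30 days later falls inside the window, so the token can be renewed.
recent_crawl_ok = refresh_token_still_valid(issued, now=issued + timedelta(days=30))

# A crawl 61 days later falls outside the window; a new token must be generated.
late_crawl_ok = refresh_token_still_valid(issued, now=issued + timedelta(days=61))
```

Scheduling crawls comfortably inside the window leaves a margin for the approximate lifetime.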

Auth Option 2: Box App Users Using JWT

Box.com offers a Box Developer Edition, in which app users no longer have to create their own Box accounts to use an application.

App Auth uses the JSON Web Token (JWT) authentication architecture to establish a trusted connection with Box. This allows an application to provision and manage a new type of Box account, removing the friction of multiple logins for users or the difficulty of managing services.

For this option, Fusion needs the inputs below to crawl your Box data.

Required options are highlighted.

UI Label,
API Name
Description

JWT App User ID
f.fs.appUserId

The Developer Edition API App User ID that you want to crawl as.

JWT Public Key ID
f.fs.publicKeyId

The public key prefix registered in Box Auth that you want to use to authenticate with.

JWT Private Key File Path
f.fs.privateKeyFile

The path to the private key file that Fusion uses to authenticate with Box.

JWT Private Key File Password
f.fs.privateKeyPassword

The password that secures the private key file.
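To show how these inputs relate to a JWT, the sketch below assembles the unsigned header and claims for a Box App User assertion. It is not Fusion's implementation. The claim names follow Box's published JWT authentication scheme as best understood here, so verify them against Box's documentation. The `client_id` argument is an assumption (it is not one of the options listed above), and signing with the private key file and its password is deliberately left out.

```python
import time
import uuid

# Token endpoint used as the JWT audience (assumed from Box's JWT documentation).
BOX_TOKEN_URL = "https://api.box.com/oauth2/token"


def build_box_jwt_parts(client_id, app_user_id, public_key_id, now=None):
    """Return the (header, claims) dictionaries for an unsigned App User JWT."""
    now = int(time.time()) if now is None else int(now)
    header = {
        "alg": "RS256",
        "typ": "JWT",
        "kid": public_key_id,  # corresponds to f.fs.publicKeyId
    }
    claims = {
        "iss": client_id,          # hypothetical: the Box application's client ID
        "sub": app_user_id,        # corresponds to f.fs.appUserId
        "box_sub_type": "user",    # authenticate as an app user
        "aud": BOX_TOKEN_URL,
        "jti": uuid.uuid4().hex,   # unique identifier for this assertion
        "exp": now + 45,           # short-lived assertion
    }
    return header, claims


header, claims = build_box_jwt_parts("my-client-id", "12345", "abcd1234", now=1000)
```

A real client would serialize both dictionaries, sign them with the key referenced by f.fs.privateKeyFile, and exchange the result for an access token.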

Tip
The biggest advantage of the JWT App Users approach is that you don’t have to generate new refresh tokens. The public/private key file combination remains valid indefinitely.

Configuration

Tip
When entering configuration values in the UI, use unescaped characters, such as \t for the tab character. When entering configuration values in the API, use escaped characters, such as \\t for the tab character.
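The difference between the two forms can be seen by encoding a value as JSON, assuming the API request body is JSON. In this sketch, `delimiter` is a hypothetical property name used only for illustration.

```python
import json

# What you would type into a UI field: a backslash followed by the letter t.
ui_value = "\\t"

# "delimiter" is a made-up property name, not a documented connector option.
api_body = json.dumps({"delimiter": ui_value})
print(api_body)  # prints {"delimiter": "\\t"}

# Decoding the request body yields exactly the value typed in the UI.
round_tripped = json.loads(api_body)["delimiter"]
```

The JSON encoder doubles the backslash, which is why the same setting appears as \t in the UI and as \\t in an API request.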