Fusion 5.12

    Configure a Box.com Datasource

    The Box connector retrieves data from a Box.com cloud-based data repository.

    Configuration overview

    These steps are for a multi-user Box.com data repository. For limited testing using a single user account, you can create a Box app that uses Standard OAuth 2.0 authentication.

    The following is an overview of the steps required to set up Box and Managed Fusion and to crawl a Box data repository.

    Set Up Box:
    1. Sign up for a Box developer account.

    2. Enable 2-step verification.

    3. Create a Box app that Managed Fusion can use to crawl the Box files.

    4. Configure your app to use a Box service account.

    Set Up Managed Fusion:
    1. Install Managed Fusion’s Box Connector.

    2. Create datasources in Managed Fusion that use the Box connector.

    Crawl a Box Data Repository:
    1. Crawl the Managed Fusion datasources.

    Set Up Box

    Set up Box so that Managed Fusion can crawl Box data repositories.

    Step 1: Sign Up for a Box Developer Account

    If you already have an account, proceed to Step 2.

    1. Open the Box Developers Console.

    2. In the top right corner, click Sign Up.

    3. Select an appropriate Platform Developer plan.

    4. Enter the requested information and click Submit.

    5. Open the confirmation email and click Verify Email.

    6. Log in to your Box account.

    Step 2: Enable 2-Step Verification

    1. Log in to your Box developer account.

      1. Open the Box Developers Console.

      2. Log in as the admin.

    2. Create the Box account that you want to use for crawling.

      1. Open the Users page in the Box admin console.

      2. Click +Users to create a new user account.

      3. Enter the Name and Email for the user, and then click Add user.

      4. Click the user you just created to enter its user settings.

      5. Make this user a Co-Admin by selecting the Co-Admin checkbox. A pane titled "User is granted the following administrative privileges" appears. Select all of the following:

        • Manage users

        • Manage groups

        • View users' content

        • Log in to users' accounts

        • Run new reports and access existing reports

      6. Click Save.

      7. Close the Admin Console browser tab.

    3. Enable 2-step verification for unrecognized logins:

      1. Open the Account Settings page. (You can reach this page from the drop-down menu under your initials.)

      2. On the Account tab, under Authentication, select Require 2-step verification for unrecognized logins.

      3. Choose your Country and enter a Mobile Phone Number, and then click Continue.

      4. Enter the verification code you receive, and then click Continue.

      5. If you are using a new mobile device, Box will send you a second code. Enter it, and then click Submit.

      6. Click Save Changes.

    Step 3: Create a Box App that Managed Fusion Can Use to Crawl the Box Files

    Create a Box app that uses OAuth 2.0 with JWT server authentication.

    If you already have an app, skip to Step 4 to configure it.

    1. Open the Box Developers Console.

    2. Click Create New App.

    3. Select Custom App, and then click Next.

    4. Click OAuth 2.0 with JWT (Server Authentication), and then click Next.

    5. Name your app, and then click Create App. The name must be globally unique across all apps created by all Box users.

    6. Click View Your App.

    Step 4: Configure Your App to Use a Box Service Account

    1. Use OpenSSL to create a private/public key pair:

      1. Install OpenSSL if you need to. Windows instructions are here.

      2. Open a Command Prompt window and run these commands to generate a private/public key pair:

        openssl genrsa -aes128 -out private_key.pem 2048

        Enter a password for the private key when prompted.

        openssl rsa -pubout -in private_key.pem -out public_key.pem

        In the current directory of the Command Prompt, you now have private and public key files, private_key.pem and public_key.pem respectively.

    2. Open the Box Developers Console, log in as Admin if you are asked to log in, and click your app.

    3. In the left navigation menu, click Configuration.

    4. Configure scopes and advanced features:

      1. Under Application Access, select Enterprise.

      2. Under Application Scopes, deselect Manage groups.

      3. Under Advanced Features, enable Generate User Access Tokens and Perform Actions as Users.

      4. Click Save Changes.

    5. In the Add and Manage Public Keys area, click Add a Public Key and paste the contents of the public_key.pem file (generated in step 1) into the text box.

      1. Make a note of the new Public Key ID that you just created.

    6. Under OAuth 2.0 Credentials, click COPY for the Client ID.

    7. Authorize your app:

      1. Open the Box Admin Console.

      2. In the left navigation menu, click Settings > Enterprise Settings (or Business Settings) > Apps.

      3. Under Custom Applications, click Authorize New App.

      4. In the API Key box, paste the Client ID credential you copied in step 6, and then click Next.

      5. Read the App Authorization dialog and click Authorize.

      6. Close the Admin Console browser tab.

      If you change your app’s configuration later, you must repeat this step to re-authorize your app.

    8. Close the Box Developers Console browser tab.
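
    The datasource form in Step 5 asks for the Base64-encoded contents of the private key file, including its first and last lines. The following is a minimal Python sketch of that encoding, assuming private_key.pem is in the current directory. The function names pem_label and encode_key_file are hypothetical helpers, not part of Box or Managed Fusion, and the check only confirms the BEGIN/END framing lines, not that the key itself is valid.

```python
import base64
import re
from pathlib import Path

# PEM framing lines look like "-----BEGIN <LABEL>-----" and "-----END <LABEL>-----".
_BEGIN = re.compile(r"^-----BEGIN ([A-Z0-9 ]+)-----$")
_END = re.compile(r"^-----END ([A-Z0-9 ]+)-----$")


def pem_label(pem_text: str) -> str:
    """Return the PEM label if matching BEGIN/END lines frame the text, else raise."""
    lines = [line.strip() for line in pem_text.strip().splitlines()]
    if len(lines) < 3:
        raise ValueError("too short to be a PEM file")
    begin = _BEGIN.match(lines[0])
    end = _END.match(lines[-1])
    if not begin or not end or begin.group(1) != end.group(1):
        raise ValueError("missing or mismatched BEGIN/END lines")
    return begin.group(1)


def encode_key_file(path: str) -> str:
    """Base64-encode the entire file, BEGIN/END lines included."""
    raw = Path(path).read_bytes()
    pem_label(raw.decode("ascii"))  # fail early if the file is not PEM-framed
    return base64.b64encode(raw).decode("ascii")


if __name__ == "__main__":
    key_path = Path("private_key.pem")  # file name used in step 1
    if key_path.exists():
        print(encode_key_file(str(key_path)))
    else:
        print("private_key.pem not found in the current directory")
```

    The printed string is what would be pasted into the JWT Private Key field. The private key stays encrypted with its passphrase, which is entered separately in the form.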

    Set Up Managed Fusion

    Set up Managed Fusion to crawl Box data repositories.

    Step 5: Create Datasources

    Create datasources that use the Box connector to access the Box data repository.

    For each datasource:

    1. In the Managed Fusion UI, navigate to Indexing > Datasources.

    2. Click Add.

    3. Select Box (V2) under Not Installed.

    4. Select Box (V2) again under Installed.

    5. Fill in the form. Note the following regarding configuration settings to use:

      Settings and notes:

      Start Links

      Each start link defined for the datasource must consist of a numeric Box file ID or directory ID. The root directory of any Box account has ID 0 (zero). To crawl your entire Box repository, enter 0. To find an ID, select a folder or file at Box.com; the ID is typically the number at the end of the page URL.

      Folder ID:

      Enter the start link 34192617287.

      File ID:

      Enter the start link 204871656422.

      API Key

      In the Box Developers Console, select the app. On the Configuration tab under OAuth 2.0 Credentials, use the Client ID.

      API Secret

      In the Box Developers Console, select the app. On the Configuration tab under OAuth 2.0 Credentials, use the Client Secret.

      JWT App User ID

      Email address that you use to sign in to your Box co-admin account. Use the co-admin account you created in Step 2.

      JWT Public Key Id

      In the Box Developers Console, select the app. On the Configuration tab, under Add and Manage Public Keys, use the ID for a public key.

      JWT Private Key

      Base64-encoded contents of the private-key file that matches your JWT Public Key Id. Base64 encode the entire contents of the file, including the first and last lines.

      JWT Private Key Password

      Passphrase for the private key (the password you entered when generating private_key.pem in Step 4).

      Distributed crawl collection name

      Collection that contains the pre-fetch index.

      Box.com children responses per page

      Use the default value of 1000.

      Nested folder depth limit

      Generally, you want a depth large enough to crawl all documents, so keep the default value. For testing, you could reduce the number substantially to speed up the crawl.

      Number of partition buckets

      Divide the number of files by 5000. Use that number or 10000, whichever is smaller.

      Number distributed crawl datasources

      Use a value from 1 to 27.

      Number of pre-fetch index creator threads

      A number between 2 and 5. Use 2 for small datasources and 5 for huge datasources (over 10 million files).

    6. Click Save.
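
    Two of the settings above follow simple rules that can be checked before filling in the form. The following is a minimal Python sketch, not part of Managed Fusion; the function names are hypothetical. The notes do not say how to round the partition bucket count, so rounding up and using at least 1 bucket is an assumption.

```python
import math


def partition_buckets(file_count: int) -> int:
    """Number of files divided by 5000, or 10000, whichever is smaller.

    Assumption: round up and never return less than 1; the settings
    notes do not specify rounding.
    """
    if file_count < 0:
        raise ValueError("file_count must be non-negative")
    return max(1, min(math.ceil(file_count / 5000), 10000))


def is_valid_start_link(value: str) -> bool:
    """A start link must be a numeric Box file or directory ID; "0" is the root."""
    return value.isascii() and value.isdigit()


if __name__ == "__main__":
    print(partition_buckets(1_000_000))        # 200
    print(partition_buckets(100_000_000))      # 10000 (capped)
    print(is_valid_start_link("34192617287"))  # True
    print(is_valid_start_link("my-folder"))    # False
```

    For example, a repository of one million files gives 200 buckets, while anything above 50 million files is capped at 10000.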

    Crawl a Box Data Repository

    Crawl a Box data repository.

    Step 6: Crawl the Managed Fusion Datasources

    Crawl the datasources, which use Managed Fusion’s Box connector to access the Box data repository. Managed Fusion’s Box connector uses the pre-fetch index to fetch the contents of each file from Box.com, get metadata from both the distributed index and Box.com, and index the documents through Managed Fusion’s index pipeline.

    You can:

    • Run the crawl now.

      1. From the Managed Fusion launcher, click Search > Home > Datasources.

      2. Click the datasource.

      3. Click Start Crawl.