Blob Store API

The Blob Store REST API allows storing binary objects in Solr. The primary use case for this is to store entity extraction models, lookup lists or exclusion lists for use in index pipelines. This may include the entity extraction models and lookup lists included with Fusion in the $FUSION/data/nlp directory, or files that you have created on your own.

Blobs uploaded to Solr with this REST API are stored in the 'system_blobs' collection.

Blob Types

A resourceType query parameter can be used to specify the blob type. For example, specify plugin:connector when uploading a connector, like this:

curl -H 'content-type:application/zip' -X PUT 'localhost:8765/api/v1/blobs/myplugin?resourceType=plugin:connector' --data-binary @myplugin.zip

The complete list of valid values for resourceType is below:

Type               Description

catalog            An analytics catalog
driver:jdbc        A JDBC driver
plugin:connector   A connector plugin
model:ml-model     A machine learning model
model:open-nlp     An OpenNLP model
file-upload        Any uploaded file, such as from the Quickstart or the Index Workbench
banana             A Banana dashboard
other              A blob of unknown type

If no resourceType is specified on upload, "other" is assigned by default.
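
The type list and the default can be captured in a small client-side helper. The following is a minimal Python sketch, not part of Fusion: the function name blob_upload_url and its base argument are hypothetical, and the host and path in the usage note are taken from the curl example above with an explicit http:// scheme added. Slashes in the blob ID are left unescaped.

```python
from urllib.parse import quote, urlencode

# Valid resourceType values, copied from the list of types above.
RESOURCE_TYPES = {
    "catalog", "driver:jdbc", "plugin:connector", "model:ml-model",
    "model:open-nlp", "file-upload", "banana", "other",
}

def blob_upload_url(base, blob_id, resource_type=None):
    """Build the PUT URL for a blob (hypothetical helper).

    When resource_type is None the query parameter is omitted, so the
    server assigns "other" as described above.
    """
    if resource_type is not None and resource_type not in RESOURCE_TYPES:
        raise ValueError(f"unknown resourceType: {resource_type!r}")
    # Keep "/" in the blob ID and ":" in the type readable in the URL.
    url = f"{base.rstrip('/')}/{quote(blob_id, safe='/')}"
    if resource_type is not None:
        url += "?" + urlencode({"resourceType": resource_type}, safe=":")
    return url
```

For example, blob_upload_url("http://localhost:8765/api/v1/blobs", "myplugin", "plugin:connector") produces the same URL as the curl command above, and an unrecognized type raises ValueError before any request is made.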

Examples

Upload a file to the blob store:

REQUEST

curl -u user:pass -X PUT --data-binary @airports.lst -H 'Content-type: text/plain' http://localhost:8764/api/apollo/blobs/airports.lst

RESPONSE

{
  "name" : "airports.lst",
  "contentType" : "text/plain",
  "size" : 66,
  "modifiedTime" : "2014-12-03T22:26:16.436Z",
  "version" : 0,
  "md5" : "fbe581898cb426f6bdcabc3f52253594"
}

Upload an OpenNLP sentence model binary file to the blob store:

REQUEST

curl -u user:pass -X PUT --data-binary @data/nlp/models/en-sent.bin -H 'Content-type: application/octet-stream' http://localhost:8764/api/apollo/blobs/sentenceModel.bin

Note

This example changes the name of the blob during upload by giving it a different ID. The file is named 'en-sent.bin', but the blob ID is set to 'sentenceModel.bin'. To use this blob in an index pipeline, refer to it by that ID.
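
The same upload can be prepared from Python. This is a rough sketch using only the standard library, assuming the endpoint and basic-auth credentials from the curl command above. The helper name build_upload_request is hypothetical, and the request is built but not sent so that the sketch runs without a server.

```python
import base64
import urllib.request

def build_upload_request(url, data, content_type, user, password):
    """Prepare (but do not send) a PUT request carrying the blob bytes."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        data=data,
        method="PUT",
        headers={
            "Content-Type": content_type,
            "Authorization": f"Basic {token}",
        },
    )

# The local file is en-sent.bin, but the blob ID in the URL is sentenceModel.bin.
req = build_upload_request(
    "http://localhost:8764/api/apollo/blobs/sentenceModel.bin",
    b"placeholder bytes",  # in practice, the bytes read from en-sent.bin
    "application/octet-stream",
    "user",
    "pass",
)
# urllib.request.urlopen(req) would send it; omitted so the sketch runs offline.
```

The blob ID comes only from the last part of the URL; the local file name never appears in the request.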

Get the manifest for the OpenNLP sentence model previously saved in the blob store:

REQUEST

curl -u user:pass http://localhost:8764/api/apollo/blobs/sentenceModel.bin/manifest

RESPONSE

{
  "name" : "sentenceModel.bin",
  "contentType" : "application/octet-stream",
  "size" : 98533,
  "modifiedTime" : "2014-09-08T18:50:07.559Z",
  "version" : 1478704189996531712,
  "md5" : "3822c5f82cb4ba139284631d2f6b7fde"
}

Upload a JDBC driver, using slashes in the blob name:

REQUEST

curl -u user:pass -X PUT --data-binary @en-sent.bin -H 'Content-length: 6' -H 'Content-type: application/octet-stream' 'http://localhost:3000/api/apollo/blobs/good/to/go/sentenceModel.bin?resourceType=driver:jdbc'

RESPONSE

{
  "name" : "good/to/go/sentenceModel.bin",
  "contentType" : "application/octet-stream",
  "size" : 6,
  "modifiedTime" : "2017-04-04T15:58:32.856Z",
  "version" : 0,
  "md5" : "b1946ac92492d2347c6235b4d2611184",
  "metadata" : {
    "subtype" : "driver:jdbc",
    "resourceType" : "driver:jdbc"
  }
}
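
The size and md5 fields in an upload response can be used to check that the stored blob matches the local file. The sketch below assumes that md5 is the lowercase hexadecimal MD5 digest of the blob's bytes. The document does not state this, but the values in the response above are consistent with it: the six bytes hello\n have exactly this size and digest. The function name matches_manifest is hypothetical.

```python
import hashlib
import json

def matches_manifest(data, manifest):
    """Return True if the local bytes agree with the manifest's size and md5.

    Assumes "md5" is the hex MD5 digest of the blob content.
    """
    return (
        len(data) == manifest["size"]
        and hashlib.md5(data).hexdigest() == manifest["md5"]
    )

# Response body from the example above, trimmed to the two fields used here.
manifest = json.loads(
    '{"size": 6, "md5": "b1946ac92492d2347c6235b4d2611184"}'
)
```

In practice, data would be the bytes of the uploaded file and manifest the parsed response body; a False result suggests the upload was truncated or altered.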

List the blobs of type driver:jdbc, which includes the JDBC driver that was uploaded:

REQUEST

curl -u admin:password123 'http://localhost:3000/api/apollo/blobs?resourceType=driver:jdbc'

RESPONSE

[ {
  "name" : "good/to/go/sentenceModel.bin",
  "contentType" : "application/octet-stream",
  "size" : 6,
  "modifiedTime" : "2017-04-04T06:21:53.465Z",
  "version" : 1563727666574524416,
  "md5" : "b1946ac92492d2347c6235b4d2611184",
  "metadata" : {
    "subtype" : "driver:jdbc",
    "resourceType" : "driver:jdbc"
  }
} ]
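
The listing is a JSON array of manifests, so it can also be filtered on the client. Below is a minimal Python sketch that assumes the response shape shown above; names_of_type is a hypothetical helper, and blobs without a metadata object are simply skipped.

```python
import json

def names_of_type(listing_json, resource_type):
    """Return the names of blobs whose metadata.resourceType matches."""
    blobs = json.loads(listing_json)
    return [
        blob["name"]
        for blob in blobs
        if blob.get("metadata", {}).get("resourceType") == resource_type
    ]

# Sample listing: the first entry is trimmed from the response above; the
# second is an illustrative entry with no metadata.
listing = """[
  {"name": "good/to/go/sentenceModel.bin",
   "metadata": {"subtype": "driver:jdbc", "resourceType": "driver:jdbc"}},
  {"name": "airports.lst"}
]"""
```

Calling names_of_type(listing, "driver:jdbc") on this sample returns a list containing only good/to/go/sentenceModel.bin.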