Blob Store API

The Blob Store REST API stores binary objects in Solr. Its primary use case is storing entity extraction models, lookup lists, or exclusion lists for use in index pipelines. These can be the entity extraction models and lookup lists included with Fusion in the fusion/data/nlp directory, or files that you have created yourself.

Blobs uploaded to Solr with this REST API are stored in the 'system_blobs' collection.

Get, Upload, Update or Delete a Blob

The path for this request is:

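/api/apollo/blobs

Requests that act on a single blob append its ID to this path, as the upload examples below do: /api/apollo/blobs/<id>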

  • GET - list all blobs.

  • PUT - upload a new blob with an ID that you define, or update an existing blob. Because a PUT request lets you name the blob while uploading it to Solr, it is the preferred method for adding blobs.

  • POST - upload a new blob; the ID is defined in the request body.

  • DELETE - remove the blob with the given ID.
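The methods above can be wrapped in small shell helpers. The following is a minimal sketch, not part of Fusion itself: it assumes the server address and credentials used in the examples in this document (localhost:8764, user:pass), and it assumes that PUT and DELETE address a single blob at /api/apollo/blobs/<id>, as the upload examples do. The function names and the FUSION_URL and FUSION_AUTH variables are hypothetical.

```shell
# Hypothetical helpers; the names and variables are illustrative only.
FUSION_URL="${FUSION_URL:-http://localhost:8764}"
FUSION_AUTH="${FUSION_AUTH:-user:pass}"

# Build the URL of the blob store (no argument) or of one blob (ID given).
blob_url() {
  if [ -z "${1:-}" ]; then
    printf '%s/api/apollo/blobs\n' "$FUSION_URL"
  else
    printf '%s/api/apollo/blobs/%s\n' "$FUSION_URL" "$1"
  fi
}

# GET: list all blobs.
list_blobs() {
  curl -u "$FUSION_AUTH" "$(blob_url)"
}

# PUT: upload file $1 under blob ID $2 with content type $3
# (defaults to application/octet-stream).
put_blob() {
  curl -u "$FUSION_AUTH" -X PUT --data-binary "@$1" \
    -H "Content-type: ${3:-application/octet-stream}" "$(blob_url "$2")"
}

# DELETE: remove the blob with ID $1 (the per-blob path is an assumption).
delete_blob() {
  curl -u "$FUSION_AUTH" -X DELETE "$(blob_url "$1")"
}
```

With these definitions, `put_blob airports.lst airports.lst text/plain` issues the same request as the first upload in the Examples section.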

List a Blob Manifest

The path for this request is:

/api/apollo/blobs/<id>/manifest

where <id> is the name of a specific blob.

  • GET - returns the manifest for this item in the blob store. The output lists the blob name (which is the blob ID), the content type, the size, the time it was last modified, the document version in Solr, and an MD5 hash of the document.
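Because the manifest includes an MD5 hash, it can be used to check that an upload arrived intact. The following is a minimal sketch under stated assumptions: the manifest JSON has the flat layout shown in the examples in this document, md5sum (GNU coreutils) is installed, and the server and credentials are the ones used elsewhere on this page. The function names are hypothetical.

```shell
# Extract the "md5" value from manifest JSON read on stdin.
# Assumes the flat layout shown in this document; a real JSON
# parser is more robust for anything more complex.
manifest_md5() {
  sed -n 's/.*"md5"[[:space:]]*:[[:space:]]*"\([0-9a-f]*\)".*/\1/p'
}

# Compute the MD5 hash of local file $1.
local_md5() {
  md5sum "$1" | cut -d' ' -f1
}

# Compare local file $1 with the manifest of blob ID $2.
# Returns 0 if the hashes match, non-zero otherwise.
verify_blob() {
  remote=$(curl -s -u user:pass \
    "http://localhost:8764/api/apollo/blobs/$2/manifest" | manifest_md5)
  [ -n "$remote" ] && [ "$remote" = "$(local_md5 "$1")" ]
}
```

For example, `verify_blob data/nlp/models/en-sent.bin sentenceModel.bin && echo match` would print "match" only if the stored blob has the same hash as the local file.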

Examples

Upload a file to the blob store:

REQUEST

curl -u user:pass -X PUT --data-binary @airports.lst -H 'Content-type: text/plain' http://localhost:8764/api/apollo/blobs/airports.lst

RESPONSE

{
  "name" : "airports.lst",
  "contentType" : "text/plain",
  "size" : 66,
  "modifiedTime" : "2014-12-03T22:26:16.436Z",
  "version" : 0,
  "md5" : "fbe581898cb426f6bdcabc3f52253594"
}

Upload an OpenNLP sentence model binary file to the blob store:

REQUEST

curl -u user:pass -X PUT --data-binary @data/nlp/models/en-sent.bin -H 'Content-type: application/octet-stream' http://localhost:8764/api/apollo/blobs/sentenceModel.bin

Note
In this example, we changed the name of the blob during upload by giving it a different ID. The file is named 'en-sent.bin', but we defined its ID as 'sentenceModel.bin'. To use this blob in an index pipeline, refer to it by that ID.

Get the manifest for an OpenNLP sentence model that was previously saved in the blob store:

REQUEST

curl -u user:pass http://localhost:8764/api/apollo/blobs/sentenceModel.bin/manifest

RESPONSE

{
  "name" : "sentenceModel.bin",
  "contentType" : "application/octet-stream",
  "size" : 98533,
  "modifiedTime" : "2014-09-08T18:50:07.559Z",
  "version" : 1478704189996531712,
  "md5" : "3822c5f82cb4ba139284631d2f6b7fde"
}

Upload a JDBC driver, using slashes in the blob name:

REQUEST

curl -u user:pass -X PUT --data-binary @mysql-connector-java-5.1.42-bin.jar -H 'Content-length: 996444' -H 'Content-Type: application/zip' 'http://localhost:8764/api/apollo/blobs/good/to/go/mysql-connector-java-5.1.42-bin.jar?resourceType=driver:jdbc'

RESPONSE

{
  "name" : "good/to/go/mysql-connector-java-5.1.42-bin.jar",
  "contentType" : "application/zip",
  "size" : 996444,
  "modifiedTime" : "2017-04-04T15:58:32.856Z",
  "version" : 0,
  "md5" : "b1946ac92492d2347c6235b4d2611184",
  "metadata" : {
    "subtype" : "driver:jdbc",
    "resourceType" : "driver:jdbc"
  }
}
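The upload above hard-codes the file size and passes a URL containing "?", which some shells treat as a wildcard unless it is quoted. The following is a minimal sketch of the same request with the size computed from the file and the URL built and quoted in one place. The function names are hypothetical; the host, credentials, and resourceType value are taken from the example above. curl normally sets Content-length itself for --data-binary, so the explicit header is kept only to mirror the example.

```shell
# Size of file $1 in bytes, for the Content-length header.
file_size() {
  wc -c < "$1" | tr -d ' '
}

# Build a blob URL with a resourceType query parameter.
# $1 is the blob ID (it may contain slashes), $2 is the resource type.
typed_blob_url() {
  printf 'http://localhost:8764/api/apollo/blobs/%s?resourceType=%s\n' "$1" "$2"
}

# Upload JAR file $1 as a JDBC driver under blob ID $2.
upload_driver() {
  curl -u user:pass -X PUT --data-binary "@$1" \
    -H "Content-length: $(file_size "$1")" \
    -H 'Content-Type: application/zip' \
    "$(typed_blob_url "$2" driver:jdbc)"
}
```

Calling `upload_driver mysql-connector-java-5.1.42-bin.jar good/to/go/mysql-connector-java-5.1.42-bin.jar` sends the same request as the example above.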

Get a JDBC driver from the blob store:

REQUEST

curl -u user:pass -H "Accept: application/zip" 'http://localhost:8764/api/apollo/blobs/jtds-1.3.1-src.zip?resourceType=driver:jdbc' -o jtds-1.3.1-src.zip

RESPONSE

[ {
  "name" : "good/to/go/sentenceModel.bin",
  "contentType" : "application/octet-stream",
  "size" : 6,
  "modifiedTime" : "2017-04-04T06:21:53.465Z",
  "version" : 1563727666574524416,
  "md5" : "b1946ac92492d2347c6235b4d2611184",
  "metadata" : {
    "subtype" : "driver:jdbc",
    "resourceType" : "driver:jdbc"
  }
} ]