Blob Store API
The Blob Store REST API stores binary objects in Solr. Its primary use case is storing entity extraction models, lookup lists, or exclusion lists for use in index pipelines. These can be the entity extraction models and lookup lists included with Fusion in the fusion/data/nlp directory, or files that you have created yourself.
Blobs uploaded to Solr with this REST API are stored in the 'system_blobs' collection.
Get, Upload, Update or Delete a Blob
The path for this request is:
/api/apollo/blobs
- GET - list all blobs.
- PUT - upload a new blob with an ID you define, or update an existing blob. Because a PUT request lets you name the blob while uploading it to Solr, it is the preferred method for adding blobs.
- POST - upload a new blob; the ID is included in the request body.
- DELETE - remove the blob.
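The methods above can be sketched in Python with only the standard library. This is a minimal illustration, not part of the Fusion documentation: `blob_request` is a hypothetical helper, the host, port, and `user:pass` credentials are copied from the curl examples on this page, and addressing a blob for DELETE at `/api/apollo/blobs/<id>` is an assumption based on how the PUT examples address blobs. The requests are only constructed here, not sent.

```python
import base64
import urllib.request

# Assumption: a local Fusion instance, as in the curl examples on this page.
BASE_URL = "http://localhost:8764/api/apollo/blobs"


def blob_request(method, blob_id=None, data=None, content_type=None,
                 user="user", password="pass"):
    """Hypothetical helper: build (but do not send) a blob store request."""
    url = BASE_URL if blob_id is None else f"{BASE_URL}/{blob_id}"
    # Equivalent of curl's -u user:pass (HTTP basic authentication).
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    headers = {"Authorization": f"Basic {token}"}
    if content_type:
        headers["Content-type"] = content_type
    return urllib.request.Request(url, data=data, headers=headers, method=method)


list_req = blob_request("GET")                                   # list all blobs
put_req = blob_request("PUT", "airports.lst", b"JFK\nLAX\n", "text/plain")
delete_req = blob_request("DELETE", "airports.lst")              # assumed path

for req in (list_req, put_req, delete_req):
    print(req.get_method(), req.full_url)
```

Passing one of these objects to `urllib.request.urlopen` would send it to a running server.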
List a Blob Manifest
The path for this request is:
/api/apollo/blobs/<id>/manifest
where <id> is the name of a specific blob.
- GET - return the manifest for this item in the blob store. The output lists the blob name (which is the blob ID), the content type, the size, when it was last modified, the document version in Solr, and an MD5 hash of the document.
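Because the manifest reports a size and an MD5 hash, a client could use it to check that a local file matches the stored blob. The sketch below assumes the `size` and `md5` fields describe the raw uploaded bytes; `matches_manifest` is a hypothetical helper, and the manifest is built locally for illustration rather than fetched from a server.

```python
import hashlib


def matches_manifest(content: bytes, manifest: dict) -> bool:
    """Hypothetical helper: True if the bytes agree with the manifest's size and md5."""
    return (len(content) == manifest["size"]
            and hashlib.md5(content).hexdigest() == manifest["md5"])


# Illustrative data only; a real manifest would come from
# GET /api/apollo/blobs/<id>/manifest.
content = b"JFK\nLAX\nSFO\n"
manifest = {
    "name": "airports.lst",
    "contentType": "text/plain",
    "size": len(content),
    "md5": hashlib.md5(content).hexdigest(),
}

print(matches_manifest(content, manifest))         # True
print(matches_manifest(content + b"x", manifest))  # False
```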
Examples
Upload a file to the blob store:
REQUEST
curl -u user:pass -X PUT --data-binary @airports.lst -H 'Content-type: text/plain' http://localhost:8764/api/apollo/blobs/airports.lst
RESPONSE
{ "name" : "airports.lst", "contentType" : "text/plain", "size" : 66, "modifiedTime" : "2014-12-03T22:26:16.436Z", "version" : 0, "md5" : "fbe581898cb426f6bdcabc3f52253594" }
Upload an OpenNLP sentence model binary file to the blob store:
REQUEST
curl -u user:pass -X PUT --data-binary @data/nlp/models/en-sent.bin -H 'Content-type: application/octet-stream' http://localhost:8764/api/apollo/blobs/sentenceModel.bin
Note: In this example, we changed the name of the blob during upload by giving it a different ID. The file is named 'en-sent.bin', but we set its ID to 'sentenceModel.bin'. When we use this blob in an index pipeline, we refer to it by the ID we gave it.
Get the manifest for an OpenNLP sentence model previously saved in the blob store:
REQUEST
curl -u user:pass http://localhost:8764/api/apollo/blobs/sentenceModel.bin/manifest
RESPONSE
{ "name" : "sentenceModel.bin", "contentType" : "application/octet-stream", "size" : 98533, "modifiedTime" : "2014-09-08T18:50:07.559Z", "version" : 1478704189996531712, "md5" : "3822c5f82cb4ba139284631d2f6b7fde" }
Upload a JDBC driver, using slashes in the blob name:
REQUEST
curl -u user:pass -X PUT --data-binary @mysql-connector-java-5.1.42-bin.jar -H 'Content-length: 996444' -H 'Content-Type: application/zip' 'http://localhost:8764/api/apollo/blobs/good/to/go/mysql-connector-java-5.1.42-bin.jar?resourceType=driver:jdbc'
RESPONSE
{
"name" : "good/to/go/mysql-connector-java-5.1.42-bin.jar",
"contentType" : "application/zip",
"size" : 996444,
"modifiedTime" : "2017-04-04T15:58:32.856Z",
"version" : 0,
"md5" : "b1946ac92492d2347c6235b4d2611184",
"metadata" : {
"subtype" : "driver:jdbc",
"resourceType" : "driver:jdbc"
}
}
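When building such URLs in a script, the slashes in the blob name need to stay intact while other special characters are escaped. The following sketch shows one way to do that with the standard library; `blob_url` is a hypothetical helper, and percent-encoding characters such as spaces in blob names is an assumption about what the server accepts, not documented behavior.

```python
from urllib.parse import quote, urlencode

# Assumption: a local Fusion instance, as in the curl examples on this page.
BASE_URL = "http://localhost:8764/api/apollo/blobs"


def blob_url(blob_id, **params):
    """Hypothetical helper: URL for a blob ID (slashes kept) plus query parameters."""
    path = quote(blob_id, safe="/")        # keep "/" so nested names survive
    query = urlencode(params, safe=":")    # keep ":" so driver:jdbc stays readable
    return f"{BASE_URL}/{path}" + (f"?{query}" if query else "")


url = blob_url("good/to/go/mysql-connector-java-5.1.42-bin.jar",
               resourceType="driver:jdbc")
print(url)
```

For the blob name and parameter used above, this produces the same URL as the curl request.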
Get the JDBC driver that was uploaded:
REQUEST
curl -u user:pass -H "Accept: application/zip" 'http://localhost:8764/api/apollo/blobs/jtds-1.3.1-src.zip?resourceType=driver:jdbc' -o jtds-1.3.1-src.zip
RESPONSE
[ {
"name" : "good/to/go/sentenceModel.bin",
"contentType" : "application/octet-stream",
"size" : 6,
"modifiedTime" : "2017-04-04T06:21:53.465Z",
"version" : 1563727666574524416,
"md5" : "b1946ac92492d2347c6235b4d2611184",
"metadata" : {
"subtype" : "driver:jdbc",
"resourceType" : "driver:jdbc"
}
} ]
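The response shown above is a JSON array of manifest objects, each with a "metadata" map. As a rough sketch, a client could pick out entries by resource type as follows; `names_with_resource_type` is a hypothetical helper, and the response text is copied from the example rather than fetched from a server.

```python
import json

# Response body copied from the example above.
response_text = """
[ {
  "name" : "good/to/go/sentenceModel.bin",
  "contentType" : "application/octet-stream",
  "size" : 6,
  "modifiedTime" : "2017-04-04T06:21:53.465Z",
  "version" : 1563727666574524416,
  "md5" : "b1946ac92492d2347c6235b4d2611184",
  "metadata" : {
    "subtype" : "driver:jdbc",
    "resourceType" : "driver:jdbc"
  }
} ]
"""


def names_with_resource_type(manifests, resource_type):
    """Hypothetical helper: names of manifests whose metadata has this resourceType."""
    return [m["name"] for m in manifests
            if m.get("metadata", {}).get("resourceType") == resource_type]


manifests = json.loads(response_text)
print(names_with_resource_type(manifests, "driver:jdbc"))
```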