Deploy a Pre-trained scispaCy Model

Table of Contents

Create the model (OPTIONAL)
Deploy model to Fusion
Import sample data
Create a Machine Learning stage in the Index Workbench
Verify results

This article uses a pre-packaged model, which you do not need to download to deploy. To use the pre-packaged model, skip to Deploy model to Fusion. The section Create the model describes how to complete this process on your own.

Create the model (OPTIONAL)

Skip this section to use the pre-packaged model.

Download the scispacy.ipynb file and open it in Jupyter Notebook (or a similar alternative).
Follow the steps in the notebook, substituting your custom values as needed.

Deploy model to Fusion

Navigate to Collections > Jobs.
Click the Add button.
Select the Create Seldon Core Model Deployment under Model Deployment Jobs.

Enter the values for your model deployment. If you are using the pre-packaged model, use the following values:

Parameter Value

Parameter	Value
Jobs ID	`scispacymodel-seldon-deployment`
Model Name	`scispacymodel`
Docker Repository	`shahanesanket`
Image Name	`scispacymodel-grpc:1.0`
Output Column Names for Model	`[entities]`

Jobs ID

scispacymodel-seldon-deployment

Model Name

scispacymodel

Docker Repository

shahanesanket

Image Name

scispacymodel-grpc:1.0

Output Column Names for Model

[entities]

Seldon Core job config

Run the job by clicking Run and selecting Start.

When the job completes successfully, the model is deployed. Check the list of microservices to verify:

Fusion microservices

Import sample data

Download and save the sample data file sampleJSON_body_content.csv.
Navigate to Indexing > Datasources.
Click the Add button and select File Upload V2.
Click Browse, select the sampleJSON_body_content.csv file, and click Open. Click the Upload File button to complete the upload process.
Assign a value to the Datasource ID parameter. This article uses the ID sample-data.
Click the Save button.

Create a Machine Learning stage in the Index Workbench

Navigate to Indexing > Index Workbench.
Click the Load button.
Choose the datasource you created.
Click the Add a Stage button.
Choose the Machine Learning stage.
In the Model ID field, enter scispacymodel.

In the Model input transformation script field, enter the following script:

var modelInput = new java.util.HashMap()
var list = new java.util.ArrayList()
list.add(doc.getFirstFieldValue("body_t"))
modelInput.put("text", list)
modelInput

In the Model output transformation script field, enter the following script:
```
doc.addField("entities_ss", modelOutput.get("entities"))
```
Click the Apply button.

Verify results

Click the Start Job button, and allow the job to finish.
Check the simulated results. If everything was successful, the results will resemble this: