Develop and Deploy a Machine Learning Model
Install the Seldon Core Python package:

pip install seldon-core

You can test the container locally by running docker run with a specified port, like 9000, which you can then curl to confirm functionality in Fusion. See the testing example below.

A trained PyTorch model can be saved with the torch.save(model, PATH) function. See Saving and Loading Models in the PyTorch documentation.
init: The init function is where models, tokenizers, vectorizers, and the like should be set to self for invoking. It is recommended that you include your model’s trained parameters directly in the Docker container rather than reaching out to external storage inside init.
predict: The predict function processes the field or query that Fusion passes to the model. The predict function must be able to handle any text processing needed for the model to accept input, then invoke its model.evaluate(), model.predict(), or equivalent function to get the expected model result. If the output needs additional manipulation, that should be done before the result is returned. For embedding models, the return value must have the shape of (1, DIM), where DIM (dimension) is a consistent integer, to enable Fusion to handle the vector encoding into Milvus or Solr.
In the example below, the file name is mini.py and the class name is mini().
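For illustration only, here is a minimal sketch of what such a class might look like, assuming a hypothetical PyTorch embedding model saved as model.pt inside the image. The preprocessing and forward pass are placeholders, not the actual model code; the features_names argument follows the Seldon Core Python wrapper convention.

```python
# mini.py -- minimal sketch of a Seldon Core compatible model class.
# Assumes a hypothetical PyTorch embedding model saved as model.pt in the image.
import numpy as np
import torch


class mini:
    def __init__(self):
        # Load the model once at startup and attach it to self.
        # Shipping model.pt inside the Docker image avoids external storage calls here.
        self.model = torch.load("model.pt")
        self.model.eval()

    def predict(self, X, features_names=None):
        # X holds the field or query text that Fusion passes in.
        # Do whatever preprocessing (tokenization, cleanup) the model needs here.
        text = X[0] if isinstance(X, (list, np.ndarray)) else X
        with torch.no_grad():
            # Hypothetical forward pass that returns a 1-D embedding tensor.
            vector = self.model(text)
        # Embedding models must return shape (1, DIM) so Fusion can index the vector.
        return np.asarray(vector).reshape(1, -1)
```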
The requirements.txt file is a list of installs for the Dockerfile to run to ensure the Docker container has the right resources to run the model. Every package imported in your Python file should be included in the requirements file. An easy way to populate the requirements is by using the following command in the terminal, inside the directory that contains your code: pip freeze. In addition, you must manually add seldon-core to the requirements file because it is not invoked in the Python file but is required for containerization.
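For the hypothetical mini.py sketch above, the requirements file might look like the following. The exact packages and versions depend on your model; only seldon-core is always needed and must be added by hand.

```
# requirements.txt -- illustrative only; generate yours with pip freeze,
# then add seldon-core manually.
numpy
torch
seldon-core
```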
After you create the <your_model>.py, Dockerfile, and requirements.txt files, you need to run a few Docker commands.
Run the commands below in order:
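A typical sequence looks like the following sketch. The image name, tag, and registry user are placeholders, and the curl payload shown assumes the standard Seldon Core REST protocol; adjust both for your own model and Seldon Core version.

```bash
# Build the image from the Dockerfile in the current directory.
docker build . -t <dockerhub-username>/mini:0.1

# Run it locally on port 9000 and curl it to confirm the model responds.
docker run -p 9000:9000 <dockerhub-username>/mini:0.1
curl -s -X POST http://localhost:9000/api/v1.0/predictions \
     -H "Content-Type: application/json" \
     -d '{"data": {"ndarray": [["test query text"]]}}'

# Push the image to the repository that Fusion will pull from.
docker push <dockerhub-username>/mini:0.1
```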
Then configure the Seldon Core model deployment job in Fusion with the following parameters:

Parameter | Description |
---|---|
Job ID | A string used by the Fusion API to reference the job after its creation. |
Model name | A name for the deployed model. This is used to generate the deployment name in Seldon Core. It is also the name that you reference as a model-id when making predictions with the ML Service. |
Model replicas | The number of load-balanced replicas of the model to deploy; specify multiple replicas for a higher-volume intake. |
Docker Repository | The public or private repository where the Docker image is located. If you’re using Docker Hub, fill in the Docker Hub username here. |
Image name | The name of the image with an optional tag. If no tag is given, latest is used. |
Kubernetes secret | If you’re using a private repository, supply the name of the Kubernetes secret used for access. |
Output columns | A list of column names that the model’s predict method returns. |
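For example, for the hypothetical mini model used above, the values might look like this (all values are placeholders):

- Job ID: deploy-mini
- Model name: mini
- Model replicas: 1
- Docker Repository: <dockerhub-username>
- Image name: mini:0.1
- Kubernetes secret: only needed if the repository is private
- Output columns: [vector]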
The deployment details are saved to a <seldon_model_name>_sdep.yaml file. kubectl get sdep gets the details for the currently running Seldon Deployment job and saves those details to a YAML file, and kubectl apply -f open_sdep.yaml adds the key to the Seldon Deployment job the next time it launches. Delete the sdep before redeploying the model; the currently running Seldon Deployment job does not have the key applied to it. Delete it before redeploying, and the new job will have the key.
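One possible sequence, sketched with placeholder names (exactly where the secret is referenced in the YAML depends on your deployment spec):

```bash
# Save the spec of the currently running Seldon Deployment to a YAML file.
kubectl get sdep <seldon_model_name> -o yaml > <seldon_model_name>_sdep.yaml

# Edit the YAML to reference the Kubernetes secret, then delete the running
# deployment (it does not have the key) and re-apply the edited spec.
kubectl delete sdep <seldon_model_name>
kubectl apply -f <seldon_model_name>_sdep.yaml
```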
The recommended metric is Inner Product, but this also depends on use case and model type.

Train a Smart Answers Supervised Model
You can index your training data into a Fusion collection, for example model_training_input. Otherwise you can use it directly from the cloud storage.
To train custom embeddings, set the Model Base parameter to word_custom or bpe_custom. This trains Word2vec on the provided data and specified fields. It might be useful in cases where your content includes unusual or domain-specific vocabulary.
If you have content in addition to the query/response pairs that can be used to train the model, then specify it in the Texts Data Path.
When you use the pre-trained embeddings, the log shows the percentage of processed vocabulary words. If this value is high, then try using custom embeddings.
The job trains a few (configurable) RNN layers on top of word embeddings or fine-tunes a BERT model on the provided training data. The resulting model uses an attention mechanism to average word embeddings to obtain the final single dense vector for the content. The dimensionality of the trained encoder is printed in the training job logs on the Encoder output dim size: line. You might need this information when creating collections in Milvus.
The training collection must have a random_* dynamic field defined in its managed-schema.xml. This field is required for sampling the data. If it is not present, add <dynamicField name="random_*" type="random"/> to the managed-schema.xml alongside the other dynamic fields, and <fieldType class="solr.RandomSortField" indexed="true" name="random"/> alongside the other field types.
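For reference, these two additions would appear in managed-schema.xml as follows:

```xml
<!-- alongside the other dynamic fields -->
<dynamicField name="random_*" type="random"/>

<!-- alongside the other field types -->
<fieldType class="solr.RandomSortField" indexed="true" name="random"/>
```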
Configure the Smart Answers Pipelines (5.3 and later)

Configure the following:

- The Milvus collection.
- The smart-answers index pipeline.
- The smart-answers query pipeline.

Smart Answers Pre-trained Coldstart models output vectors of 512 dimension size. The dimensionality of encoders trained by the Smart Answers Supervised Training job depends on the provided parameters and is printed in the training job logs.

The Create Collections in Milvus job can be used to create multiple collections at once. In this image, the first collection is used in the indexing and query steps. The other two collections are used in the example.
In the smart-answers index pipeline:

- Set Field to Encode to the document field name to be processed and encoded into dense vectors.
- Make sure Encoder Output Vector matches the output vector from the chosen model.
- Make sure Milvus Collection Name matches the collection name created via the Create Milvus Collection job.
- Enable Fail on Error in the Encode into Milvus stage and Apply the changes. This will cause an error message to display if any settings need to be changed.

In the smart-answers query pipeline:

- Make sure Encoder Output Vector matches the output vector from the chosen model.
- Make sure Milvus Collection Name matches the collection name created via the Create Milvus Collection job.
- The Milvus Results Context Key can be changed as needed. It will be used in the Milvus Ensemble Query stage to calculate the query score.
- Update the Ensemble math expression as needed, based on your model and the name used in the prior stage for storing the Milvus results.
- In versions 5.4 and later, you can also set the Threshold so that the Milvus Ensemble Query stage will only return items with a score greater than or equal to the configured value.
To encode questions and answers into separate collections, base your pipelines on the default smart-answers index and query pipelines with a few additional changes.

Prior to configuring the Smart Answers pipelines, use the Create Milvus Collection job to create two collections, question_collection and answer_collection, to store the encoded “questions” and the encoded “answers”, respectively.
In the Encode Question stage, specify Field to Encode to be title_t and change the Milvus Collection Name to match the new Milvus collection, question_collection. In the Encode Answer stage, specify Field to Encode to be description_t and change the Milvus Collection Name to match the new Milvus collection, answer_collection.
The Milvus Results Context Key needs to be different in each of these two stages. In the Query Questions (Milvus Query) stage, we set the Milvus Results Context Key to milvus_questions and the Milvus collection name to question_collection. In the Query Answers (Milvus Query) stage, we set the Milvus Results Context Key to milvus_answers and the Milvus collection name to answer_collection.

In the Milvus Ensemble Query stage, update the Ensemble math expression to combine the results from the two query stages. If we want the question scores and answer scores weighted equally, we would use: 0.5 * milvus_questions + 0.5 * milvus_answers. This is recommended especially when you have a limited FAQ dataset and want to utilize both question and answer information.
The Create Milvus Collection job has the following parameters:

- Job ID. A unique identifier for the job.
- Collection Name. A name for the Milvus collection you are creating. This name is used in both the Smart Answer Index and the Smart Answer Query pipelines.
- Dimension. The dimension size of the vectors to store in this Milvus collection. The Dimension should match the size of the vectors returned by the encoding model. For example, if the model was created with either the Smart Answers Coldstart Training job or the Smart Answers Supervised Training job with the Model Base word_en_300d_2M, then the dimension would be 300.
- Index file size. Files with more documents than this will cause Milvus to build an index on this collection.
- Metric. The type of metric used to calculate vector similarity scores. Inner Product is recommended. It produces values between 0 and 1, where a higher value means higher similarity.
The Encode into Milvus index stage uses the model to encode the Field to Encode and store it in Milvus in the given Milvus collection. There are several required parameters:

- Model ID. The ID of the model.
- Encoder Output Vector. The name of the field that stores the compressed dense vectors output from the model. Default value: vector.
- Field to Encode. The text field to encode into a dense vector, such as answer_t or body_t.
- Milvus Collection Name. The name of the collection you created via the Create Milvus Collection job, which will store the dense vectors. When creating the collection you specify the type of Metric to use to calculate vector similarity.
This stage can be used multiple times to encode additional fields, each into a different Milvus collection.

The Milvus Query stage has the following parameters:

- Model ID. The ID of the model used when configuring the model training job.
- Encoder Output Vector. The name of the output vector from the specified model, which will contain the query encoded as a vector. Defaults to vector.
- Milvus Collection Name. The name of the collection that you used in the Encode into Milvus index stage to store the encoded vectors.
- Milvus Results Context Key. The name of the variable used to store the vector distances. It can be changed as needed. It will be used in the Milvus Ensemble Query stage to calculate the query score for the document.
- Number of Results. The number of highest scoring results returned from Milvus.

This stage would typically be used the same number of times that the Encode into Milvus index stage is used, each with a different Milvus collection and a different Milvus Results Context Key.
The Milvus Ensemble Query stage calculates an ensemble score, which is used to return the best matches. It has the following parameters:

- Ensemble math expression. The mathematical expression used to calculate the ensemble score. It should reference the variable name(s) specified in the Milvus Results Context Key parameter of the Milvus Query stage.
- Result field name. The name of the field used to store the ensemble score. It defaults to ensemble_score.
- Threshold. A parameter that filters the stage results to remove items that fall below the configured score. Items with a score at or above the threshold will be returned.
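As a purely hypothetical illustration of how these parameters interact: with the expression 0.7 * milvus_questions + 0.3 * milvus_answers and a Threshold of 0.6, a document with milvus_questions = 0.8 and milvus_answers = 0.4 gets an ensemble_score of 0.68 and is returned, while a document with scores 0.5 and 0.4 gets 0.47 and is filtered out.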
This stage inserts the result field, ensemble_score, into each of the returned documents, which is particularly useful when there is more than one Milvus Query stage. This stage needs to come after the Solr Query stage.

Train a Smart Answers cold start model
To train custom embeddings, set the Model Base parameter to word_custom or bpe_custom. This trains Word2vec on the data and fields specified in the Training collection and Field which contains the content documents parameters. It might be useful in cases where your content includes unusual or domain-specific vocabulary.

When you use the pre-trained embeddings, the log shows the percentage of processed vocabulary words. If this value is high, then try using custom embeddings.

During training, the job analyzes the content data to select weights for each of the words. The resulting model performs a weighted average of word embeddings to obtain the final single dense vector for the content.
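The weighted-average idea can be sketched as follows. This is a schematic illustration only, not the job's actual implementation; the embeddings and weights dictionaries are assumed inputs.

```python
# Schematic weighted average of word embeddings -> one dense vector per document.
import numpy as np

def encode(tokens, embeddings, weights):
    # embeddings: dict mapping token -> vector of shape (dim,)
    # weights:    dict mapping token -> float (e.g. larger for rarer words)
    dim = len(next(iter(embeddings.values())))
    known = [t for t in tokens if t in embeddings]
    if not known:
        return np.zeros((1, dim))
    total = sum(weights.get(t, 1.0) for t in known)
    avg = sum(weights.get(t, 1.0) * np.asarray(embeddings[t]) for t in known) / total
    return avg.reshape(1, -1)  # (1, DIM), the shape Fusion expects from encoders
```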
The training collection must have a random_* dynamic field defined in its managed-schema.xml. This field is required for sampling the data. If it is not present, add <dynamicField name="random_*" type="random"/> to the managed-schema.xml alongside the other dynamic fields, and <fieldType class="solr.RandomSortField" indexed="true" name="random"/> alongside the other field types.

Set Up a Pre-Trained Cold Start Model for Smart Answers
Two pre-trained models are available:

- qna-coldstart-large - a large model trained on a variety of corpora and tasks.
- qna-coldstart-multilingual - covers 16 languages: Arabic, Chinese-simplified, Chinese-traditional, English, French, German, Italian, Japanese, Korean, Dutch, Polish, Portuguese, Spanish, Thai, Turkish, Russian.

For the Job ID, use deploy-qna-coldstart-multilingual or deploy-qna-coldstart-large.
Use the following values when configuring the deployment job:

Parameter | qna-coldstart-multilingual | qna-coldstart-large |
---|---|---|
Model name | qna-coldstart-multilingual | qna-coldstart-large |
Docker Repository | lucidworks | lucidworks |
Image name | qna-coldstart-multilingual:v1.1 | qna-coldstart-large:v1.1 |
Output columns | [vector] | [vector, compressed_vector] |