Seldon
Seldon is an open-source platform for deploying, scaling, and monitoring machine learning (ML) models in Kubernetes environments. It is integrated into Fusion to provide model serving for inference workflows. For more information, see the Seldon Core documentation.
Seldon models and endpoints
Fusion Seldon models are trained on text of different lengths, such as phrases and short and long sentences, and on varied collections covering B2B, B2C, and knowledge management entities.
Each model is exposed through a unique endpoint and can be used in a pipeline stage or an external application. Endpoints can be secured with Istio or by other methods through the Fusion gateway.
Supported models cover general-purpose sentiment prediction, biomedical text collections, and large corpora in multiple languages. Depending on the model, Seldon supports input formats such as ndarray, tensor, and json. Outputs are returned as structured JSON, typically wrapped in metadata that supports monitoring and logging.
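As an illustration of these input formats, the following minimal Python sketch builds request bodies in the shape used by the Seldon Core v1 REST protocol and separates the metadata from the payload in a response. It assumes that protocol's data, tensor, jsonData, and meta field names; the helper function names are hypothetical, and the exact fields a given Fusion model accepts depend on how that model was deployed.

```python
import json
from typing import Any, Dict, List, Optional, Tuple


def ndarray_payload(rows: List[list], names: Optional[List[str]] = None) -> Dict[str, Any]:
    """Build an ndarray-style request body (nested lists of values)."""
    data: Dict[str, Any] = {"ndarray": rows}
    if names:
        data["names"] = names  # optional column names
    return {"data": data}


def tensor_payload(shape: List[int], values: List[float]) -> Dict[str, Any]:
    """Build a tensor-style request body (flat values plus an explicit shape)."""
    expected = 1
    for dim in shape:
        expected *= dim
    if expected != len(values):
        raise ValueError(f"shape {shape} needs {expected} values, got {len(values)}")
    return {"data": {"tensor": {"shape": shape, "values": values}}}


def json_payload(obj: Any) -> Dict[str, Any]:
    """Build a free-form JSON request body (assumed here to use the jsonData field)."""
    return {"jsonData": obj}


def split_response(body: str) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split a JSON response into its metadata wrapper and the remaining payload."""
    parsed = json.loads(body)
    meta = parsed.get("meta", {})
    payload = {key: value for key, value in parsed.items() if key != "meta"}
    return meta, payload
```

For example, ndarray_payload([["great product"]]) produces a body with a single text row, and split_response lets monitoring or logging code read the metadata without touching the prediction values.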
Seldon configuration and management
The following topics provide more detailed information about configuring and managing Seldon deployments in Fusion:
- Create Seldon Core Model deployment in a Fusion cluster
- Configure Ray/Seldon vector search in Neural Hybrid Search
- Ray/Seldon Vectorize Query stage to generate a vector based on a query string
- Ray/Seldon Vectorize Field stage to invoke a machine learning model that encodes a string field to a vector representation
Update a Seldon model deployment
Seldon does not support in-place updates to certain deployment properties, such as the protocol or runtime class. To change these properties:
- Deploy a new version of the model using a different deployment name.
- Execute test requests to validate the new endpoint.
- Update the Fusion pipeline with the new endpoint information.
- Delete the previous Seldon deployment when the migration is complete.
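The ordering of the steps above matters: the previous deployment should be removed only after the new endpoint has been validated and the pipeline has been switched over. The following Python sketch shows one way to enforce that ordering. Every callable passed in (deploy_new, predict_new, update_pipeline, delete_old) is a hypothetical placeholder for whatever tooling you use to perform that step; none of them is a Fusion or Seldon API, and the response check assumes a Seldon-style body with a data or jsonData field.

```python
from typing import Callable, Dict, Iterable, List


def validate_endpoint(predict: Callable[[Dict], Dict], samples: Iterable[Dict]) -> List[str]:
    """Send each sample through predict and return a list of problems (empty means OK)."""
    problems: List[str] = []
    for index, sample in enumerate(samples):
        try:
            response = predict(sample)
        except Exception as exc:  # any failure counts against the new endpoint
            problems.append(f"sample {index}: request failed: {exc}")
            continue
        # Assumption: a usable response is a dict with a data or jsonData field.
        if not isinstance(response, dict) or not ("data" in response or "jsonData" in response):
            problems.append(f"sample {index}: response has no data payload")
    return problems


def migrate(
    deploy_new: Callable[[], None],
    predict_new: Callable[[Dict], Dict],
    samples: Iterable[Dict],
    update_pipeline: Callable[[], None],
    delete_old: Callable[[], None],
) -> bool:
    """Run the four migration steps in order; stop before switching if validation fails."""
    deploy_new()                                   # 1. deploy under a new name
    problems = validate_endpoint(predict_new, samples)  # 2. test requests
    if problems:
        return False                               # keep the old deployment serving traffic
    update_pipeline()                              # 3. point the pipeline at the new endpoint
    delete_old()                                   # 4. remove the previous deployment
    return True
```

If validation fails, the function returns False without touching the pipeline or the old deployment, so the existing endpoint continues to serve requests while the new one is investigated.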