Develop and deploy a machine learning model with Ray
To get started, install Ray Serve with `pip install ray[serve]`. You can test your model locally by using `docker run` with a specified port, like 9000, which you can then `curl` to confirm functionality in Fusion.
See the testing example below. To change the memory limit for the deployment workflow, run `kubectl edit configmap argo-deploy-ray-model-workflow -n <namespace>`, then find the `ray-head` container in the escaped YAML and change the memory limit. Exercise caution when editing, because a mistake can break the YAML: delete and replace a single character at a time without changing any formatting.
The `MODEL_DEPLOYMENT` value in the command below can be found with `kubectl get svc -n NAMESPACE`. It will have the same name as the model name set in the Create Ray Model Deployment job.
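The testing example itself was not preserved in this copy. A typical check, assuming the service listens on port 8000 inside the cluster and accepts a JSON body with a `text` field (both assumptions for illustration), looks like this:

```bash
# Forward the model service to a local port (names are placeholders).
kubectl port-forward svc/MODEL_DEPLOYMENT 9000:8000 -n NAMESPACE

# In another terminal, send a test request; the JSON shape is illustrative.
curl -X POST http://localhost:9000/ \
  -H "Content-Type: application/json" \
  -d '{"text": "hello world"}'
```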
This tutorial uses the `e5-small-v2` model from Hugging Face, but any pre-trained model from https://huggingface.co will work with this tutorial. If you want to use your own model instead, you can do so, but your model must have been trained and then saved through a function similar to PyTorch's `torch.save(model, PATH)` function. See Saving and Loading Models in the PyTorch documentation.
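As a minimal sketch of that save pattern (the model here is a stand-in, not the tutorial's):

```python
import torch
import torch.nn as nn

# A stand-in for your trained model; any nn.Module works the same way.
model = nn.Linear(16, 4)

# Save the entire model object, matching the torch.save(model, PATH) pattern.
torch.save(model, "model.pt")

# The deployment can later reload it with torch.load.
reloaded = torch.load("model.pt", weights_only=False)
```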
The structure of the Python file for the `e5-small-v2` model is as follows (a sketch of the full file appears after this list):
- `__call__`: This function is non-negotiable.
- `__init__`: The `__init__` function is where models, tokenizers, vectorizers, and the like should be set to `self` for invoking. It is recommended that you include your model's trained parameters directly in the Docker container rather than reaching out to external storage inside `__init__`.
- `encode`: The `encode` function is where the field or query that is passed to the model from Fusion is processed. Alternatively, you can process it all in the `__call__` function, but it is cleaner not to. The `encode` function can handle any text processing needed for the model to accept input, and it invokes `model.predict()` or an equivalent function that returns the expected model result.

In this example, the file is named `deployment.py` and the class name is `Deployment`.
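The tutorial's `deployment.py` is not reproduced here. The following is a minimal sketch of the structure just described, assuming Ray Serve and the Hugging Face `intfloat/e5-small-v2` checkpoint; in a real image you would point `from_pretrained` at files bundled into the container:

```python
# deployment.py -- a minimal sketch, not the tutorial's exact file.
from typing import Any, Dict

import torch
from ray import serve
from starlette.requests import Request
from transformers import AutoModel, AutoTokenizer


@serve.deployment
class Deployment:
    def __init__(self) -> None:
        # Load the model and tokenizer. In the real image, point
        # from_pretrained at files bundled into the container instead
        # of fetching from Hugging Face at startup.
        self.tokenizer = AutoTokenizer.from_pretrained("intfloat/e5-small-v2")
        self.model = AutoModel.from_pretrained("intfloat/e5-small-v2")
        self.model.eval()

    def encode(self, text: str) -> list:
        # Text processing the model needs before inference.
        inputs = self.tokenizer(text, return_tensors="pt", truncation=True)
        with torch.no_grad():
            outputs = self.model(**inputs)
        # Mean-pool the token embeddings into one vector per input.
        return outputs.last_hidden_state.mean(dim=1).squeeze().tolist()

    async def __call__(self, request: Request) -> Dict[str, Any]:
        # Ray Serve routes each HTTP request here.
        body = await request.json()
        return {"vector": self.encode(body["text"])}


# Top-level deployment handle; matches an import path of deployment:app.
app = Deployment.bind()
```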
The `requirements.txt` file is a list of installs for the `Dockerfile` to run, ensuring the Docker container has the right resources to run the model.
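For orientation, a `Dockerfile` of that shape might look like the following sketch; the base image and entrypoint are assumptions rather than the tutorial's exact file:

```dockerfile
FROM python:3.10-slim

WORKDIR /app

# Install the packages listed in requirements.txt.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the deployment code (and any bundled model files).
COPY deployment.py .

# Start Ray Serve with the deployment's import path (see the job's
# Ray Deployment Import Path parameter below).
CMD ["serve", "run", "deployment:app"]
```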
For the `e5-small-v2` model, the requirements are as follows:
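The original list is not reproduced here; a plausible minimal set, assuming the sketch above and the Ray version from the component table at the end of this page, would be:

```
ray[serve]==2.42.1
torch
transformers
```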
If there is an `import` statement in your Python file, the corresponding package should be included in the requirements file. To populate the requirements, use the following command in the terminal, inside the directory that contains your code:
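The command itself was not preserved in this copy; the standard way to do this is:

```bash
# Write the active environment's packages into requirements.txt.
pip freeze > requirements.txt
```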
Once you have the `MODEL_NAME.py`, `Dockerfile`, and `requirements.txt` files, you need to run a few Docker commands. Run the following commands in order:
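The exact commands are not preserved in this copy; a typical build-and-push sequence, using the example image name from the table below and a placeholder Docker Hub username, is:

```bash
# Build the image from the directory containing the Dockerfile.
docker build -t DOCKERHUB_USERNAME/e5-small-v2-ray:0.1 .

# Push it to the repository referenced in the deployment job.
docker push DOCKERHUB_USERNAME/e5-small-v2-ray:0.1
```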
The Create Ray Model Deployment job takes the following parameters:

Parameter | Description
---|---
Job ID | A string used by the Fusion API to reference the job after its creation.
Model name | A name for the deployed model. This is used to generate the deployment name in Ray. It is also the name that you reference as a `model-id` when making predictions with the ML Service.
Model min replicas | The minimum number of load-balanced replicas of the model to deploy.
Model max replicas | The maximum number of load-balanced replicas of the model to deploy. Specify multiple replicas for a higher-volume intake.
Model CPU limit | The number of CPUs to allocate to a single model replica.
Model memory limit | The maximum amount of memory to allocate to a single model replica.
Ray Deployment Import Path | The path to your top-level Ray Serve deployment (or the same path passed to `serve run`). For example, `deployment:app`.
Docker Repository | The public or private repository where the Docker image is located. If you're using Docker Hub, fill in the Docker Hub username here.
Image name | The name of the image. For example, `e5-small-v2-ray:0.1`.
Kubernetes secret | If you're using a private repository, supply the name of the Kubernetes secret used for access.
Parameter | Description
---|---
Additional parameters | This section lets you enter `parameter name:parameter value` options to be injected into the training JSON map at runtime. The values are inserted as they are entered, so you must surround string values with `"`. This is the `sparkConfig` field in the configuration file.
Write Options | This section lets you enter `parameter name:parameter value` options to use when writing output to Solr or other sources. This is the `writeOptions` field in the configuration file.
Read Options | This section lets you enter `parameter name:parameter value` options to use when reading input from Solr or other sources. This is the `readOptions` field in the configuration file.
The Solr image is now published as `lucidworks/fusion-solr`, aligning with other components and simplifying deployment for external environments.
Fixed an issue where the `started-by` values for datasource jobs in the job history showed `default-subject` instead of the actual user. Managed Fusion now correctly records and displays the initiating user in the job history, restoring accurate audit information for datasource operations.
Managed Fusion now handles both `managed-schema` and `managed-schema.xml` files when reading Solr config sets, ensuring backward compatibility with apps created before the move to template-based config sets. This prevents Schema API failures caused by unhandled exceptions during schema file lookup.
Improved the resilience of the `job-config` service when `job-config` pods briefly lose connection to ZooKeeper.
Fixed rule matching for the `lw.rules.target_segment` parameter, ensuring only matching rules are triggered and improving rule targeting and safety.
Removed spurious warnings about `primary-port-name` labels, which did not impact functionality. This fix reduces unnecessary log noise and improves the clarity of your logs.
In Managed Fusion 5.9.12 through 5.9.13, the `job-config` service may be flagged as down in the UI even when it is running normally. This display issue is fixed in Managed Fusion 5.9.14.
In Managed Fusion versions 5.9.12 through 5.9.13, strict validation in the `job-config` service causes "Collection not found" errors when jobs or V2 datasources target Managed Fusion collections that point to differently named Solr collections. This issue is fixed in Managed Fusion 5.9.14. As a workaround on affected versions, use V1 datasources or avoid using REST call jobs on remapped collections.
In some environments, saving large query pipelines while handling high traffic loads can cause the Query service to crash with OOM errors due to thread contention. Managed Fusion 5.9.14 resolves this issue. If you’re impacted and not yet on this version, contact Lucidworks Support for mitigation options.
If a Web V2 connector job is interrupted, such as by scaling down the connector pod, the system may enter a corrupted state. Even after clearing and recreating the datasource, new jobs may fail with the error `The state should never be null`. This issue is fixed in Fusion 5.9.13.
The `fusion-spark-3.2.2` image in Fusion 5.9.12 may fail to refresh Kubernetes tokens correctly. In Managed Fusion 5.9.12 environments, Spark jobs that rely on token-based authentication can fail due to a Fabric8 client bug in the 3.2.2 Spark image. This may impact the stability or execution of long-running jobs.
The `job-config` service may incorrectly report a `DOWN` status via `/actuator/health` even when running normally. When TLS is enabled and ZooKeeper is unavailable for an extended period, the `job-config` service may resume normal operation but continue to report `DOWN` on the actuator health endpoint, despite readiness and liveness probes reporting `UP`. This issue is fixed in Fusion 5.9.13.
Managed Fusion 5.9.12 may fail to index with the Web V2 connector (v2.0.1) due to a corrupted job state in the `connectors-backend` service. Affected jobs log the error `The state should never be null`, and common remediation steps like deleting the datasource or reinstalling the connector plugin may not resolve the issue. The issue is fixed in Managed Fusion 5.9.13.
In some Managed Fusion 5.9.12 environments, clicking Save when adding a schedule from the datasource “Run” dialog does not persist the schedule or show an error message, particularly in apps created before the upgrade. As a workaround, use a new app or manually verify that the job configuration was saved. This issue is fixed in Managed Fusion 5.9.13.
Removed MLeap from the `ml-model` service. MLeap was deprecated in Managed Fusion 5.2.0 and is no longer used by Managed Fusion.
Component | Version |
---|---|
Solr | fusion-solr 5.9.12 (based on Solr 9.6.1) |
ZooKeeper | 3.9.1 |
Spark | 3.4.1 |
Ingress Controllers | Nginx, Ambassador (Envoy), GKE Ingress Controller |
Ray | ray[serve] 2.42.1 |