Custom model training user interface
This feature is currently only available to clients who have contracted with Lucidworks for features related to Neural Hybrid Search and Lucidworks AI.
The Lucidworks AI custom model training user interface lets you train and deploy custom models, and provides information about the custom models deployed on your site.
For technical information about custom embedding models, see Custom embedding model training.
The other embedding models you can use are the pre-trained embedding models that Lucidworks deploys for every organization. For more information, see Lucidworks AI Pre-trained embedding models.
When you click the Custom Models tab, the Lucidworks AI Custom Models screen displays a list of deployed custom models.
The following table describes the information for each model.
Field | Description |
---|---|
Name | The name of the model. |
Type | For custom models, it is the name of the model. |
Training status | Also referred to as |
Vector size | The number of elements and objects in the custom model. |
Deployed regions | The geographic region specified when the custom model is deployed. |
Training started | The date and time the training started. |
Training completed | The date and time the training completed. |
For information about how to use the Custom model training user interface to view, train, and manage custom models, see Manage Lucidworks AI custom models.
Model Details screen
If you click a model from the list, the Model Details screen displays.
You can view Training details that include Metadata and Summary information about the selected model.
You can also:
- Click Download Model Data to download the JSON file for the model. You can use the parameters from this model in a different model without rekeying the information.
- Click Delete if the model can be deleted. If the model cannot be deleted because it is associated with deployments, all the deployments must be deleted first. For example, if the model is associated with two deployments, "2 Active Deployments: Deleting Disabled" displays instead of the Delete button. This example indicates there are two current deployments for the model, and based on the status of those deployments, the option to delete the model is disabled.
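The delete rule described above can be sketched as a small function. This is only an illustration of the documented behavior, not the product's code: the function name `delete_control_label` is hypothetical, and the wording shown for a single deployment is an assumption because only the two-deployment label appears in this page.

```python
def delete_control_label(active_deployments: int) -> str:
    """Return the text shown where the Delete button normally appears.

    Hypothetical helper: mirrors the rule that a model can be deleted
    only after all of its deployments have been deleted.
    """
    if active_deployments < 0:
        raise ValueError("active_deployments cannot be negative")
    if active_deployments == 0:
        # No deployments reference the model, so deletion is allowed.
        return "Delete"
    # Assumption: the same label pattern is used for any non-zero count.
    return f"{active_deployments} Active Deployments: Deleting Disabled"


print(delete_control_label(2))
print(delete_control_label(0))
```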
For more information, see:
Training Details
The Training Details screen displays metadata and summary information.
Metadata
This metadata provides the data supplied when the model was created.
Field | Description |
---|---|
id | The unique identifier for the model. For custom models, the value is the universally unique identifier (UUID) that is the primary key for the model. |
Author | Also referred to as |
ModelType | The model type. For custom models, it is the name of the model. |
Region | The geographic region specified when the model was trained. |
State | Also referred to as |
Vector size | The number of elements and objects in the custom model. Default value is 256. |
Training started | The date and time the training started. |
Training completed | The date and time the training completed. |
Training Data Catalog | The location of the catalog of the training data in Google Cloud Storage (GCS). |
Training Data Signals | The location of signals in the training data in Google Cloud Storage (GCS). |
Error Message | The errors generated when the training failed. This field only displays for custom models with a TRAINING_FAILED status. |
dataset_config | The options for the dataset format used for training are: |
trainer_config | The options for the trainer type used for training are: |
trainer_config/text_processor_config | This field determines which type of tokenization and embedding is used as the base for the recurrent neural network (RNN) model. This field only displays for custom models with a TRAINING_FAILED status. |
trainer_config.encoder_config.rnn_names_list | This field determines which bi-directional recurrent neural network (RNN) layers are used. Options include |
trainer_config.num_epochs | The number of epochs the training data must complete. An epoch is a full cycle where training data passes through the designated algorithms. During one epoch, the model processes all the training data examples (queries and index documents) at least one time. This field only applies to models that have successfully been trained. |
Additional config parameters | Lists any additional fields used to train the model using the Manual Entry method. These are custom parameters. |
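Field names such as trainer_config.num_epochs read like dotted paths into a nested configuration. Assuming the JSON file from Download Model Data is nested in that way (this page does not confirm the file layout), a lookup helper could be sketched as follows. The helper name `get_by_path`, the sample structure, and every sample value are hypothetical.

```python
import json
from typing import Any


def get_by_path(config: dict, dotted_path: str, default: Any = None) -> Any:
    """Walk a nested dict using a dotted path such as 'a.b.c'."""
    node: Any = config
    for key in dotted_path.split("."):
        if not isinstance(node, dict) or key not in node:
            return default  # path does not exist in this config
        node = node[key]
    return node


# Hypothetical sample only; a real downloaded file may look different.
# The layer names are placeholders, not documented option values.
sample = json.loads("""
{
  "trainer_config": {
    "num_epochs": 8,
    "encoder_config": {"rnn_names_list": ["layer_a", "layer_b"]}
  }
}
""")

print(get_by_path(sample, "trainer_config.num_epochs"))
print(get_by_path(sample, "trainer_config.encoder_config.rnn_names_list"))
```

Returning a default for a missing path, instead of raising, keeps the helper usable across models that were trained with different sets of custom parameters.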
An example of an error message when the training fails is:
Summary
The summary provides information about training metrics for the selected model.
Field | Description |
---|---|
Best Epoch | The number of the epoch where the most relevant results were returned. |
Index Size | The number of bytes in the vector index. |
Vector size | The number of elements and objects in the custom model. |
Training Time | The number of seconds to successfully train the model. |
Num Trn Queries | The number of unique training queries used in this model. |
Num Val Queries | The number of unique validation queries used in this model. |
Num Unique Training Pairs | The number of unique training pairs for this model. An example of a training pair is |
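As a rough illustration of the Best Epoch field, the sketch below picks the epoch with the highest score from a set of per-epoch values. It assumes that "most relevant results" means the highest value of a monitored metric on the validation queries; the function name `best_epoch` and the scores are hypothetical, and the tie-breaking rule is an assumption.

```python
from typing import Dict


def best_epoch(metric_by_epoch: Dict[int, float]) -> int:
    """Return the epoch number with the highest metric value.

    Assumption: on ties, the earliest epoch wins.
    """
    if not metric_by_epoch:
        raise ValueError("no epochs recorded")
    # Iterating epochs in ascending order makes max() keep the earliest tie.
    return max(sorted(metric_by_epoch), key=lambda epoch: metric_by_epoch[epoch])


# Hypothetical validation scores for four epochs.
scores = {1: 0.41, 2: 0.55, 3: 0.61, 4: 0.58}
print(best_epoch(scores))
```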
Metrics
Metrics about the trained model give insights that help you determine if parameters need to be changed or if more data is needed to improve the model for optimal results.
The Custom configuration parameter that specifies metrics is dataset_config.monitor_metric. When you select one of the values, the k designates the numbers 1, 3, 5, and 10.
- hit@k measures the probability that the correct item is among the top k model predictions.
- map@k is the mean average precision metric that evaluates how well the system returns relevant items in the top k results and positions more relevant items at the top.
- mrr@k is the mean reciprocal rank that determines how quickly the system displays the first relevant item in the top k results.
- ndcg@k is the normalized discounted cumulative gain metric that compares rankings to the optimal order where all relevant items display at the top.
- recall@k displays the number of relevant items returned in the top k recommendations out of the number of relevant items in the dataset.
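The metric descriptions above can be made concrete with a short sketch. It uses common textbook definitions with binary relevance, which may differ in detail from how Lucidworks AI computes the values (for example, the normalization of average precision varies between implementations). All function names and the sample data are hypothetical, and each ranked list is assumed to contain no duplicates.

```python
import math
from typing import Callable, List, Sequence, Set, Tuple

K_VALUES = (1, 3, 5, 10)  # the k values listed above


def hit_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """1.0 if at least one relevant item is in the top k, else 0.0."""
    return 1.0 if any(item in relevant for item in ranked[:k]) else 0.0


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Share of all relevant items that appear in the top k."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


def mrr_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Reciprocal rank of the first relevant item in the top k, or 0.0."""
    for rank, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def ap_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Average precision in the top k (assumed normalizer: min(|relevant|, k))."""
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    denom = min(len(relevant), k)
    return total / denom if denom else 0.0


def ndcg_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """DCG of the ranking divided by the DCG of the ideal ordering."""
    dcg = sum(1.0 / math.log2(rank + 1)
              for rank, item in enumerate(ranked[:k], start=1)
              if item in relevant)
    ideal = sum(1.0 / math.log2(rank + 1)
                for rank in range(1, min(len(relevant), k) + 1))
    return dcg / ideal if ideal else 0.0


Metric = Callable[[Sequence[str], Set[str], int], float]


def mean_over_queries(metric: Metric,
                      results: List[Tuple[Sequence[str], Set[str]]],
                      k: int) -> float:
    """Average a per-query metric over (ranked, relevant) pairs."""
    if not results:
        return 0.0
    return sum(metric(r, rel, k) for r, rel in results) / len(results)


# Hypothetical single query: model ranking (best first) and ground truth.
ranked = ["d3", "d1", "d7", "d2", "d9"]
relevant = {"d1", "d2"}
for k in K_VALUES:
    print(k, hit_at_k(ranked, relevant, k), recall_at_k(ranked, relevant, k),
          mrr_at_k(ranked, relevant, k))
```

Averaging `ap_at_k` or `mrr_at_k` with `mean_over_queries` gives the "mean" in map@k and mrr@k. The per-query functions are kept separate so each definition can be checked on its own.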
The metrics section displays graphs and lets you select the value to view:
The following is an example when the hit@k value is selected. Four graphs display on the screen: hit@1 (pictured), hit@3, hit@5, and hit@10.
Deployments screen
The deployments screen displays information about each time you deployed the selected model.
Deployment history
There is a separate section for each unique deployment of the model, specified by the id.
Field | Description |
---|---|
id | The unique identifier for the model. For custom models, the value is the universally unique identifier (UUID) that is the primary key for the model. |
Region | The geographic region specified when the model was deployed. |
State | The current status of the custom model deployment. Value options include: |
Deployed At | The date and time the deployment occurred. |
Last Used | The last date and time this model was used in the |
Minimum Replicas | The minimum value of replicas for the model. |
Maximum Replicas | The maximum value of replicas for the model. |
parameter_1 | The value of the first parameter passed in the |
parameter_2 | The value of the second parameter passed in the |
You can also:
- Click + New Deployment to deploy the model again.
- Click the Trash icon to delete a deployment with a status of "Deployed". You cannot delete a deployment with a status of "Deploying".
For more information, see: