This feature is currently only available to clients who have contracted with Lucidworks for features related to Neural Hybrid Search and Lucidworks AI.
The Lucidworks AI Model Usage screen provides information about model usage for your site. The usage chart displays data by use case and model, and includes information such as:
  • The number of search requests consumed during a specific time period. Request types include synchronous indexing and query.
  • The distribution of requests so you can determine if requests are evenly distributed or if there are spikes in usage.
  • The models being used, such as e5-small-v2.
To access the AI Model Usage screen, navigate to the megamenu and click Models > AI Model Usage.
AI Model Usage
Models in use
Here is an overview of the selections available in the Lucidworks AI Model Usage screen.

Configuration options

The usage chart displays information based on the fields you select on the screen. If you do not select any specific field, the chart displays values for all of the facets: model providers, request types, and use cases. For example, if you select a specific model provider, the chart displays values for that provider only. If you select queries, the chart refreshes and displays only query values, not synchronous indexing or other values.

Request options

If you select a specific request option, the usage chart displays the values for that option.
Request options
The options are:
  • Total Request Tokens. The total number of individual units of information (tokens) submitted to the model. Models tokenize input differently, so the content of the individual unit that is counted is model-specific; for example, the unit may be a word or a phrase. The total number of request tokens also depends on other fields such as duration and facets (model providers, request types, and use cases). For example, you may want to view the total tokens submitted for the past month for the RAG use case. The value displays in the Total Gen AI Tokens In field on the usage chart.
  • Total Requests. The total number of requests submitted. The values are also based on other fields such as duration and facets (model providers, request types, and use cases). For example, you may want to view the total requests for the past month for the RAG use case. The value displays in the Total Gen AI Requests field on the usage chart. If you select the embedding use case, the value displays in the Total Embedding Requests field.
  • Total Response Tokens. The total number of individual units of information (tokens) generated by the model. Models tokenize output differently, so the content of the individual unit that is counted is model-specific; for example, the unit may be a word or a phrase. The total number of response tokens also depends on other fields such as duration and facets (model providers, request types, and use cases). For example, you may want to view the total tokens returned for the past week for the RAG use case. The value displays in the Total Gen AI Tokens Out field on the usage chart.
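The relationship between these three request options can be sketched as a simple aggregation over usage records. This is an illustrative sketch only; the record fields, model names, and sample numbers below are hypothetical and do not reflect an actual Lucidworks API or schema:

```python
from dataclasses import dataclass

@dataclass
class UsageRecord:
    # Hypothetical fields; the actual Lucidworks usage schema may differ.
    model: str
    use_case: str          # e.g. "rag" or "embedding"
    request_tokens: int    # tokens submitted with the request
    response_tokens: int   # tokens generated by the model

records = [
    UsageRecord("gpt-4o", "rag", 120, 450),
    UsageRecord("gpt-4o", "rag", 80, 300),
    UsageRecord("e5-small-v2", "embedding", 40, 0),
]

# Filter to one use case (a facet), then compute each request option.
rag = [r for r in records if r.use_case == "rag"]
total_requests = len(rag)                               # Total Requests
total_tokens_in = sum(r.request_tokens for r in rag)    # Total Request Tokens
total_tokens_out = sum(r.response_tokens for r in rag)  # Total Response Tokens
print(total_requests, total_tokens_in, total_tokens_out)  # → 2 200 750
```

Narrowing the duration option works the same way: it simply filters which records enter the aggregation before the totals are computed.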

Duration options

Duration options include timeframe selections for today, last 24 hours, last week, last month, last 90 days, last quarter, last year, and a custom range. When a range is selected, the beginning and end dates and times display on the screen. For example, you may want to select the total number of tokens returned in responses during a sales campaign the week after Thanksgiving so you can adjust the campaign for the remaining days before Christmas. The value for token responses displays in the Total Gen AI Tokens Out field on the usage chart.
Duration date selections

Facets

You can select the checkbox for one or more of the facets to display only that information on the usage chart. If none are selected, the information for all of the facets displays.
  • Model Providers. The entity that provides the model. For example, Lucidworks provides supported pre-trained embedding models.
  • Request Types. The kind of request submitted. For example, synchronous indexing or query.
  • Use Case. The use case used when the request was submitted. For example, embedding or a Gen AI use case such as RAG.

Calculated totals

The usage chart displays the following totals that reflect the values of the selections on the screen, such as duration (timeframe), model provider, request type, and use cases.
  • Total Embedding Requests. The total number of embedding requests submitted for the selected type, duration, and facets. For example, if you select a use case other than the embedding use case, this value is zero.
  • Total Gen AI Requests. The total number of Gen AI use case requests submitted for the selected type, duration, and facets. For example, if you select a Gen AI use case, a value displays. If you select the embedding use case, the value is zero.
  • Total Gen AI Tokens In. The total number of Gen AI use case tokens submitted in requests for the selected type, duration, and facets. For example, if you select a Gen AI use case, a value displays. If you select the embedding use case, the value is zero.
  • Total Gen AI Tokens Out. The total number of Gen AI use case tokens returned in responses for the selected type, duration, and facets. For example, if you select a Gen AI use case, a value displays. If you select the embedding use case, the value is zero.
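The split between the embedding total and the Gen AI totals above can be illustrated with a small sketch. The record shape, the grouping of use cases, and the sample numbers are hypothetical, not a Lucidworks API:

```python
# Hypothetical sample data: (use_case, tokens_in, tokens_out)
records = [
    ("rag", 120, 450),
    ("rag", 80, 300),
    ("embedding", 40, 0),
    ("embedding", 55, 0),
]

GEN_AI_USE_CASES = {"rag", "summarization"}  # assumed grouping for illustration

totals = {
    "Total Embedding Requests": 0,
    "Total Gen AI Requests": 0,
    "Total Gen AI Tokens In": 0,
    "Total Gen AI Tokens Out": 0,
}
for use_case, tokens_in, tokens_out in records:
    if use_case == "embedding":
        totals["Total Embedding Requests"] += 1
    elif use_case in GEN_AI_USE_CASES:
        totals["Total Gen AI Requests"] += 1
        totals["Total Gen AI Tokens In"] += tokens_in
        totals["Total Gen AI Tokens Out"] += tokens_out

print(totals)
```

This also shows why selecting only the embedding use case leaves the Gen AI totals at zero: no records match the Gen AI branch of the aggregation.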

Usage chart graph

The graph is an x-y plot that displays the request type for the selected timeframe.
  • Vertical axis. The vertical (y) axis is the option selected in the request option field, which can display total request tokens, total requests, or total response tokens.
  • Horizontal axis. The horizontal (x) axis displays the timeframe selected in the duration option field. For example, if you select the past year, the horizontal axis displays each month of the past year.

Model selection

The models submitted in requests during the selected duration are listed under the horizontal axis, and each defines the color for its use case. All of the models display by default. Click a model to gray it out and remove it from the chart; click it again to restore it. To view a single model, gray out all of the other models so only that model displays. For example, you may want to view the total response tokens (request option) for the past month (duration option) and then display only the llama-3v2-3b-instruct model selected under the chart.
Model selection example

Model Details

This section lists all of the Gen AI and shared models included in requests for the selected request option and duration option. If you select a different request option or duration option (for example, total requests instead of total response tokens, or the past month instead of the past year), the values for the new selections display. If a model was not included in a request during the duration period, it does not display in the list. For each Gen AI and shared model, the following values display:
  • Model Name. The name of the Gen AI or shared model. For example, gpt-4o.
  • Total Requests. The total number of requests submitted for that model for the duration (timeframe) selected.
  • Total Tokens In. Also referred to as total request tokens, this is the total number of individual units of information, such as words or phrases, submitted in requests for the model for the duration (timeframe) selected.
  • Total Tokens Out. Also referred to as total response tokens, this is the total number of individual units of information, such as words or phrases, returned in response from requests for the model for the duration (timeframe) selected.
  • Last Requested On. The date and time the model was last submitted in a request for the duration (timeframe) selected.
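The per-model rows above amount to grouping usage records by model name and tracking the most recent request. A minimal sketch, using hypothetical record fields and sample data rather than an actual Lucidworks API:

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical usage records: (model, tokens_in, tokens_out, timestamp)
records = [
    ("gpt-4o", 120, 450, datetime(2024, 5, 1, 9, 30)),
    ("gpt-4o", 80, 300, datetime(2024, 5, 3, 14, 0)),
    ("llama-3v2-3b-instruct", 60, 200, datetime(2024, 5, 2, 11, 15)),
]

# One row per model: request count, token totals, and last request time.
details = defaultdict(lambda: {"requests": 0, "in": 0, "out": 0, "last": None})
for model, t_in, t_out, ts in records:
    row = details[model]
    row["requests"] += 1
    row["in"] += t_in      # Total Tokens In
    row["out"] += t_out    # Total Tokens Out
    row["last"] = max(row["last"], ts) if row["last"] else ts  # Last Requested On

for model, row in details.items():
    print(model, row["requests"], row["in"], row["out"], row["last"])
```

A model with no records in the selected duration never gets a row, which matches the behavior described above: models not included in any request during the period do not display in the list.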