Generative AI service
Lucidworks AI provides a generative AI (GenAI) service that lets you access and use large language models (LLM) for a variety of GenAI use cases.
The GenAI models use one of Lucidworks' predefined use cases, described in the LWAI Prediction API and the Lucidworks AI Async Prediction API.
The APIs specify configuration for:

- Use cases, which are set in the `useCaseConfig` parameter.
- Models, which are set in the `modelConfig` parameter.
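For illustration only, here is a minimal sketch of how these two parameter groups might appear in a request body from a Python client. The host name, input shape, and the specific options shown are hypothetical placeholders, not the documented API schema:

```python
import requests

# Hypothetical host; USE_CASE and MODEL_NAME follow the generic
# Prediction API paths described later on this page.
url = "https://LUCIDWORKS_AI_HOST/ai/prediction/USE_CASE/MODEL_NAME"

payload = {
    "batch": [{"text": "Example input text"}],    # hypothetical input shape
    "useCaseConfig": {"useCaseOption": "value"},  # hypothetical use-case option
    "modelConfig": {"modelOption": "value"},      # hypothetical model option
}

response = requests.post(url, json=payload, timeout=30)
print(response.json())
```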
For information about the stages that integrate Fusion with Lucidworks AI, see the Lucidworks AI stage documentation.
Lucidworks AI is built with Meta Llama 3.
Generative AI models
Lucidworks-hosted GenAI models
All of the models currently hosted by Lucidworks are open source.
In models hosted by Lucidworks, the data is:

- Contained within Lucidworks and never exposed to third parties.
- Passed to the specified model to generate responses to the instance's requests, but not retained to train the model or used by it after the initial request.
For information about model training, biases, and safety features for those models, refer to the documentation provided by the model creators. Hugging Face is one information source, and links to information about Lucidworks-hosted models are provided in Available Lucidworks-hosted models.
Available Lucidworks-hosted models
To display the Hugging Face information for each model, click the model name.

- `nu-zero-ner`. This model only supports the NER use case.
OpenAI models
An API key is required in each OpenAI model request. There is no default key.
Available OpenAI models
The supported OpenAI models are:
- `gpt-4`
- `gpt-4o`
- `gpt-4o-2024-05-13`
- `gpt-4-0613`
- `gpt-4-turbo`
- `gpt-4-turbo-2024-04-09`
- `gpt-4-turbo-preview`
- `gpt-4-1106-preview`
- `gpt-3.5-turbo`
- `gpt-3.5-turbo-1106`
- `gpt-3.5-turbo-0125`
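Because the key must be supplied on every request, a client attaches it per call. A minimal sketch, assuming the key rides in the `modelConfig` `apiKey` field (the field documented for Azure OpenAI below); the host and body shape are hypothetical placeholders:

```python
import requests

# gpt-4o is one of the supported models listed above; host is a placeholder.
url = "https://LUCIDWORKS_AI_HOST/ai/prediction/USE_CASE/gpt-4o"

payload = {
    "batch": [{"text": "Example input text"}],  # hypothetical input shape
    # Assumption: the OpenAI key is passed per request in modelConfig,
    # mirroring the "modelConfig": "apiKey" field documented for Azure
    # OpenAI later on this page.
    "modelConfig": {"apiKey": "YOUR_OPENAI_API_KEY"},
}

print(requests.post(url, json=payload, timeout=30).json())
```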
Azure OpenAI models
Deployed Azure OpenAI models are supported in the LWAI Prediction API and the Lucidworks AI Async Prediction API in the following use cases:
- Pass-through
- Retrieval Augmented Generation (RAG)
- Standalone query rewriter
- Summarization
- Keyword extraction
- Named Entity Recognition (NER)
Prerequisites
The requirements to use Azure OpenAI models on Lucidworks AI include:
- A valid Azure subscription on Microsoft Azure.
- Deployed Azure OpenAI models you want to use. Lucidworks does not support Azure AI Studio.
- The Azure Deployment Name for the model you want to use. Use this as the value of the Lucidworks AI API `"modelConfig": "azureDeployment"` field.
- The Azure Key1 or Key2 for the model you want to use. Use either as the value of the Lucidworks AI API `"modelConfig": "apiKey"` field.
- The Azure Endpoint for the model you want to use. Use this as the value of the Lucidworks AI API `"modelConfig": "azureEndpoint"` field.
- The Lucidworks AI API value of `MODEL_ID` for Azure OpenAI is `azure-openai`.
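Putting the prerequisites together, a request might look like the following sketch. The deployment name, key, and endpoint values are placeholders, and the overall body shape is illustrative rather than the exact API schema:

```python
import requests

# MODEL_ID for Azure OpenAI is azure-openai, per the prerequisites above;
# the host and use case are placeholders.
url = "https://LUCIDWORKS_AI_HOST/ai/prediction/USE_CASE/azure-openai"

payload = {
    "batch": [{"text": "Example input text"}],  # hypothetical input shape
    "modelConfig": {
        "azureDeployment": "my-gpt4-deployment",  # Azure Deployment Name
        "apiKey": "AZURE_KEY1_OR_KEY2",           # Azure Key1 or Key2
        "azureEndpoint": "https://my-resource.openai.azure.com/",  # Azure Endpoint
    },
}

print(requests.post(url, json=payload, timeout=30).json())
```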
Google Vertex AI models
Each request requires an `apiKey`, `googleProjectId`, and `googleRegion`. There are no defaults for any of these fields.

To generate the key, get a service account key with access to Google Vertex AI and then base64-encode the key. For more information, see Create and delete service account keys.
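A minimal sketch of the encoding step: the service account key file is base64-encoded and supplied alongside the other two fields. The project ID and region values are placeholders:

```python
import base64

# Base64-encode the Google service account key (a JSON file) so it can be
# sent as the apiKey value.
with open("service-account-key.json", "rb") as f:
    encoded_key = base64.b64encode(f.read()).decode("utf-8")

model_config = {
    "apiKey": encoded_key,
    "googleProjectId": "my-gcp-project",  # placeholder project ID
    "googleRegion": "us-central1",        # placeholder region
}
```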
Available Google Vertex AI models
The supported Google Vertex AI model is `gemini-pro`.
Generative AI use cases
The GenAI use cases run predictions from pre-trained models. The Prediction API also contains the embedding use case, which is not categorized as a GenAI use case.
For more information about models, see:
- Pre-trained models for the LWAI Prediction API.
- Custom models for either the LWAI Prediction API or the Lucidworks AI Async Prediction API.
The generic path for the Prediction API is `/ai/prediction/USE_CASE/MODEL_NAME`.

The generic path for the Async Prediction API is `/ai/async-prediction/USE_CASE/MODEL_NAME`.
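For example, using the `ner` use case named in the list below with the Lucidworks-hosted `nu-zero-ner` model noted above, the two paths differ only in their prefix:

```python
use_case = "ner"            # use-case identifier from the list below
model_name = "nu-zero-ner"  # Lucidworks-hosted NER model noted above

sync_path = f"/ai/prediction/{use_case}/{model_name}"
async_path = f"/ai/async-prediction/{use_case}/{model_name}"
```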
The GenAI use cases based on the generic path are as follows:
- Pass-through use case lets you use the Generative AI services as a proxy to the large language model (LLM). Use this use case when you want full control over the prompt sent to the GenAI model.
- Retrieval augmented generation (RAG) use case inserts candidate documents into the LLM's context to ground the generated response in those documents and reduce the frequency of LLM hallucinations.
- Standalone query rewriter use case rewrites the text in relation to information associated with the `memoryUuid`. This use case can be invoked during the RAG use case.
- Summarization use case, where the LLM ingests text and returns a summary of that text as a response.
- Keyword extraction use case, where the LLM ingests text and returns a JSON response that lists keywords extracted from that text.
- Named Entity Recognition (`ner`) use case, where the LLM ingests text and the entities to extract, and returns a JSON response containing a list of entities extracted from the text. A request sketch follows this list.
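As an illustration of the NER use case, the sketch below sends text plus the entity types to extract. The host is a placeholder, and the field names for passing the entities are assumptions rather than the documented schema:

```python
import requests

# Path uses the ner use case and the nu-zero-ner model mentioned above.
url = "https://LUCIDWORKS_AI_HOST/ai/prediction/ner/nu-zero-ner"

payload = {
    # Assumption: input text and the entity types to extract are passed in
    # the request body; these field names are illustrative only.
    "batch": [{"text": "Jane Doe joined Acme Corp in Boston in 2021."}],
    "useCaseConfig": {"entities": ["person", "organization", "location"]},
}

response = requests.post(url, json=payload, timeout=30)
print(response.json())  # JSON listing the entities extracted from the text
```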
Anthropic models
An API key is required in each Anthropic model request. There is no default key.
Available Anthropic models
The supported Anthropic models are:
-
claude-3-5-sonnet-20241022
-
claude-3-5-haiku-20241022
Most of the typical parameters are supported. However, tool use and streaming are not currently supported.
To set a system prompt in the pass-through use case, use the JSON prompt `dataType` example. This method automatically passes the prompt correctly.

To set multiple system prompts, you must combine all of the values into a single text string, as shown in the sketch below. If the values are not combined into a single string, only the last system prompt is sent in the query. This applies to all use cases.
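A minimal sketch of combining several system prompt values into the single string the service expects:

```python
# If multiple system prompts are sent separately, only the last one reaches
# the model, so join them into one text string first.
system_prompts = [
    "You are a helpful product assistant.",
    "Answer in one short paragraph.",
]
combined_system_prompt = "\n".join(system_prompts)
```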