The LWAI Prediction API is used to send synchronous API calls that run predictions from pre-trained models or custom models. The LWAI Prediction API supports models hosted by Lucidworks and specific third-party models. The Lucidworks AI Use Case API returns a list of all supported models. For more information about supported models, see Generative AI models. You can enter the values returned by the Lucidworks AI Use Case API in the USE_CASE and MODEL_ID fields of /prediction use case requests. The generic path for the Prediction API is /ai/prediction/USE_CASE/MODEL_ID.
For detailed API specifications in Swagger/OpenAPI format, see Platform APIs.
Some parameters in the /ai/prediction/USE_CASE/MODEL_ID request are common to all of the generative AI (Gen-AI) use cases, such as the modelConfig parameter.
Also referred to as hyperparameters, these fields set certain controls on the response.
Refer to the API spec for more information.
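For example, a modelConfig object can constrain sampling randomness and response length. The fragment below uses only the two fields that appear in the request examples in this topic (temperature and maxTokens); refer to the API spec for the full list of supported fields:

```json
{
  "modelConfig": {
    "temperature": 0.3,
    "maxTokens": 50
  }
}
```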
The /ai/prediction/USE_CASE/MODEL_ID request returns predictions for pre-trained or custom models in the specified use case format for the modelId in the request.
Unique fields and values in the request are described in each use case.
The Lucidworks AI (LWAI) Prediction API supports streaming responses. Streaming enables clients to receive model outputs incrementally as they are generated, improving responsiveness and interactivity for applications such as chat interfaces or live content generation.

When using the synchronous LWAI Prediction API, you can enable streaming by including the appropriate request header. This allows the model’s output to be sent as a stream of Server-Sent Events (SSE) instead of a single, complete JSON payload.
Fusion does not currently support streaming responses. When calling LWAI through Fusion services or pipelines, responses are returned only after the full prediction is generated, even if the underlying model supports streaming. Streaming is only available when you call the LWAI Prediction API directly.
Some VPNs may interfere with streaming by buffering all of the events internally and sending them together as a single complete response, rather than passing each event through as it arrives.
To enable streaming, include the following header in your request: Accept: text/event-stream.

Example request:
curl --request POST \
  --url 'https://APPLICATION_ID.applications.lucidworks.com/ai/prediction/passthrough/MODEL_ID' \
  --header 'Accept: text/event-stream' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer ACCESS_TOKEN' \
  --data '{
    "batch": [
      {
        "text": "Without explanation, please give me an example pangram. PANGRAM:"
      }
    ],
    "useCaseConfig": {
      "dataType": "text"
    },
    "modelConfig": {
      "temperature": 0.3,
      "maxTokens": 50
    }
  }'
Example streaming response: Each line of the response stream begins with data: followed by a JSON payload. Each payload carries a small snippet of the generated output and a type field of either response.text_delta or response.completed.
When the model completes, a final response.completed event is sent:
data:{"requestId":"...","predictions":[{"tokensUsed":{"promptTokens":65,"completionTokens":11,"totalTokens":76},"response":"\"The quick brown fox jumps over the lazy dog.\""}],"type":"response.completed"}
Full Response Stream:
data:{"requestId":"f0a156a7-4a65-9bcb-9617-8633adf19a5a-0-099d531c-0d6b-4834-b605-d454352bcd0e","delta":{"batch":0,"index":0,"output":"\"The"},"type":"response.text_delta"}
data:{"requestId":"f0a156a7-4a65-9bcb-9617-8633adf19a5a-0-099d531c-0d6b-4834-b605-d454352bcd0e","delta":{"batch":0,"index":1,"output":" quick"},"type":"response.text_delta"}
data:{"requestId":"f0a156a7-4a65-9bcb-9617-8633adf19a5a-0-099d531c-0d6b-4834-b605-d454352bcd0e","delta":{"batch":0,"index":2,"output":" brown"},"type":"response.text_delta"}
data:{"requestId":"f0a156a7-4a65-9bcb-9617-8633adf19a5a-0-099d531c-0d6b-4834-b605-d454352bcd0e","delta":{"batch":0,"index":3,"output":" fox"},"type":"response.text_delta"}
data:{"requestId":"f0a156a7-4a65-9bcb-9617-8633adf19a5a-0-099d531c-0d6b-4834-b605-d454352bcd0e","delta":{"batch":0,"index":4,"output":" jumps"},"type":"response.text_delta"}
data:{"requestId":"f0a156a7-4a65-9bcb-9617-8633adf19a5a-0-099d531c-0d6b-4834-b605-d454352bcd0e","delta":{"batch":0,"index":5,"output":" over"},"type":"response.text_delta"}
data:{"requestId":"f0a156a7-4a65-9bcb-9617-8633adf19a5a-0-099d531c-0d6b-4834-b605-d454352bcd0e","delta":{"batch":0,"index":6,"output":" the"},"type":"response.text_delta"}
data:{"requestId":"f0a156a7-4a65-9bcb-9617-8633adf19a5a-0-099d531c-0d6b-4834-b605-d454352bcd0e","delta":{"batch":0,"index":7,"output":" lazy"},"type":"response.text_delta"}
data:{"requestId":"f0a156a7-4a65-9bcb-9617-8633adf19a5a-0-099d531c-0d6b-4834-b605-d454352bcd0e","delta":{"batch":0,"index":8,"output":" dog"},"type":"response.text_delta"}
data:{"requestId":"f0a156a7-4a65-9bcb-9617-8633adf19a5a-0-099d531c-0d6b-4834-b605-d454352bcd0e","delta":{"batch":0,"index":9,"output":".\""},"type":"response.text_delta"}
data:{"requestId":"f0a156a7-4a65-9bcb-9617-8633adf19a5a-0-099d531c-0d6b-4834-b605-d454352bcd0e","predictions":[{"tokensUsed":{"promptTokens":65,"completionTokens":11,"totalTokens":76},"response":"\"The quick brown fox jumps over the lazy dog.\""}],"type":"response.completed"}
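A client can reassemble the generated text by concatenating the output fragments of the response.text_delta events in index order. A minimal parsing sketch in Python (the field names follow the example stream above; the function name is illustrative):

```python
import json

def assemble_sse_text(sse_lines):
    """Reassemble generated text from LWAI Prediction API SSE lines.

    Collects the "output" fragment of each response.text_delta event,
    ordered by (batch, index), and returns the joined text plus the
    final response.completed event (None if the stream was cut off).
    """
    fragments = {}
    completed = None
    for raw in sse_lines:
        raw = raw.strip()
        if not raw.startswith("data:"):
            continue  # skip blank keep-alive lines and SSE comments
        event = json.loads(raw[len("data:"):])
        if event.get("type") == "response.text_delta":
            delta = event["delta"]
            fragments[(delta["batch"], delta["index"])] = delta["output"]
        elif event.get("type") == "response.completed":
            completed = event
    text = "".join(fragments[key] for key in sorted(fragments))
    return text, completed
```

With a streaming HTTP client, each line of the response body would be fed to this function as it arrives; the sketch only shows the parsing and reassembly logic.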
Example POST requests for Prediction API use cases
The topic for every use case contains detailed information about prerequisites and parameters along with example requests and responses. This section provides an overview of Prediction API requests.
When submitting a POST request for a generic use case, use the following format:

https://{APPLICATION_ID}.applications.lucidworks.com/ai/prediction/{USE_CASE}/{MODEL_ID}

This example uses an APPLICATION_ID of b7bcb5a5-4b6a-4fb5-b6bc-9f8cc6ab234e. Replace the placeholder for ACCESS_TOKEN with the token generated in the Authentication API response.
curl --request POST \
  --url https://b7bcb5a5-4b6a-4fb5-b6bc-9f8cc6ab234e.applications.lucidworks.com/ai/prediction/{USE_CASE}/{MODEL_ID} \
  --header 'Accept: application/json' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer {ACCESS_TOKEN}' \
  --data '{
    "batch": [
      {
        "text": "Content for the model to analyze."
      }
    ]
  }'
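The endpoint pattern above can be expressed as a small helper that assembles the URL from its three path components. A sketch in Python (the function name is illustrative):

```python
def prediction_url(application_id: str, use_case: str, model_id: str) -> str:
    """Build a Prediction API endpoint URL from its three path components."""
    return (
        f"https://{application_id}.applications.lucidworks.com"
        f"/ai/prediction/{use_case}/{model_id}"
    )
```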
Custom embedding use case Prediction API POST request
When submitting a POST request for a custom embedding use case, use the following format:

https://{APPLICATION_ID}.applications.lucidworks.com/ai/prediction/embedding/{DEPLOYMENT_ID}

This use case request requires APPLICATION_ID and DEPLOYMENT_ID. The DEPLOYMENT_ID is generated when the custom embedding model is deployed. For information, see Deployment details. The custom model's DEPLOYMENT_ID can also be obtained using the API as described in the following topics:
This example uses APPLICATION_ID of b7bcb5a5-4b6a-4fb5-b6bc-9f8cc6ab234e and a DEPLOYMENT_ID of 4f10a8a7-52a4-440d-a015-70d00483ac5e. Replace the placeholder for ACCESS_TOKEN with the token generated in the Authentication API response.
curl --request POST \
  --url https://b7bcb5a5-4b6a-4fb5-b6bc-9f8cc6ab234e.applications.lucidworks.com/ai/prediction/embedding/4f10a8a7-52a4-440d-a015-70d00483ac5e \
  --header 'Accept: application/json' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer {ACCESS_TOKEN}' \
  --data '{
    "batch": [
      {
        "text": "Content for the model to vectorize."
      }
    ]
  }'
Pre-trained embedding use case Prediction API POST request
When submitting a POST request for a pre-trained embedding use case, use the following format:

https://{APPLICATION_ID}.applications.lucidworks.com/ai/prediction/embedding/{MODEL_ID}

This use case request requires APPLICATION_ID and MODEL_ID. The pre-trained MODEL_ID can also be obtained using the API as described in the following topics:
This example uses APPLICATION_ID of b7bcb5a5-4b6a-4fb5-b6bc-9f8cc6ab234e and a MODEL_ID of gte-small. Replace the placeholder for ACCESS_TOKEN with the token generated in the Authentication API response.
curl --request POST \
  --url https://b7bcb5a5-4b6a-4fb5-b6bc-9f8cc6ab234e.applications.lucidworks.com/ai/prediction/embedding/gte-small \
  --header 'Accept: application/json' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer {ACCESS_TOKEN}' \
  --data '{
    "batch": [
      {
        "text": "Content for the model to vectorize."
      }
    ]
  }'
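In both embedding examples, the request body is a batch of objects, each with a text field. A minimal sketch of building that payload programmatically in Python (the helper name is illustrative; the body shape matches the curl examples in this topic):

```python
import json

def embedding_request_body(texts):
    """Serialize a list of strings into the Prediction API batch body.

    Each input string becomes one {"text": ...} entry in the batch.
    """
    return json.dumps({"batch": [{"text": t} for t in texts]})
```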
Generative AI use case Prediction API POST request
When submitting a POST request for a generative AI use case, use the following format:

https://{APPLICATION_ID}.applications.lucidworks.com/ai/prediction/{USE_CASE}/{MODEL_ID}

This use case request requires APPLICATION_ID and MODEL_ID. For information about Gen-AI use cases and models, see:
This example uses an APPLICATION_ID of b7bcb5a5-4b6a-4fb5-b6bc-9f8cc6ab234e, a USE_CASE of passthrough, and a MODEL_ID of llama-3-8b-instruct. Replace the placeholder for ACCESS_TOKEN with the token generated in the Authentication API response.
curl --request POST \
  --url https://b7bcb5a5-4b6a-4fb5-b6bc-9f8cc6ab234e.applications.lucidworks.com/ai/prediction/passthrough/llama-3-8b-instruct \
  --header 'Accept: application/json' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer {ACCESS_TOKEN}' \
  --data '{
    "batch": [
      {
        "text": "You are a helpful utility program instructed to accomplish a word correction task. Provide the most likely suggestion to the user without a preamble or elaboration.\nPOSSIBLE_MISSPELLING: swerdfish"
      }
    ]
  }'