The LWAI Prediction API is used to send synchronous API calls that run predictions from pre-trained models or custom models. The LWAI Prediction API supports models hosted by Lucidworks and specific third-party models. The Lucidworks AI Use Case API returns a list of all supported models. For more information about supported models, see Generative AI models. You can enter the values returned by the Lucidworks AI Use Case API in the USE_CASE and MODEL_ID fields of /prediction use case requests. The generic path for the Prediction API is /ai/prediction/USE_CASE/MODEL_ID.
For detailed API specifications in Swagger/OpenAPI format, see Platform APIs.
Some parameters in the /ai/prediction/USE_CASE/MODEL_ID request are common to all of the generative AI (Gen-AI) use cases, such as the modelConfig parameter.
Also referred to as hyperparameters, these fields set certain controls on the response.
Refer to the API spec for more information.
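For example, a modelConfig object can constrain sampling randomness and response length. The fragment below uses only the two fields that appear in the request examples in this topic (temperature and maxTokens); refer to the API spec for the full list of supported fields:

```json
{
  "modelConfig": {
    "temperature": 0.3,
    "maxTokens": 50
  }
}
```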
The /ai/prediction/USE_CASE/MODEL_ID request returns predictions for pre-trained or custom models in the specified use case format for the modelId in the request.
Unique fields and values in the request are described in each use case.
The Lucidworks AI (LWAI) Prediction API supports streaming responses. Streaming enables clients to receive model outputs incrementally as they are generated, improving responsiveness and interactivity for applications such as chat interfaces or live content generation.

When using the synchronous LWAI Prediction API, you can enable streaming by including the appropriate request header. This allows the model’s output to be sent as a stream of Server-Sent Events (SSE) instead of a single, complete JSON payload.
Fusion does not currently support streaming responses. When calling LWAI through Fusion services or pipelines, responses are returned only after the full prediction is generated, even if the underlying model supports streaming. Streaming is only available when you call the LWAI Prediction API directly.
Some VPNs may interfere with streaming by buffering all of the events internally and sending them together as a single complete response, rather than passing each event through as it arrives.
To enable streaming, include the following header in your request: Accept: text/event-stream.

Example request:
curl --request POST \
  --url 'https://APPLICATION_ID.applications.lucidworks.com/ai/prediction/passthrough/MODEL_ID' \
  --header 'Accept: text/event-stream' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer ACCESS_TOKEN' \
  --data '{
    "batch": [
      {
        "text": "Without explanation, please give me an example pangram. PANGRAM:"
      }
    ],
    "useCaseConfig": {
      "dataType": "text"
    },
    "modelConfig": {
      "temperature": 0.3,
      "maxTokens": 50
    }
  }'
Example streaming response: Each line of the response stream begins with data: followed by a JSON payload. Each payload carries a small snippet of the generated output and a type field of either response.text_delta or response.completed.
When the model completes, a final response.completed event is sent:
data:{"requestId":"...","predictions":[{"tokensUsed":{"promptTokens":65,"completionTokens":11,"totalTokens":76},"response":"\"The quick brown fox jumps over the lazy dog.\""}],"type":"response.completed"}
Full Response Stream:
data:{"requestId":"f0a156a7-4a65-9bcb-9617-8633adf19a5a-0-099d531c-0d6b-4834-b605-d454352bcd0e","delta":{"batch":0,"index":0,"output":"\"The"},"type":"response.text_delta"}
data:{"requestId":"f0a156a7-4a65-9bcb-9617-8633adf19a5a-0-099d531c-0d6b-4834-b605-d454352bcd0e","delta":{"batch":0,"index":1,"output":" quick"},"type":"response.text_delta"}
data:{"requestId":"f0a156a7-4a65-9bcb-9617-8633adf19a5a-0-099d531c-0d6b-4834-b605-d454352bcd0e","delta":{"batch":0,"index":2,"output":" brown"},"type":"response.text_delta"}
data:{"requestId":"f0a156a7-4a65-9bcb-9617-8633adf19a5a-0-099d531c-0d6b-4834-b605-d454352bcd0e","delta":{"batch":0,"index":3,"output":" fox"},"type":"response.text_delta"}
data:{"requestId":"f0a156a7-4a65-9bcb-9617-8633adf19a5a-0-099d531c-0d6b-4834-b605-d454352bcd0e","delta":{"batch":0,"index":4,"output":" jumps"},"type":"response.text_delta"}
data:{"requestId":"f0a156a7-4a65-9bcb-9617-8633adf19a5a-0-099d531c-0d6b-4834-b605-d454352bcd0e","delta":{"batch":0,"index":5,"output":" over"},"type":"response.text_delta"}
data:{"requestId":"f0a156a7-4a65-9bcb-9617-8633adf19a5a-0-099d531c-0d6b-4834-b605-d454352bcd0e","delta":{"batch":0,"index":6,"output":" the"},"type":"response.text_delta"}
data:{"requestId":"f0a156a7-4a65-9bcb-9617-8633adf19a5a-0-099d531c-0d6b-4834-b605-d454352bcd0e","delta":{"batch":0,"index":7,"output":" lazy"},"type":"response.text_delta"}
data:{"requestId":"f0a156a7-4a65-9bcb-9617-8633adf19a5a-0-099d531c-0d6b-4834-b605-d454352bcd0e","delta":{"batch":0,"index":8,"output":" dog"},"type":"response.text_delta"}
data:{"requestId":"f0a156a7-4a65-9bcb-9617-8633adf19a5a-0-099d531c-0d6b-4834-b605-d454352bcd0e","delta":{"batch":0,"index":9,"output":".\""},"type":"response.text_delta"}
data:{"requestId":"f0a156a7-4a65-9bcb-9617-8633adf19a5a-0-099d531c-0d6b-4834-b605-d454352bcd0e","predictions":[{"tokensUsed":{"promptTokens":65,"completionTokens":11,"totalTokens":76},"response":"\"The quick brown fox jumps over the lazy dog.\""}],"type":"response.completed"}
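A client can reassemble the generated text by concatenating the output fragments of the response.text_delta events in index order. A minimal parsing sketch in Python (the field names follow the example stream above; the function name is illustrative):

```python
import json

def assemble_sse_text(sse_lines):
    """Reassemble generated text from LWAI Prediction API SSE lines.

    Collects the "output" fragment of each response.text_delta event,
    ordered by (batch, index), and returns the joined text plus the
    final response.completed event (None if the stream was cut off).
    """
    fragments = {}
    completed = None
    for raw in sse_lines:
        raw = raw.strip()
        if not raw.startswith("data:"):
            continue  # skip blank keep-alive lines and SSE comments
        event = json.loads(raw[len("data:"):])
        if event.get("type") == "response.text_delta":
            delta = event["delta"]
            fragments[(delta["batch"], delta["index"])] = delta["output"]
        elif event.get("type") == "response.completed":
            completed = event
    text = "".join(fragments[key] for key in sorted(fragments))
    return text, completed
```

With a streaming HTTP client, each line of the response body would be fed to this function as it arrives; the sketch only shows the parsing and reassembly logic.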
Example POST requests for Prediction API use cases
The topic for every use case contains detailed information about prerequisites and parameters along with example requests and responses. This section provides an overview of Prediction API requests.
When submitting a POST request for a generic use case, use the following format:

https://{APPLICATION_ID}.applications.lucidworks.com/ai/prediction/{USE_CASE}/{MODEL_ID}

This example uses an APPLICATION_ID of b7bcb5a5-4b6a-4fb5-b6bc-9f8cc6ab234e. Replace the placeholder for ACCESS_TOKEN with the token generated in the Authentication API response.
curl --request POST \
  --url https://b7bcb5a5-4b6a-4fb5-b6bc-9f8cc6ab234e.applications.lucidworks.com/ai/prediction/{USE_CASE}/{MODEL_ID} \
  --header 'Accept: application/json' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer {ACCESS_TOKEN}' \
  --data '{
    "batch": [
      {
        "text": "Content for the model to analyze."
      }
    ]
  }'
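The endpoint pattern above can be expressed as a small helper that assembles the URL from its three path components. A sketch in Python (the function name is illustrative):

```python
def prediction_url(application_id: str, use_case: str, model_id: str) -> str:
    """Build a Prediction API endpoint URL from its three path components."""
    return (
        f"https://{application_id}.applications.lucidworks.com"
        f"/ai/prediction/{use_case}/{model_id}"
    )
```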
Custom embedding use case Prediction API POST request
When submitting a POST request for a custom embedding use case, use the following format:

https://{APPLICATION_ID}.applications.lucidworks.com/ai/prediction/embedding/{DEPLOYMENT_ID}

This use case request requires APPLICATION_ID and DEPLOYMENT_ID. The DEPLOYMENT_ID is generated when the custom embedding model is deployed. For information, see Deployment details. The custom model's DEPLOYMENT_ID can also be obtained using the API as described in the following topics:
This example uses APPLICATION_ID of b7bcb5a5-4b6a-4fb5-b6bc-9f8cc6ab234e and a DEPLOYMENT_ID of 4f10a8a7-52a4-440d-a015-70d00483ac5e. Replace the placeholder for ACCESS_TOKEN with the token generated in the Authentication API response.
curl --request POST \
  --url https://b7bcb5a5-4b6a-4fb5-b6bc-9f8cc6ab234e.applications.lucidworks.com/ai/prediction/embedding/4f10a8a7-52a4-440d-a015-70d00483ac5e \
  --header 'Accept: application/json' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer {ACCESS_TOKEN}' \
  --data '{
    "batch": [
      {
        "text": "Content for the model to vectorize."
      }
    ]
  }'
Pre-trained embedding use case Prediction API POST request
When submitting a POST request for a pre-trained embedding use case, use the following format:

https://{APPLICATION_ID}.applications.lucidworks.com/ai/prediction/embedding/{MODEL_ID}

This use case request requires APPLICATION_ID and MODEL_ID. The pre-trained MODEL_ID can also be obtained using the API as described in the following topics:
This example uses APPLICATION_ID of b7bcb5a5-4b6a-4fb5-b6bc-9f8cc6ab234e and a MODEL_ID of gte-small. Replace the placeholder for ACCESS_TOKEN with the token generated in the Authentication API response.
curl --request POST \
  --url https://b7bcb5a5-4b6a-4fb5-b6bc-9f8cc6ab234e.applications.lucidworks.com/ai/prediction/embedding/gte-small \
  --header 'Accept: application/json' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer {ACCESS_TOKEN}' \
  --data '{
    "batch": [
      {
        "text": "Content for the model to vectorize."
      }
    ]
  }'
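In both embedding examples, the request body is a batch of objects, each with a text field. A minimal sketch of building that payload programmatically in Python (the helper name is illustrative; the body shape matches the curl examples in this topic):

```python
import json

def embedding_request_body(texts):
    """Serialize a list of strings into the Prediction API batch body.

    Each input string becomes one {"text": ...} entry in the batch.
    """
    return json.dumps({"batch": [{"text": t} for t in texts]})
```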
Generative AI use case Prediction API POST request
When submitting a POST request for a generative AI use case, use the following format:

https://{APPLICATION_ID}.applications.lucidworks.com/ai/prediction/{USE_CASE}/{MODEL_ID}

This use case request requires APPLICATION_ID and MODEL_ID. For information about Gen-AI use cases and models, see:
This example uses an APPLICATION_ID of b7bcb5a5-4b6a-4fb5-b6bc-9f8cc6ab234e, a USE_CASE of passthrough, and a MODEL_ID of llama-3-8b-instruct. Replace the placeholder for ACCESS_TOKEN with the token generated in the Authentication API response.
curl --request POST \
  --url https://b7bcb5a5-4b6a-4fb5-b6bc-9f8cc6ab234e.applications.lucidworks.com/ai/prediction/passthrough/llama-3-8b-instruct \
  --header 'Accept: application/json' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer {ACCESS_TOKEN}' \
  --data '{
    "batch": [
      {
        "text": "You are a helpful utility program instructed to accomplish a word correction task. Provide the most likely suggestion to the user without a preamble or elaboration.\nPOSSIBLE_MISSPELLING: swerdfish"
      }
    ]
  }'