Prediction API
The LWAI Prediction API is used to send synchronous API calls that run predictions from pre-trained models or custom models.
The LWAI Prediction API supports models hosted by Lucidworks and specific third-party models. The Lucidworks AI Use Case API returns a list of all supported models. For more information about supported models, see Generative AI models.
You can enter the values returned by the Lucidworks AI Use Case API in the USE_CASE and MODEL_ID fields of the /prediction use case requests.
The generic path for the Prediction API is /ai/prediction/USE_CASE/MODEL_ID.
Prerequisites
To use this API, you need:
- The unique APPLICATION_ID for your Lucidworks AI application. For more information, see credentials to use APIs.
- A bearer token generated with a scope value of machinelearning.predict. For more information, see Authentication API.
- Other required fields specified in each individual use case.
Common parameters and fields
modelConfig
Some parameters of the /ai/prediction/USE_CASE/MODEL_ID request are common to all of the generative AI (GenAI) use cases, including the modelConfig parameter. If you do not enter values, the following defaults are used.
"modelConfig":{
"temperature": 0.7,
"topP": 1.0,
"topK": -1.0,
"maxTokens": 256
}
Also referred to as hyperparameters, these fields set certain controls on the response of an LLM:
Field | Description |
---|---|
temperature | A sampling temperature between 0 and 2. A higher sampling temperature, such as 0.8, results in more random (creative) output. A lower value, such as 0.2, results in more focused (conservative) output. A lower value does not guarantee the model returns the same response for the same input. |
topP | A floating-point number between 0 and 1 that controls the cumulative probability of the top tokens to consider, known as the randomness of the LLM’s response. This parameter is also referred to as top probability. |
topK | An integer that controls the number of top tokens to consider. |
presencePenalty | A floating-point number between -2.0 and 2.0 that penalizes new tokens based on whether they have already appeared in the text. This increases the model’s use of diverse tokens. A value greater than zero (0) encourages the model to use new tokens. A value less than zero (0) encourages the model to repeat existing tokens. This is applicable for all OpenAI, Mistral, and Llama models. |
frequencyPenalty | A floating-point number between -2.0 and 2.0 that penalizes new tokens based on their frequency in the generated text. A value greater than zero (0) encourages the model to use new tokens. A value less than zero (0) encourages the model to repeat existing tokens. This is applicable for all OpenAI, Mistral, and Llama models. |
maxTokens | The maximum number of tokens to generate per output sequence. The value differs for each model. Review the individual model specifications when the value exceeds 2048. |
apiKey | This optional parameter is required only when an external (OpenAI, Azure OpenAI, Anthropic, or Google VertexAI) model is used for prediction. The value is the secret key configured for the external model, and the parameter is available only for specific use cases. |
azureDeployment | The optional parameter for Azure OpenAI models that specifies the deployment name. |
azureEndpoint | The optional parameter for Azure OpenAI models that specifies the endpoint URL. |
googleProjectId | The optional parameter for Google VertexAI models that specifies the Google Cloud project ID. |
googleRegion | The optional parameter for Google VertexAI models that specifies the Google Cloud region. |
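The defaults above can be applied client-side before a request is sent. The following sketch is illustrative only; the with_defaults helper is hypothetical and not part of any Lucidworks SDK, but the default values come from this page.

```python
# Documented modelConfig defaults for /ai/prediction requests.
DEFAULT_MODEL_CONFIG = {
    "temperature": 0.7,
    "topP": 1.0,
    "topK": -1.0,
    "maxTokens": 256,
}

def with_defaults(overrides=None):
    """Merge user-supplied hyperparameters over the documented defaults."""
    config = dict(DEFAULT_MODEL_CONFIG)
    config.update(overrides or {})
    return config

# Only the overridden fields change; the rest keep their defaults.
print(with_defaults({"temperature": 0.2, "maxTokens": 512}))
```

Omitting a field in modelConfig is equivalent to sending its default, so a merge like this mirrors the server-side behavior described above.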
Prediction use case by modelId
The /ai/prediction/USE_CASE/MODEL_ID request returns predictions for pre-trained or custom models in the specified use case format for the modelId in the request.
Unique fields and values in the request are described in each use case.
Example request
curl --request POST \
--url https://APPLICATION_ID.applications.lucidworks.com/ai/prediction/USE_CASE/MODEL_ID \
--header 'Accept: application/json' \
--header 'Content-Type: application/json' \
--data '{
  "batch": [
    {
      "text": "Content for the model to analyze."
    }
  ],
  "modelConfig": {
    "temperature": 0.8,
    "topP": 1,
    "presencePenalty": 2,
    "frequencyPenalty": 1,
    "maxTokens": 1
  }
}'
The response varies based on the specific use case and the fields included in the request.
Prediction API use cases
The use cases available in the Lucidworks AI Prediction API are detailed in the following topics:
Example POST requests for Prediction API use cases
The topic for every use case contains detailed information about prerequisites and parameters along with example requests and responses. This section provides an overview of Prediction API requests.
1. Obtain credentials and generate a bearer token for authorization
Complete the following procedures to obtain client credentials, Base64-encode them for Basic authorization, and then generate a bearer token to use in the API request.
- Sign in to Lucidworks Platform as a workspace owner.
- Select an AI application and then click Integrations > API.
- Copy the values in the Application ID, Client ID, and Client Secret fields to use in the Prediction API use case requests.
- Access a Base64 encoding tool and convert the value CLIENT_ID:CLIENT_SECRET.
- Using the Authentication API, submit a request similar to the following, replacing CLIENT_ID:CLIENT_SECRET with the value you converted in the Base64 encoding tool.
curl --request POST \
--url 'https://identity.lucidworks.com/oauth2/ausao8uveaPmyhv0v357/v1/token?scope=machinelearning.predict&grant_type=client_credentials' \
--header 'Accept: application/json' \
--header 'Authorization: Basic [CLIENT_ID:CLIENT_SECRET]' \
--header 'Cache-Control: no-cache' \
--header 'Content-Type: application/x-www-form-urlencoded'
The token is returned in a response similar to the following:
{
"token_type": "Bearer",
"expires_in": 3600,
"access_token": "abcdAefgOggeh1d0SeiFbml6Lhl4hlheccd1LXFpM1c0aaZMOGhMh21Dehhbde9ccUdvggegbexigcogUlMbihbgff.abc2ZXggOcasgmp0aSg6gkFULld3aC14d05BX0hZeHdgeHVvXzVaMhZHiGiMUUhOc1hDiUcdZ1lPOF9SV2cgLCcpc3MgOgcodHdeczovL2lkZe50aXd5Lmx1b2lkd29ba3Mub29hL29hdXdoMg9hdXihbzh1dmVhUG15aHbedcM1ibgsgmF1ZCg6gmh0dHBzOg8vbXBpLmx1b2lkd29ba3Mub29hggegaeF0gcoxiza5Ohf0MzcbLCclaHAgOca3Mhk5iDc5izgsgmipZCg6gcBvbXdfc3aedhhccidBd3BLMzU3ggegc2iegcpbgm1hb2hpbmVsZeFbbmluZb5ecmVkaei0gl0sgii1bgg6gcBvbXdfc3aedhhccidBd3BLMzU3ggegb2xpZe50hmFhZSg6gmdpcmVcdCgsgmi1c3dvbeVbSefgOggbZmb4ZGa4ZS1hZDg4Lhf2Meahbmb1ZC05ZDc1MDlciedcbefgff.ASisgd-sKFhX46VfK1e3Saab_ag1zvbu98Oh4dKsSeg2O5xClua8gadkKfMukV_db2bdbC9iP9l-2Dcp4_gi6khUhd3deKOzvFFX_h6K6hlLxdaiaFxvhC-cL-FlmGaOidodmz_sdh9xdl_eg6FZpKadBFf4XflXggp-ib5kCv5-ec8KpiimlhcLbcbPOLeaoaUiViho3lO05efccvbbeagZpPi8z1VblceeCi1gh1k4dgep6uZBePoiVAcf6e2AK8cK_af7Xa2fMKBo81vaLi7fccxHGdaz_CbgZavglZevliBXdMcF5A4amdbUbaammd_Cczdg55K_bAfk-gaLfe",
"scope": "machinelearning.predict"
}
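The Base64 step above can also be done programmatically. The sketch below builds the Basic authorization header for the token request; basic_auth_header is a hypothetical helper, and the credentials shown are placeholders.

```python
import base64

# Token endpoint from the curl example above.
TOKEN_URL = (
    "https://identity.lucidworks.com/oauth2/ausao8uveaPmyhv0v357/v1/token"
    "?scope=machinelearning.predict&grant_type=client_credentials"
)

def basic_auth_header(client_id: str, client_secret: str) -> str:
    """Base64-encode CLIENT_ID:CLIENT_SECRET for the Authorization header."""
    raw = f"{client_id}:{client_secret}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")

# Send this header with a POST to TOKEN_URL (for example, with curl);
# the bearer token is returned in the access_token field of the response.
print(basic_auth_header("CLIENT_ID", "CLIENT_SECRET"))
```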
2. Submit the Prediction API POST requests
2.1. Generic use case Prediction API POST request
Using the credentials and access token, complete the following steps to submit a POST request in the following format:
https://{APPLICATION_ID}.applications.lucidworks.com/ai/prediction/{USE_CASE}/{MODEL_ID}
This example uses an APPLICATION_ID of b7bcb5a5-4b6a-4fb5-b6bc-9f8cc6ab234e. Replace the placeholder ACCESS_TOKEN with the token generated in the Authentication API response.
curl --request POST \
--url https://b7bcb5a5-4b6a-4fb5-b6bc-9f8cc6ab234e.applications.lucidworks.com/ai/prediction/{USE_CASE}/{MODEL_ID} \
--header 'Accept: application/json' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--data '{
  "batch": [
    {
      "text": "Content for the model to analyze."
    }
  ]
}'
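The URL and body of this request can be assembled from the path pattern described earlier. The following sketch shows one way to do it; build_prediction_request is a hypothetical helper, not part of any Lucidworks SDK.

```python
import json

def build_prediction_request(application_id, use_case, model_id, texts):
    """Assemble the URL and JSON body for a generic prediction request."""
    url = (
        f"https://{application_id}.applications.lucidworks.com"
        f"/ai/prediction/{use_case}/{model_id}"
    )
    # Each batch entry is an object with a "text" field.
    payload = {"batch": [{"text": t} for t in texts]}
    return url, json.dumps(payload)

url, body = build_prediction_request(
    "b7bcb5a5-4b6a-4fb5-b6bc-9f8cc6ab234e",
    "passthrough",
    "llama-3-8b-instruct",
    ["Content for the model to analyze."],
)
print(url)
print(body)
```

Send the body with Content-Type: application/json and the Authorization: Bearer header, as in the curl example above.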
2.2. Custom embedding use case Prediction API POST request
Using the credentials and access token, complete the following steps to submit a POST request in the following format:
https://{APPLICATION_ID}.applications.lucidworks.com/ai/prediction/embedding/{DEPLOYMENT_ID}
This use case request requires APPLICATION_ID and DEPLOYMENT_ID.
The DEPLOYMENT_ID is generated when the custom embedding model is deployed. For more information, see Deployment details.
The custom MODEL_ID can also be obtained using the API.
This example uses an APPLICATION_ID of b7bcb5a5-4b6a-4fb5-b6bc-9f8cc6ab234e and a DEPLOYMENT_ID of 4f10a8a7-52a4-440d-a015-70d00483ac5e. Replace the placeholder ACCESS_TOKEN with the token generated in the Authentication API response.
curl --request POST \
--url https://b7bcb5a5-4b6a-4fb5-b6bc-9f8cc6ab234e.applications.lucidworks.com/ai/prediction/embedding/4f10a8a7-52a4-440d-a015-70d00483ac5e \
--header 'Accept: application/json' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--data '{
  "batch": [
    {
      "text": "Content for the model to vectorize."
    }
  ]
}'
2.3. Pre-trained embedding use case Prediction API POST request
Using the credentials and access token, complete the following steps to submit a POST request in the following format:
https://{APPLICATION_ID}.applications.lucidworks.com/ai/prediction/embedding/{MODEL_ID}
This use case request requires APPLICATION_ID and MODEL_ID.
The pre-trained MODEL_ID can also be obtained using the API.
This example uses an APPLICATION_ID of b7bcb5a5-4b6a-4fb5-b6bc-9f8cc6ab234e and a MODEL_ID of gte-small. Replace the placeholder ACCESS_TOKEN with the token generated in the Authentication API response.
curl --request POST \
--url https://b7bcb5a5-4b6a-4fb5-b6bc-9f8cc6ab234e.applications.lucidworks.com/ai/prediction/embedding/gte-small \
--header 'Accept: application/json' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--data '{
  "batch": [
    {
      "text": "Content for the model to vectorize."
    }
  ]
}'
2.4. Generative AI use case Prediction API POST request
Using the credentials and access token, complete the following steps to submit a POST request in the following format:
https://{APPLICATION_ID}.applications.lucidworks.com/ai/prediction/{USE_CASE}/{MODEL_ID}
This use case request requires APPLICATION_ID and MODEL_ID.
For more information about GenAI use cases and models, see Generative AI models.
This example uses an APPLICATION_ID of b7bcb5a5-4b6a-4fb5-b6bc-9f8cc6ab234e, a USE_CASE of passthrough, and a MODEL_ID of llama-3-8b-instruct. Replace the placeholder ACCESS_TOKEN with the token generated in the Authentication API response.
curl --request POST \
--url https://b7bcb5a5-4b6a-4fb5-b6bc-9f8cc6ab234e.applications.lucidworks.com/ai/prediction/passthrough/llama-3-8b-instruct \
--header 'Accept: application/json' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--data '{
  "batch": [
    {
      "text": "You are a helpful utility program instructed to accomplish a word correction task. Provide the most likely suggestion to the user without any preamble or elaboration.\nPOSSIBLE_MISSPELLING: swerdfish"
    }
  ]
}'