    Lucidworks AI Prediction API

    The Lucidworks AI (LWAI) Prediction API sends synchronous API calls that run predictions from pre-trained or custom models.

    Lucidworks has deployed the mistral-7b-instruct and llama-3-8b-instruct models. The Lucidworks AI Use Case API returns a list of all supported models. For more information about supported models, see Generative AI models.

    Use the values returned by the Lucidworks AI Use Case API for the USE_CASE and MODEL_ID fields in /prediction use case requests.

    The generic path for the Prediction API is /ai/prediction/USE_CASE/MODEL_ID.
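
    For example, if the Use Case API returns the use case summarization for the llama-3-8b-instruct model (hypothetical values; use the values the API returns for your application), the resolved path is:

    /ai/prediction/summarization/llama-3-8b-instruct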

    Prerequisites

    To use this API, you need:

    • The unique APPLICATION_ID for your Lucidworks AI application. For more information, see credentials to use APIs.

    • A bearer token generated with a scope value of machinelearning.predict (see the sketch after this list). For more information, see Authentication API.

    • Other required fields specified in each individual use case.
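
    As a minimal sketch, the bearer token is typically obtained with an OAuth2 client-credentials request such as the following, where TOKEN_URL, CLIENT_ID, and CLIENT_SECRET are placeholders. The exact endpoint and flow are described in the Authentication API documentation.

    curl --request POST \
      --url TOKEN_URL \
      --user 'CLIENT_ID:CLIENT_SECRET' \
      --header 'Content-Type: application/x-www-form-urlencoded' \
      --data 'grant_type=client_credentials&scope=machinelearning.predict'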

    Common parameters and fields

    modelConfig

    Some parameters of the /ai/prediction/USE_CASE/MODEL_ID request are common to all of the generative AI (GenAI) use cases, including the modelConfig parameter. If you do not enter values, the following defaults are used.

    "modelConfig":{
      "temperature": 0.7,
      "topP": 1.0,
      "presencePenalty": 0.0,
      "frequencyPenalty": 0.0,
      "maxTokens": 256
    }

    Also referred to as hyperparameters, these fields set certain controls on the response of an LLM (see the example after the field descriptions):


    temperature

    A sampling temperature between 0 and 2. A higher sampling temperature, such as 0.8, results in more random (creative) output, while a lower value, such as 0.2, results in more focused (conservative) output. However, even a low value does not guarantee the model returns the same response for the same input.

    topP

    A floating-point number between 0 and 1 that controls the randomness of the LLM’s response. Also referred to as top probability or nucleus sampling, it restricts sampling to the smallest set of most-likely tokens whose cumulative probability reaches the threshold. Set topP to 1 to consider all tokens. The higher the value, the more tokens are eligible for selection and the more diverse the output.

    presencePenalty

    A floating-point number between -2.0 and 2.0 that penalizes new tokens based on whether they have already appeared in the text. A value greater than zero (0) encourages the model to use new tokens, increasing the diversity of its output. A value less than zero (0) encourages the model to repeat existing tokens.

    frequencyPenalty

    A floating-point number between -2.0 and 2.0 that penalizes new tokens based on their frequency in the generated text. A value greater than zero (0) encourages the model to use new tokens. A value less than zero (0) encourages the model to repeat existing tokens.

    maxTokens

    The maximum number of tokens to generate per output sequence. The maximum supported value differs by model; review individual model specifications before setting a value that exceeds 2048.
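
    For example, to request more focused output with a longer response limit, override only the fields you need; in this sketch, the omitted fields are assumed to fall back to the defaults shown above.

    "modelConfig": {
      "temperature": 0.2,
      "maxTokens": 512
    }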

    apiKey

    This parameter is only required when an external (OpenAI, Azure OpenAI, or Google VertexAI) model is used for prediction; it is the secret value specified in the external model. For:

    • OpenAI models, "apiKey" is the value in the model’s "[OPENAI_API_KEY]" field. For more information, see Authentication API keys.

    • Azure OpenAI models, "apiKey" is the value generated by Azure in the model’s "[KEY1]" or "[KEY2]" field. For requirements to use Azure models, see Generative AI models.

    • Google VertexAI models, "apiKey" is the value in the model’s "[BASE64_ENCODED_GOOGLE_SERVICE_ACCOUNT_KEY]" field. For more information, see Create and delete Google service account keys.

    The parameter (for OpenAI, Azure OpenAI, or Google VertexAI models) is only available for the following use cases (see the sketch after this list):

    • Pass-through

    • RAG

    • Standalone query rewriter

    • Summarization

    • Keyword extraction

    • NER
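
    As an illustration for an OpenAI model, the key is supplied with the other modelConfig fields. This is a sketch with placeholder values; see the individual use case documentation for the exact request shape.

    "modelConfig": {
      "temperature": 0.7,
      "maxTokens": 256,
      "apiKey": "OPENAI_API_KEY"
    }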

    azureDeployment

    The "azureDeployment": "[DEPLOYMENT_NAME]" parameter is the deployment name of the Azure OpenAI model and is required only when a deployed Azure OpenAI model is used for prediction.

    azureEndpoint

    The "azureEndpoint": "[ENDPOINT]" parameter is the URL endpoint of the deployed Azure OpenAI model and is required only when a deployed Azure OpenAI model is used for prediction.
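
    Taken together, a prediction against a deployed Azure OpenAI model supplies all three Azure-specific values. The following sketch uses placeholder values and assumes the same placement as the apiKey example above; the endpoint follows the standard Azure OpenAI resource URL format.

    "modelConfig": {
      "apiKey": "KEY1_OR_KEY2",
      "azureDeployment": "DEPLOYMENT_NAME",
      "azureEndpoint": "https://RESOURCE_NAME.openai.azure.com/"
    }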

    googleProjectId

    The "googleProjectId": "[GOOGLE_PROJECT_ID]" parameter is required only when a Google VertexAI model is used for prediction.

    googleRegion

    The "googleRegion": "[GOOGLE_PROJECT_REGION_OF_MODEL_ACCESS]" parameter is required only when a Google VertexAI model is used for prediction (see the sketch after this list). The possible region values are:

    • us-central1

    • us-west4

    • northamerica-northeast1

    • us-east4

    • us-west1

    • asia-northeast3

    • asia-southeast1

    • asia-northeast1
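
    For example, a Google VertexAI prediction supplies the project, region, and encoded service account key together. This sketch uses placeholder values and assumes the same placement as the examples above.

    "modelConfig": {
      "apiKey": "BASE64_ENCODED_GOOGLE_SERVICE_ACCOUNT_KEY",
      "googleProjectId": "GOOGLE_PROJECT_ID",
      "googleRegion": "us-central1"
    }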

    Prediction use case by modelId

    The /ai/prediction/USE_CASE/MODEL_ID request returns predictions from pre-trained or custom models, in the format of the specified use case, for the modelId in the request.

    Unique fields and values in the request are described in each use case.

    Example request

    curl --request POST \
      --url https://APPLICATION_ID.applications.lucidworks.com/ai/prediction/USE_CASE/MODEL_ID \
      --header 'Accept: application/json' \
      --header 'Authorization: Bearer ACCESS_TOKEN' \
      --header 'Content-Type: application/json' \
      --data '{
      "batch": [
        {
          "text": "Content for the model to analyze."
        }
      ],
      "modelConfig": {
        "temperature": 0.8,
        "topP": 1,
        "presencePenalty": 2,
        "frequencyPenalty": 1,
        "maxTokens": 1
      }
    }'

    The response varies based on the specific use case and the fields included in the request.