Configure the LWAI Prediction index stage
The LWAI Prediction index stage is a Fusion index pipeline stage that enriches your index with Generative AI predictions. It defaults to asynchronous processing, which does not block the pipeline while waiting for a response from Lucidworks AI.
For reference information, see LWAI Prediction index stage. For the LWAI Prediction query stage, see Configure the LWAI Prediction query stage.
To configure this stage:
- Sign in to Fusion and click Indexing > Index Pipelines.
- Click Add+ to add a new pipeline.
- Enter the name in Pipeline ID.
- Click Add a new pipeline stage.
- In the AI section, click LWAI Prediction.
- In the Label field, enter a unique identifier for this stage.
- In the Condition field, enter a script that evaluates to true or false, which determines whether the stage processes a given document.
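  For example, a condition such as `doc.hasField('body_t')` (a hypothetical field name) processes only documents that contain a `body_t` field.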
- In the Account Name field, select the Lucidworks AI API account name defined in Lucidworks AI Gateway.
- In the Use Case field, select the Lucidworks AI use case to associate with this stage.
  - To generate a list of the use cases for your organization, see Use Case API.
  - If the Call Asynchronously? check box is selected, see the available use cases described in Async Prediction API.
  - If the Call Asynchronously? check box is not selected, see the available use cases described in Prediction API.
- In the Model field, select the Lucidworks AI model to associate with this stage. For more information, see Generative AI models.
- In the Input context variable field, enter the name of the variable in context to be used as input. Template expressions are supported.
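  For example, if an earlier pipeline stage stored the text to send to the model in a context variable named `document_text` (a hypothetical name), enter `document_text` here.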
- In the Destination field name and context output field, enter the name used both as the field name in the document where the prediction is written and as the context variable that contains the prediction. The sketch after this list shows how the resulting fields can appear in a document.
  - If the Call Asynchronously? check box is selected and a value is entered in this field:
    - `{destination name}_t` is the full response.
    - In the document:
      - `_lw_ai_properties_ss` contains the Lucidworks account, the boolean setting for async, the use case, the input for the call, and the collection.
      - `_lw_ai_request_count` is the number of GET requests by `predictionId`, and `_lw_ai_success_count` is the number of responses without errors. These two fields are used for debugging only. Based on the deployment, the most useful measure is the ratio of `_lw_ai_success_count` to `_lw_ai_request_count`; adjust the deployment to bring that ratio as close to 1.0 as possible.
      - `enriched_ss` contains the use case. This can be used as a boolean flag to verify that the use case indexed successfully.
  - If the Call Asynchronously? check box is not selected and a value is entered in this field:
    - `{destination name}_t` is the full response.
  - If no value is entered in this field (regardless of the Call Asynchronously? check box setting):
    - `lw_ai{use case}_t` is the `response.response` object, which is the raw model output.
    - `lw_ai{use case}_response_s` is the full response.
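  As an illustrative sketch only, assuming a destination name of `summary` and a hypothetical `summarization` use case, an asynchronously enriched document might carry fields like the following (all values are hypothetical):

  ```
  {
    "summary_t": "...full response from Lucidworks AI...",
    "_lw_ai_properties_ss": ["account=my-account", "async=true", "useCase=summarization", "input=document_text", "collection=my-collection"],
    "_lw_ai_request_count": 4,
    "_lw_ai_success_count": 4,
    "enriched_ss": ["summarization"]
  }
  ```

  Here the debugging ratio `_lw_ai_success_count / _lw_ai_request_count` is 4 / 4 = 1.0, the target value.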
- In the Use Case Configuration section, click the + sign to enter the parameter name and value to send to Lucidworks AI. The `useCaseConfig` parameter is only applicable to certain use cases; see the example after this list.
  - If the Call Asynchronously? check box is selected, `useCaseConfig` information for each applicable use case is described in Async Prediction API.
  - If the Call Asynchronously? check box is not selected, `useCaseConfig` information for each applicable use case is described in Prediction API.
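  For example, a use case that supports the `useSystemPrompt` option might be configured as follows (an illustrative pairing; confirm which parameters your use case supports in the API documentation):

  ```
  Parameter Name:  useSystemPrompt
  Parameter Value: true
  ```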
- In the Model Configuration section, click the + sign to enter the parameter name and value to send to Lucidworks AI. Several `modelConfig` parameters are common to generative AI use cases; see the example after this list.
  - If the Call Asynchronously? check box is selected, `modelConfig` information is described in Async Prediction API.
  - If the Call Asynchronously? check box is not selected, `modelConfig` information is described in Prediction API.
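  For example, common generative parameters such as `temperature` and `maxTokens` can be set here (illustrative values; confirm the supported parameters in the API documentation):

  ```
  temperature = 0.7
  maxTokens   = 256
  ```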
- In the API Key field, enter the secret value specified in the external model. For:
  - OpenAI models, `"apiKey"` is the value in the model’s `"[OPENAI_API_KEY]"` field. For more information, see Authentication API keys.
  - Azure OpenAI models, `"apiKey"` is the value generated by Azure in either the model’s `KEY1` or `KEY2` field. For requirements to use Azure models, see Generative AI models.
  - Google VertexAI models, `"apiKey"` is the value in the model’s `"[BASE64_ENCODED_GOOGLE_SERVICE_ACCOUNT_KEY]"` field. For more information, see Create and delete Google service account keys.
- To run the API call asynchronously, select the Call Asynchronously? check box, which specifies that the stage uses the Lucidworks AI Async Prediction API endpoints. If this is selected, the API call does not block the pipeline while waiting for a response from Lucidworks AI.

  If the check box is not selected, the API call uses the Prediction API, which blocks the pipeline until a response is received from Lucidworks AI. Performance of other API calls can be impacted.
- In the Maximum Asynchronous Call Tries field, enter the maximum number of times to send an asynchronous API call before the system generates a failure error.
- Select the Fail on Error check box to generate an exception if an error occurs while generating a prediction for a document.
- Click Save.
Additional requirements
Additional requirements to use async calls include:
- Use a V2 connector with the advanced setting `async` selected.
- Remove the `Apache Tika` stage from your parser because it can cause datasource failures with the following error: "The following components failed: [class com.lucidworks.connectors.service.components.job.processor.DefaultDataProcessor : Only Tika Container parser can support Async Parsing.]"
- Replace the `Solr Indexer` stage with the `Solr Partial Update Indexer` stage, with the following settings:
  - `Enable Concurrency Control` set to off
  - `Reject Update if Solr Document is not Present` set to off
  - `Process All Pipeline Doc Fields` set to on
  - `Allow reserved fields` set to on
  - A parameter in `Updates` with `Update Type`, `Field Name`, and `Value` (see the example after this list)
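  For illustration, an `Updates` entry with `Update Type` of `add`, a hypothetical `Field Name` of `summary_t`, and a `Value` holding the prediction corresponds to a Solr atomic update such as:

  ```
  { "id": "doc-123", "summary_t": { "add": "generated prediction text" } }
  ```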