Configure the LWAI Vectorize pipeline

Table of Contents

Configure the pipeline
Field Mapping
Solr Dynamic Field Name Mapping
LWAI Vectorize Field
Solr Indexer
Order the stages

The LWAI Vectorize pipeline is a default pipeline that contains the required index stages to set up vector search using Lucidworks AI.

This feature is currently only available to clients who have contracted with Lucidworks for features related to Neural Hybrid Search and Lucidworks AI.

This feature is available only in Managed Fusion 5.9.6 and later 5.9.x versions.

This pipeline uses the following stages:

Field Mapping
Solr Dynamic Field Name Mapping
LWAI Vectorize Field
Solr Indexer

Configure the pipeline

To add the Lucidworks AI (LWAI) Vectorize index pipeline:

Sign in to Managed Fusion and click Indexing > Index Pipelines.
Select the default LWAI-vectorize pipeline.
Configure the following stages included in the default pipeline.

Field Mapping

The Field Mapping stage customizes the mapping of fields in an index pipeline document to fields in the Solr schema.

To configure this stage for the index pipeline:

In the Label field, enter a unique identifier for this stage or leave blank to use the default value.
In the Condition field, enter a script that results in true or false, which determines if the stage should process, or leave blank.
Select the Allow System Fields Mapping? checkbox to map system fields in this stage.
In the Field Retention section, enter specific fields to either keep or delete.
In the Field Value Updates section, enter specific fields and then designate the value to either add to the field, or set on the field. When a value is added, any values previously on the field are retained. When a value is set, any values previously on the field are overwritten by the new value entered.
In the Field Translations section, enter specific fields to either move or copy to a different field. When a field is moved, the values from the source field are moved over to the target field and the source field is removed. When a field is copied, the values from the source field are copied over to the target field and the source field is retained.
Select the Unmapped Fields checkbox to specify the operation to perform on fields not mapped in the previous sections. Select the Keep checkbox to keep all unmapped fields. For the LWAI-vectorize pipeline, this is the only option you need to select in this stage.
Click Save.
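
The difference between add and set, and between move and copy, can be illustrated with a minimal Python sketch. It is not Managed Fusion code: it models a pipeline document as a plain dictionary of field names to lists of values, and the function names are hypothetical. It also assumes that moved or copied values are appended to any values already on the target field, which the description above does not specify.

```python
# Hypothetical model: a pipeline document is a dict of field name -> list of values.

def add_value(doc, field, value):
    """Add: values previously on the field are retained."""
    doc.setdefault(field, []).append(value)

def set_value(doc, field, value):
    """Set: values previously on the field are overwritten."""
    doc[field] = [value]

def copy_field(doc, source, target):
    """Copy: the source field is retained."""
    if source in doc:
        # Assumption: copied values are appended to existing target values.
        doc.setdefault(target, []).extend(doc[source])

def move_field(doc, source, target):
    """Move: the source field is removed."""
    if source in doc:
        # Assumption: moved values are appended to existing target values.
        doc.setdefault(target, []).extend(doc.pop(source))

doc = {"title": ["A"]}
add_value(doc, "title", "B")            # title now holds A and B
set_value(doc, "title", "C")            # title now holds only C
copy_field(doc, "title", "title_copy")  # title and title_copy both hold C
move_field(doc, "title", "heading")     # heading holds C, title is gone
```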

Solr Dynamic Field Name Mapping

The Solr Dynamic Field Name Mapping stage maps pipeline document fields to Solr dynamic fields.

In the Label field, enter a unique identifier for this stage or leave blank to use the default value.
In the Condition field, enter a script that results in true or false, which determines if the stage should process, or leave blank.
Select the Duplicate Single-Valued Fields as Multi-Valued Fields checkbox to enable indexing of field data into both single-valued and multi-valued Solr fields. For example, if this option is selected, the phone field is indexed into both the phone_s single-valued field and the phone_ss multi-valued field. If this option is not selected, the phone field is indexed into only the phone_s single-valued field.
In the Field Not To Map section, enter the names of the fields that should not be mapped by this stage.
Select the Text Fields Advanced Indexing checkbox to index text data that doesn’t exceed a specific maximum length into both tokenized and non-tokenized fields. For example, if this option is selected, the name text field with a value of John Smith is indexed into both the name_t and name_s fields. This allows relevant search using the name_t field (for example, matching a Smith query) as well as proper faceting and sorting using the name_s field (using John Smith as the facet or sort value). If this option is not selected, the name text field is indexed into only the name_t text field by default.
In the Max Length for Advanced Indexing of Text Fields field, enter a value used to determine how many characters of the incoming text are indexed. For example, 100.
Click Save.
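
The effect of the two checkboxes can be sketched in Python using the phone and name examples above. This is an illustration only, not the stage's implementation: the function names are hypothetical, the caller decides whether a field is treated as a string or as text, and the sketch assumes the maximum length acts as a cutoff for the extra non-tokenized copy.

```python
def map_string_field(name, value, duplicate_as_multi_valued):
    """Map a string field to its single-valued (and optionally multi-valued) name."""
    mapped = {f"{name}_s": value}
    if duplicate_as_multi_valued:
        mapped[f"{name}_ss"] = [value]
    return mapped

def map_text_field(name, value, advanced_indexing, max_length):
    """Map a text field to its tokenized name, plus a non-tokenized copy if enabled."""
    mapped = {f"{name}_t": value}
    # Assumption: only text within the maximum length gets the _s copy.
    if advanced_indexing and len(value) <= max_length:
        mapped[f"{name}_s"] = value
    return mapped

phone = map_string_field("phone", "555-0100", duplicate_as_multi_valued=True)
name = map_text_field("name", "John Smith", advanced_indexing=True, max_length=100)
```

With both options enabled, phone carries the phone_s and phone_ss keys, and name carries the name_t and name_s keys.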

LWAI Vectorize Field

The LWAI Vectorize stage invokes a Lucidworks AI model to encode a string field to a vector representation. This stage is skipped if the field to encode doesn’t exist or is null on the pipeline document.

In the Label field, enter a unique identifier for this stage.
In the Condition field, enter a script that results in true or false, which determines if the stage should process.
In the Account Name field, select the Lucidworks AI API account name defined in Lucidworks AI Gateway.

If you do not see your account name or you are unsure which one to select, contact the Managed Fusion team at Lucidworks.
In the Model field, select the Lucidworks AI model to use for encoding.

If you do not see your model name or you are unsure which one to select, contact the Managed Fusion team at Lucidworks.

For more information, see:
- Pre-trained embedding models
- Custom embedding model training. To use a custom model, you must obtain the deployment ID from the deployments screen.
In the Source field, enter the name of the string field where the value should be submitted to the model for encoding. If the field is blank or does not exist, this stage is not processed. Template expressions are supported.
In the Destination field, enter the name of the field where the vector value from the model response is saved.
- If a value is entered in this field, the following information is added to the document:
  - {Destination Field}_b is a boolean value that indicates whether the vector has been indexed.
  - {Destination Field} is the vector field.
In the Use Case Configuration section, click the + sign to enter the parameter name and value to send to Lucidworks AI. The useCaseConfig parameter that is common to embedding use cases is dataType, but each use case may have other parameters. The value for the query stage is query.
The Model Configuration section is not currently available.
The Call Asynchronously? checkbox is not currently available.
Select the Fail on Error checkbox to generate an exception if an error occurs while generating a prediction for a document.
Click Save.
Index data using the new pipeline. Verify the vector field is indexed by confirming the field is present in documents.
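
The stage's effect on a document can be sketched in Python. This is a simplified illustration, not the stage's implementation: vectorize_field and fake_encode are hypothetical names, the encoder is a stand-in that makes no call to Lucidworks AI, and the behavior when Fail on Error is not selected (the document passes through without a vector) is an assumption.

```python
def vectorize_field(doc, source, destination, encode, fail_on_error=False):
    """Write a vector and its boolean marker to the document, or skip the stage."""
    text = doc.get(source)
    if text is None:
        # The source field doesn't exist or is null, so the stage is skipped.
        return doc
    try:
        vector = encode(text)
    except Exception:
        if fail_on_error:
            raise
        # Assumption: without Fail on Error, the document continues unchanged.
        return doc
    doc[destination] = list(vector)
    doc[f"{destination}_b"] = True
    return doc

def fake_encode(text):
    # Stand-in encoder for illustration; a real model returns an embedding.
    return [float(len(text)), float(text.count(" "))]

indexed = vectorize_field({"body_t": "hello world"}, "body_t", "body_vector", fake_encode)
skipped = vectorize_field({"id": "1"}, "body_t", "body_vector", fake_encode)
```

Here indexed gains the body_vector and body_vector_b fields, while skipped is returned unchanged because it has no body_t field.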

Solr Indexer

The Solr Indexer stage transforms a Managed Fusion pipeline document into a Solr document, and sends it to Solr for indexing into a collection.

To configure this stage for the index pipeline:

In the Label field, enter a unique identifier for this stage or leave blank to use the default value.
In the Condition field, enter a script that results in true or false, which determines if the stage should process, or leave blank.
Select the Map to Solr Schema checkbox to select and add static and dynamic fields to map in this stage.
Select the Add a field listing all document fields checkbox to add the _lw_fields_ss multi-valued field to the document, which lists all fields that are being sent to Solr.
In the Additional Date Formats section, enter date formats to include in this stage.
In the Additional Update Request Parameters section, enter the parameter names and values to update the request parameters.
Select the Buffer Documents and Send Them To Solr in Batches checkbox to process the documents in batches for this stage.
In the Buffer Size field, enter the number of documents in a batch before sending the batch to Solr. If no value is specified, the default value for this search cluster is used.
In the Buffer Flush Interval (milliseconds) field, enter the maximum number of milliseconds to hold the batch before sending the batch to Solr. If no value is specified, the default value for this search cluster is used.
Select the Allow expensive request parameters checkbox to allow commit=true and optimize=true to be passed to Solr when specified as request parameters coming into this pipeline. Document commands that specify commit or optimize are still respected even if this checkbox is not selected.
Select the Unmapped Fields Mapping checkbox to specify the information for all of the fields not mapped in the previous sections.
- In the Source Field, enter the name of the unmapped field to be mapped.
- In the Target Field, enter the name of the Solr field to which the unmapped field is mapped.
- In the Operation field, select how the field is mapped. The options are:
  - Add the unmapped field to the Solr field.
  - Copy the unmapped field to the Solr field and retain the value in the Source field.
  - Delete the unmapped field.
  - Keep the unmapped field and do not map it to a Solr field.
  - Move (replace) the Solr field value with the unmapped field Source value and remove the value from the Source field.
  - Set the value of the unmapped field to the value in the Solr field.
Click Save.
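
The interaction between Buffer Size and Buffer Flush Interval can be sketched in Python as follows. This is a generic batching sketch under stated assumptions, not the Solr Indexer's implementation: BatchBuffer, send, and tick are hypothetical names, and the interval is measured from the moment the first document enters an empty buffer.

```python
import time

class BatchBuffer:
    """Collect documents and send them in batches by size or by elapsed time."""

    def __init__(self, send, buffer_size, flush_interval_ms, clock=time.monotonic):
        self._send = send                       # callable that receives a list of docs
        self._size = buffer_size
        self._interval = flush_interval_ms / 1000.0
        self._clock = clock
        self._docs = []
        self._first_added = None

    def _expired(self):
        return self._clock() - self._first_added >= self._interval

    def add(self, doc):
        if not self._docs:
            self._first_added = self._clock()
        self._docs.append(doc)
        if len(self._docs) >= self._size or self._expired():
            self.flush()

    def tick(self):
        """Call periodically to flush a batch that has waited too long."""
        if self._docs and self._expired():
            self.flush()

    def flush(self):
        if self._docs:
            batch, self._docs = self._docs, []
            self._send(batch)

sent = []
buffer = BatchBuffer(sent.append, buffer_size=2, flush_interval_ms=1000)
buffer.add({"id": "1"})
buffer.add({"id": "2"})  # the buffer is full, so the batch is sent
```

A larger buffer size reduces the number of requests, while the flush interval bounds how long a partially filled batch waits before it is sent.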

Order the stages

For the pipeline to operate correctly, the stages must be in the following order:

Field Mapping
Solr Dynamic Field Name Mapping
LWAI Vectorize Field
Solr Indexer

When you have ordered the stages, click Save.
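
As a quick sanity check, the required order can be expressed as a small Python helper. This is a sketch only: it assumes you have the pipeline's stage names as a list of strings, and REQUIRED_ORDER and stages_in_required_order are hypothetical names rather than part of any Managed Fusion API.

```python
REQUIRED_ORDER = [
    "Field Mapping",
    "Solr Dynamic Field Name Mapping",
    "LWAI Vectorize Field",
    "Solr Indexer",
]

def stages_in_required_order(stage_names):
    """Return True if every required stage is present and in the required relative order."""
    stage_names = list(stage_names)
    positions = []
    for name in REQUIRED_ORDER:
        if name not in stage_names:
            return False
        positions.append(stage_names.index(name))
    return positions == sorted(positions)

ok = stages_in_required_order(REQUIRED_ORDER)
swapped = stages_in_required_order(
    ["Field Mapping", "LWAI Vectorize Field", "Solr Dynamic Field Name Mapping", "Solr Indexer"]
)
```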