Important: This feature is currently only available to clients who have contracted with Lucidworks for features related to Neural Hybrid Search and Lucidworks AI.
This feature is available starting in Fusion 5.9.5 and in all subsequent Fusion 5.9 releases.
LWAI Vectorize Field is a Fusion index pipeline stage that invokes a Lucidworks AI model to encode a string field to a vector representation.
To process documents in batches in Fusion 5.9.15 and later, use the LWAI Batch Vectorize index stage.

Use cases

LWAI Vectorize Field converts text into vectors as you index, so Fusion can understand meaning and not only keywords. That powers more relevant results in both B2B and B2C scenarios.

B2B use cases

By understanding the meaning behind a query, employees can more quickly find the right document or ticket. Here are some examples:
  • A query for “reset SSO after IdP change” surfaces the exact runbook even if the word “reset” is not included in that runbook.
  • Support tickets can be resolved more quickly by semantically matching new tickets to past resolutions and knowledge articles. A “VPN times out on hotel Wi-Fi” ticket can auto-suggest a workaround article plus several resolved cases.

B2C use cases

For consumer experiences, the LWAI Vectorize Field stage improves product and content discovery, increases click-through rates, and reduces zero-result searches by understanding natural-language intent. Here are some examples:
  • A search for “comfy summer shoes for walking” returns breathable walking sneakers even when product titles lack those exact words.
  • A search for “black backpack for travel” can return carry-on backpacks with laptop sleeves.

Set up stage

To use this stage, non-admin Fusion users must be granted the PUT,POST,GET:/LWAI-ACCOUNT-NAME/** permission in Fusion, where LWAI-ACCOUNT-NAME is the Lucidworks AI API Account Name defined in Lucidworks AI Gateway when this stage is configured. For more detailed information about configuring this stage, see Configure Neural Hybrid Search.

Configurable vector quantization method

In Fusion 5.9.13 and up, you can configure the vector quantization method. Quantization converts high-precision float vectors into compact 8-bit integer vectors, significantly lowering storage and compute costs. By default, no quantization is performed; you enable it by selecting a method.

To select the quantization method, go to Model Configuration in the stage configuration and enter the vectorQuantizationMethod parameter with the value for the desired method:

[Figure: Vector quantization method configuration in an LWAI pipeline stage]

Available methods are:
  • min-max creates tensors of embeddings and converts them to uint8 by normalizing them to the range [0, 255].
    This method loses precision when evaluated against non-quantized vectors.
    Test it against your data to see if the loss is acceptable.
  • max-scale finds the maximum absolute value along each embedding, normalizes the embeddings by scaling them to a range of -127 to 127, and returns the quantized embeddings as an 8-bit integer tensor.
    This method has no loss at the ten-thousandths place during evaluation against non-quantized vectors.
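
The two methods can be illustrated with a minimal Python sketch. This is not the Lucidworks AI implementation: the function names are hypothetical, and the sketch assumes that the minimum, maximum, and maximum absolute value are computed per embedding and that values are rounded to the nearest integer.

```python
def quantize_min_max(embedding):
    """Sketch of min-max: map floats to unsigned 8-bit values in [0, 255].

    Assumption: the range is taken per embedding.
    """
    lo, hi = min(embedding), max(embedding)
    if hi == lo:
        # A constant vector has no range to normalize; map everything to 0.
        return [0 for _ in embedding]
    scale = 255.0 / (hi - lo)
    return [int(round((x - lo) * scale)) for x in embedding]


def quantize_max_scale(embedding):
    """Sketch of max-scale: map floats to signed 8-bit values in [-127, 127].

    Assumption: each embedding is divided by its own maximum absolute value.
    """
    max_abs = max(abs(x) for x in embedding)
    if max_abs == 0:
        return [0 for _ in embedding]
    return [int(round(x / max_abs * 127)) for x in embedding]


# Illustrative values only, not real model output.
example = [-0.8, 0.0, 0.3, 0.4]
print(quantize_min_max(example))
print(quantize_max_scale(example))
```

Note that in this sketch min-max shifts the values so that the smallest component becomes 0, while max-scale keeps 0.0 at 0 and preserves the sign of each component.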

Matryoshka vector dimension reduction configuration

Vector dimension reduction makes vectors smaller than the default vector size of a model. This lessens the burden of storing large vectors while still achieving good quality. The technique is called Matryoshka Representation Learning (MRL).

To set the reduced vector size, go to Model Configuration in the stage configuration and enter the dimReductionSize parameter with the desired size:

[Figure: Matryoshka vector dimension reduction configuration in an LWAI pipeline stage]

The value can be any integer greater than 0 and less than or equal to the vector dimension of the model.
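
As a rough illustration of how an MRL-style reduction is commonly applied, the Python sketch below keeps the leading components of a vector and rescales the result to unit length. The function name is hypothetical, and both the truncation and the renormalization step are assumptions about typical MRL usage, not a description of what Lucidworks AI does internally. The validity check mirrors the range allowed for dimReductionSize.

```python
import math


def reduce_dimensions(embedding, dim_reduction_size):
    """Keep the first dim_reduction_size components, then L2-normalize."""
    if not isinstance(dim_reduction_size, int):
        raise ValueError("dim_reduction_size must be an integer")
    if not 0 < dim_reduction_size <= len(embedding):
        raise ValueError(
            "dim_reduction_size must be above 0 and at most the vector dimension"
        )
    truncated = embedding[:dim_reduction_size]
    norm = math.sqrt(sum(x * x for x in truncated))
    if norm == 0:
        # Nothing to rescale; return the truncated vector unchanged.
        return truncated
    return [x / norm for x in truncated]


# Illustrative 3-dimensional vector reduced to 2 dimensions.
print(reduce_dimensions([3.0, 4.0, 12.0], 2))
```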

Configuration

When entering configuration values in the UI, use unescaped characters, such as \t for the tab character. When entering configuration values in the API, use escaped characters, such as \\t for the tab character.
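
One way to read this rule, assuming the API request body is JSON, is that the backslash itself must be escaped when the value is serialized. The Python sketch below shows the effect with the standard json module. The field name exampleDelimiter is hypothetical and is not a documented parameter of this stage.

```python
import json

# What you would type in the UI: a backslash followed by the letter t.
ui_value = r"\t"

# Serializing to JSON escapes the backslash, giving \\t in the request body.
# "exampleDelimiter" is a made-up field name used only for illustration.
payload = json.dumps({"exampleDelimiter": ui_value})
print(payload)  # prints {"exampleDelimiter": "\\t"}

# Parsing the JSON recovers the same two-character value entered in the UI.
assert json.loads(payload)["exampleDelimiter"] == ui_value
```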