Managed Fusion 5.9.12 and later integrates with Lucidworks AI to perform chunking. Chunking helps your organization enhance search relevance and gives customers exact information without the need to scan large documents. It also improves the accuracy of AI assistants by delivering semantically rich training data in precise, context-aware pieces.

When you include chunking in your index pipeline, Lucidworks AI automatically splits large documents into smaller, more focused segments. This approach is especially powerful when paired with Neural Hybrid Search (NHS) to surface the most relevant chunks instead of entire documents.

Each chunk is a unique entity in the index that contains metadata and a reference to the original document. Each chunk also generates its own vector, resulting in multiple vectors that collectively represent a field from a single parent document. This is useful for large documents that exceed maximum vector dimensions.

The following use cases show how chunking can enhance the search experience:
  • Break product descriptions into focused chunks so customers can find relevant details faster.
  • Reduce support tickets by training AI assistants on semantically-segmented help articles for more accurate answers.
  • Split multimedia transcripts (like product videos or webinars) into meaningful chunks so customers can find answers in content they wouldn’t normally read.
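The chunk-per-entity model described earlier, where each chunk is its own indexed entity with metadata, a back-reference to the parent document, and its own vector, can be sketched in plain data. All field names and values below are hypothetical, chosen only for illustration:

```python
# Illustrative sketch only: each chunk is its own entity with metadata,
# a reference back to the parent document, and its own vector.
# Field names here are hypothetical, not Fusion's internal schema.
parent_doc = {
    "id": "doc-1",
    "title_t": "Warranty policy",
    "body_t": "Full warranty text ...",
}

chunks = [
    {
        "id": f"doc-1-chunk-{i}",
        "parent_id_s": parent_doc["id"],  # reference to the original document
        "chunk_text_t": text,
        "chunk_vector": [0.0] * 384,      # one vector per chunk (384 dims as an example)
    }
    for i, text in enumerate(
        ["Coverage lasts two years.", "Claims require a receipt."]
    )
]
```

Together, the chunk vectors collectively represent the parent document's body field, which is what allows a single large document to be represented by multiple vectors.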

Chunking stages

Chunking stages are the phases in document ingestion and processing where large documents are split into smaller, meaningful pieces before being embedded or indexed. These stages work together to make chunk-based search operations efficient. Based on the configuration you deploy, the stages assess and prepare content during indexing, translate user queries into vectors at search time, and combine semantic and lexical ranking to return accurate results from large or complex content collections. Chunking stages include the following:
  • LWAI Chunker Index asynchronous stage prepares documents for semantic retrieval at ingestion time, which can reduce manual processing and improve granular-level content searches. Specifically, the stage breaks large text documents into smaller, semantically meaningful chunks, vectorizes those chunks for NHS, and stores the vectors in Solr. For setup information, see Set up LWAI Chunker index pipeline stage.
  • LWAI Vectorize Query stage converts natural-language user queries into vectors that retrieve semantically similar chunks reflecting customer intent. Exact matching is not necessary for this stage to return highly relevant results. To accomplish this, the stage generates a vector based on the current query string (q parameter). For setup information, see Set up LWAI Vectorize query stage.
  • Chunking Neural Hybrid Query stage helps your organization balance semantic and exact-match (lexical) relevance to improve ranking quality while accommodating the two ways users enter search terms. The stage performs hybrid lexical-semantic searches using Solr. For setup information, see Set up Chunking Neural Hybrid Query pipeline stage.
You can also use the Async Chunking API, which asynchronously separates large pieces of text into chunks and returns the chunks and their associated vectors. This API gives your organization the option to perform chunking outside of standard Fusion pipelines, for custom applications and specialized processing workflows.

Chunking strategies

Selecting the correct chunking strategy helps your organization manage system performance and improves document and information retrieval while preserving content meaning. Chunking limits the context length to 512 tokens. Ideally, a chunk represents a complete thought or idea and is usually a sentence or two in length. Chunking should also balance computational efficiency: be careful not to generate too many chunks per document, because each chunk is represented by a vector of O(1000) floats, which can affect performance and resource usage.
There are limits to both the request and response payloads sent between Managed Fusion and the LWAI Chunker. Currently, Managed Fusion truncates the body of text sent to Lucidworks AI for chunking to 50,000 characters (roughly O(100) pages).
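If you pre-process content in a custom workflow, you may want to account for the truncation limit described above before submitting text. A minimal sketch, assuming the 50,000-character limit from this section (the helper function names are made up for illustration):

```python
# The 50,000-character limit comes from Managed Fusion's current truncation
# behavior; these helper names are illustrative, not a Fusion API.
LWAI_CHUNKER_CHAR_LIMIT = 50_000

def fits_chunker_limit(text: str) -> bool:
    """Check whether a body of text would be sent to the chunker untruncated."""
    return len(text) <= LWAI_CHUNKER_CHAR_LIMIT

def truncate_for_chunker(text: str) -> str:
    """Mirror Fusion's behavior by truncating anything over the limit."""
    return text[:LWAI_CHUNKER_CHAR_LIMIT]
```

For documents that exceed the limit, consider splitting them into multiple smaller documents upstream rather than losing the truncated tail.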
Chunking strategies define how the text is divided into chunks during a chunking stage or when used in the Async Chunking API. Chunking strategies include the following:
  • dynamic-newline chunker is most effective with line-based or minimally structured content that is already clearly separated. This strategy splits the provided text on all newline characters, then merges the resulting pieces while they remain under the maxChunkSize limit.
  • dynamic-sentence chunker splits the provided text into sentences while keeping content readable, so it can be easily searched and retrieved. Sentences are joined until they reach chunkSize. If overlapSize is provided, adjacent chunks overlap by that many sentences.
  • regex-splitter chunker splits highly structured text such as technical templates and forms, or documents that contain repeated patterns. This strategy uses the specified regular expression (regex), following the conventions of the Python re module.
  • semantic chunker is effective for highly detailed content. It splits text into sentences, encodes each sentence, and compares it to the chunk being built to determine whether they are similar enough to group together; similar sentences are merged, and the process continues with the remaining sentences.
  • sentence chunker splits text on sentences.
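To make the split-and-merge behavior of the first two strategies concrete, here is a rough Python sketch. The parameter names follow the maxChunkSize, chunkSize, and overlapSize keys mentioned above, but the merge logic is an approximation for intuition, not Lucidworks' implementation:

```python
import re

def dynamic_newline_chunks(text: str, max_chunk_size: int = 512) -> list[str]:
    """Approximation of the dynamic-newline strategy: split on newlines,
    then greedily merge adjacent pieces while the merged piece stays
    under max_chunk_size characters."""
    pieces = [p for p in text.split("\n") if p.strip()]
    chunks: list[str] = []
    for piece in pieces:
        if chunks and len(chunks[-1]) + 1 + len(piece) <= max_chunk_size:
            chunks[-1] += "\n" + piece
        else:
            chunks.append(piece)
    return chunks

def sentence_chunks(text: str, chunk_size: int = 2, overlap_size: int = 0) -> list[str]:
    """Approximation of the dynamic-sentence strategy: join sentences into
    groups of chunk_size; adjacent chunks overlap by overlap_size sentences."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    step = max(chunk_size - overlap_size, 1)
    return [
        " ".join(sentences[i:i + chunk_size])
        for i in range(0, len(sentences), step)
        if sentences[i:i + chunk_size]
    ]
```

For example, `sentence_chunks("First one. Second one. Third one. Fourth one.", chunk_size=2)` groups the four sentences into two chunks of two sentences each.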

Benefits of chunking

Chunking helps search because smaller pieces of text are easier to match accurately with a user’s query. When the AI searches through chunks instead of whole documents, it avoids irrelevant content and focuses only on the most relevant parts. This reduces noise, improves precision, and increases the chances of finding the exact answer or context the user needs.

Chunking helps retrieval-augmented generation (RAG) by giving the system smaller, focused pieces of information to choose from when answering a question. Instead of pulling in a whole document, RAG can retrieve just the chunks that are most relevant. This makes the answer more accurate because the model is only looking at the parts that actually match the question. It also reduces the chance of including unrelated or confusing content in the final response.

Chunking process diagram

This diagram displays how Fusion and Lucidworks AI work together to process, vectorize, and retrieve chunked content.

Prerequisites

These requirements ensure reliable operations to chunk, store, and process chunked data such as vectors and metadata. The collection's update request processor chain must include this processor: <processor class="solr.lw.MultiVectorsToChildDocsProcessorFactory" />. Here is an example of a default updateRequestProcessorChain in solrconfig.xml. After you add this processor, you must clear the collection and re-index.
<!-- FUSION NOTES: Fusion's default is not to use Solr's schemaless mode.
  Fusion uses the Solr Dynamic Field Name Mapping index pipeline stage to automatically convert all incoming fields to dynamic fields. Fusion's default schema supports this with multiple dynamic field rules.
  If Solr's schemaless mode is preferred, it must be configured in a new updateRequestProcessorChain and the "default" param below changed to false. -->
<updateRequestProcessorChain default="true">
    <processor class="solr.IgnoreCommitOptimizeUpdateProcessorFactory">
      <str name="responseMessage">Optimize requests are not allowed in this Solr instance.</str>
      <int name="statusCode">400</int>
      <bool name="ignoreOptimizeOnly">true</bool>
    </processor>
    <processor class="solr.UUIDUpdateProcessorFactory" />
    <processor class="solr.lw.MultiVectorsToChildDocsProcessorFactory" />
    <processor class="solr.LogUpdateProcessorFactory"/>
    <processor class="solr.DistributedUpdateProcessorFactory"/>
    <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
If your organization uses chunking in NHS, keyword search is combined with semantic understanding at the chunk level. This approach is very effective where only portions of documents contain the answers to user queries. You can set up NHS to index, rank, and retrieve documents based on a combination of lexical and chunked vectors. For more information, see the Lucidworks AI Async Chunking API.
If you have a Ray deployment, you can use the Local Chunker Index stage for chunking.
To set up the Lucidworks AI index and query stages, you must first configure a Lucidworks AI Gateway integration (see Configure A Lucidworks AI Gateway Integration). This guide also assumes that you’ve set up and configured a datasource.
Before you can use Lucidworks AI with Lucidworks Platform, you must configure the Lucidworks AI Gateway to provide a secure, authenticated integration between self-hosted Fusion and your hosted models. This configuration is done through a secret properties file that you can find in the Lucidworks Platform UI.
This feature is available starting in Fusion 5.9.5 and in all subsequent Fusion 5.9 releases.
Integrations are created for you by the Lucidworks team, but as a workspace owner you can configure those integrations with Lucidworks AI Gateway. Each account can have its own set of credentials and associated scopes, which define the operations it can perform. If configuration properties are not provided at the account level, default settings are used instead. To configure the Lucidworks AI Gateway, navigate to the megamenu and click Models.
  1. On the Integrations tab, click your integration. If you don’t see your integration, contact your Lucidworks representative.
  2. Download or copy the YAML code and paste it into a file called account.yaml. The file for a single integration should look similar to this one:
    lwai-gateway:
     lwai:
      credentials: |
         fusion.lwai.default.baseUrl: https://APPLICATION_ID.applications.lucidworks.com
         fusion.lwai.default.authEndpoint: https://identity.lucidworks.com/oauth2/XXXXXXXXXX/v1/token
         fusion.lwai.account[0].name: ACCOUNT_NAME
         fusion.lwai.account[0].scopes: machinelearning.predict,machinelearning.model
         fusion.lwai.account[0].clientId: *****
         fusion.lwai.account[0].clientSecret: *****
    
    For a configuration with multiple integrations, it should look like this:
    lwai-gateway:
     lwai:
      credentials: |
         fusion.lwai.default.authEndpoint: https://identity.lucidworks.com/oauth2/XXXXXXXXXX/v1/token
         fusion.lwai.account[0].baseUrl: https://APPLICATION_ID.applications.lucidworks.com
         fusion.lwai.account[0].name: ACCOUNT_NAME
         fusion.lwai.account[0].scopes: machinelearning.predict,machinelearning.model
         fusion.lwai.account[0].clientId: *****
         fusion.lwai.account[0].clientSecret: *****
    
         fusion.lwai.account[1].baseUrl: https://APPLICATION_ID2.applications.lucidworks.com
         fusion.lwai.account[1].name: ACCOUNT_NAME
         fusion.lwai.account[1].scopes: machinelearning.predict,machinelearning.model
         fusion.lwai.account[1].clientId: *****
         fusion.lwai.account[1].clientSecret: *****
    
    Non-admin users must have the following permissions to use Lucidworks AI integrations:
    PUT,POST,GET:/LWAI-ACCOUNT-NAME/** where LWAI-ACCOUNT-NAME must match the value of fusion.lwai.account[n].name in the integration YAML.
  3. Apply the file to your Fusion configuration file. For example:
    helm upgrade KUBERNETES_NAMESPACE lucidworks/fusion -f FUSION_VALUES.yaml
    
Setting up chunking in NHS is similar to a standard Configure Neural Hybrid Search implementation, except that the LWAI Chunker Stage replaces the LWAI Vectorize Field stage and the Chunking Neural Hybrid Query stage replaces the Neural Hybrid Query or Hybrid Query stages.
The following sections describe how to enable chunking in Managed Fusion:

Set up LWAI Chunker index pipeline stage

This stage asynchronously chunks and vectorizes data, and stores the resulting vectors in Solr. Asynchronous processing uses system resources efficiently and improves the relevance of results.
  1. Sign into Fusion, go to Indexing > Index Pipelines, then select an existing pipeline or create a new one.
  2. Click Add a new pipeline stage, then select LWAI Chunker Stage. For reference information, see LWAI Chunker Index Stage.
  3. In the Account Name field, select the Lucidworks AI API account name defined in Lucidworks AI Gateway.
  4. In the Chunking Strategy field, select the strategy to use. For example, sentence.
  5. In the Model for Vectorization field, select the Lucidworks AI model to use for encoding.
  6. In the Input context variable field, enter the variable in context to be used as input. This field supports template expressions.
  7. In the Source field, enter the name of the string field where the value should be submitted to the model for encoding. If the field is blank or does not exist, this stage is not processed. Template expressions are supported.
  8. In the Destination Field Name & Context Output field, enter the name of the field where the vector value from the model response is saved.
This field must contain chunk_vector and must be a dense vector field type. This field is used to populate two things with the prediction results:
  • The field name in the document that will contain the prediction
  • The name of the context variable that will contain the prediction
  9. In the Destination Field Name for Text Chunks (not the vectors) field, enter the field name that will contain the text chunks generated by the chunker. For example, body_chunks_ss.
  10. In the Chunker Configuration section, click the + sign to enter the parameter name and value for additional chunker keys to send to Lucidworks AI. For example, to limit the chunk size to two sentences, enter chunkSize and 2, respectively.
  11. In the Model Configuration section, click the + sign to enter the parameter name and value for additional model configurations to send to Lucidworks AI. Several modelConfig parameters are common to generative AI use cases.
  12. In the API Key field, enter the secret associated with the model. For example, for OpenAI models, the value would start with sk-.
  13. In the Maximum Asynchronous Call Tries field, enter the maximum number of attempts to issue an asynchronous Lucidworks AI API call. The default value is 3.
  14. Select the Fail on Error checkbox to generate an exception if an error occurs while generating a prediction for a document.
  15. Click Save.
Additional requirements for the stage are:
  • Use a V2 connector. Only V2 connectors work for this task; other options, such as PBL or V1 connectors, are not supported.
  • Remove the Apache Tika stage from your parser because it can cause datasource failures with the following error: “The following components failed: [class com.lucidworks.connectors.service.components.job.processor.DefaultDataProcessor : Only Tika Container parser can support Async Parsing.]”

Set up Solr Partial Update Indexer stage

Fusion’s asynchronous chunking process is optimized for efficiency and reliability. To achieve this, it leverages the Solr Partial Update Indexer stage and a single index pipeline visited twice.

Chunking workflow using the Solr Partial Update Indexer stage

Chunking is a multi-step process that, if run synchronously, can slow indexing while the pipeline waits for a response. Asynchronous processing, including partial updates, instead completes each step as efficiently as system performance allows. As a result, this asynchronous chunking workflow may not index the original document and the chunking data derived from it at the same time. The typical indexing process for chunking data is as follows:
  1. The document is ingested into the index pipeline and indexed in Solr.
  2. The chunking stage assesses the document and then generates multiple semantic chunks and vector embeddings.
  3. The chunks are processed asynchronously, and then returned to the same pipeline.
  4. The chunk data is written back to the existing Solr document. Because the chunk updates may not be processed in the original order, and occur after indexing the original document, the pipeline must allow incremental updates using the partial updater process.
  5. The Solr Partial Update Indexer processes documents and generates additional chunks and vector embeddings.
  6. Any updates are applied to the existing Solr document.
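The write-back in the workflow above relies on Solr atomic (partial) updates, in which each field carries an update operation such as "set" rather than a plain value, so chunk data can be added to an already-indexed document. A minimal sketch of such a payload, with illustrative document ID and field names borrowed from this page's examples:

```python
import json

def build_chunk_partial_update(doc_id: str, chunk_texts: list[str],
                               chunk_vectors: list[list[float]]) -> dict:
    """Build a Solr atomic-update document that writes chunk data back to an
    already-indexed parent document using the "set" operation.
    Field names are illustrative."""
    return {
        "id": doc_id,
        "body_chunks_ss": {"set": chunk_texts},
        "body_chunk_vector_384v": {"set": chunk_vectors},
    }

update = build_chunk_partial_update(
    "doc-1",
    ["First chunk of text.", "Second chunk of text."],
    [[0.1] * 384, [0.2] * 384],
)
# Solr's JSON update handler accepts an array of such documents.
payload = json.dumps([update])
```

Because only the listed fields are touched, the rest of the parent document is left intact, which is why the workflow tolerates chunk updates arriving after, and out of order with, the original document.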

Configure the Solr Partial Update Indexer stage

Complete these steps to ensure the incremental or partial updates for chunking are configured in the Solr Partial Update Indexer stage. These settings most closely mirror regular Solr indexing functionality.
  1. In the same pipeline, click Add a new pipeline stage, then select Solr Partial Update Indexer.
  2. Disable Map to Solr Schema. Disable schema mapping so the fields and vector embeddings added by the chunking stage are not filtered out.
  3. Disable Enable Concurrency Control. Chunking updates can be generated and submitted at any time asynchronously. If this field is not disabled, multiple asynchronous updates to the same document may fail.
  4. Disable Reject Update if Solr Document is not Present. Occasionally, a chunk update is processed before the original document is fully committed. If this is not disabled, the chunk update fails.
  5. Enable Process All Pipeline Doc Fields. Enable this field to ensure the new fields added by the chunking stages are included in the partial update request.
  6. Enable Allow reserved fields. Enable this field to ensure internal field names reserved for chunking workflows are processed correctly in the pipeline.
  7. Click Save.
When this stage is configured, index data using the new pipeline.
IMPORTANT: Other indexing workflows may require different settings for the Solr Partial Update Indexer stage. For example, if updates must reflect the most recent state, enable concurrency control. Or if each update includes the complete document state, partial updates aren’t necessary. For more information, see Solr Partial Update Indexer.

Set up LWAI Vectorize Query stage

This stage converts natural-language user queries into vectors that retrieve semantically-similar chunks that indicate customer intent. Exact matching is not necessary for this stage to return highly relevant results. To accomplish this, the stage generates a vector based on the current query string (q parameter).
  1. Go to Querying > Query Pipelines, then select an existing pipeline or create a new one.
  2. Click Add a new pipeline stage.
  3. Select LWAI Vectorize Query.
  4. In the Label field, enter a unique identifier for this stage.
  5. In the Condition field, enter a script that results in true or false, which determines if the stage should process.
  6. Select Asynchronous Execution Config if you want to run this stage asynchronously. If this field is enabled, complete the following fields:
    1. Select Enable Async Execution. Fusion automatically assigns an Async ID value to this stage. Change this to a more memorable string that describes the asynchronous stages you are merging, such as signals or access_control.
    2. Copy the Async ID value.
    For detailed information, see Asynchronous query pipeline processing.
  7. In the Account Name field, select the name of the Lucidworks AI account.
  8. In the Model field, select the Lucidworks AI model to use for encoding.
  9. In the Query Input field, enter the location from which the query is retrieved.
  10. In the Output context variable field, enter the name of the variable where the vector value from the response is saved.
  11. In the Use Case Configuration section, click the + sign to enter the parameter name and value to send to Lucidworks AI. The useCaseConfig parameter that is common to generative AI and embedding use cases is dataType, but each use case may have other parameters. The dataType value for the query stage is query.
  12. In the Model Configuration section, click the + sign to enter the parameter name and value to send to Lucidworks AI. Several modelConfig parameters are common to generative AI use cases. For more information, see Prediction API.
  13. Select the Fail on Error checkbox to generate an exception if an error occurs during this stage.
  14. Click Save.

Set up Chunking Neural Hybrid Query pipeline stage

This stage helps your organization balance semantic searches and exact-match (lexical) relevance to improve ranking quality while accommodating the two ways users enter search terms. The stage performs hybrid lexical-semantic searches using Solr.
  1. In the same query pipeline where you configured LWAI Vectorize Query stage, click Add a new pipeline stage, then select Chunking Neural Hybrid Query Stage. For reference information, see Chunking Neural Hybrid Query Stage.
  2. In the Lexical Query Input field, enter the location from which the lexical query is retrieved. For example, <request.params.q>. Template expressions are supported.
  3. In the Lexical Query Weight field, enter the relative weight of the lexical query. For example, 0.3. If this value is 0, no re-ranking will be applied using the lexical query scores.
  4. In the Lexical Query Squash Factor field, enter a value that will be used to squash the lexical query score. For this value, Lucidworks recommends entering the inverse of the lexical maximum score across all queries for the given collection.
  5. In the Vector Query Field, enter the name of the Solr field for k-nearest neighbor (KNN) vector search. For example, body_chunk_vector_384v.
  6. In the Vector Input field, enter the location from which the vector is retrieved. Template expressions are supported. For example, a value of <ctx.vector> evaluates the context variable resulting from the LWAI Vectorize Query stage.
  7. In the Vector Query Weight field, enter the relative weight of the vector query. For example, 0.7.
  8. In the Min Return Vector Similarity field, enter the minimum vector similarity value to qualify as a match from the Vector portion of the hybrid query.
  9. In the Min Traversal Vector Similarity field, enter the minimum vector similarity value to use when walking through the graph during the Vector portion of the hybrid query.
  10. Select the checkbox to enable the Compute Vector Similarity for Lexical-Only Matches setting. When enabled, this setting computes vector similarity scores for documents in lexical search results but not in the initial vector search results.
  11. Select the checkbox to enable the Block pre-filtering setting. When enabled, this setting prevents pre-filtering that can interfere with facets and cause other issues.
  12. Click Save.
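To build intuition for how the weight and squash-factor parameters in the steps above interact, here is an illustrative blend of a lexical score and a vector similarity. This is a simplification for intuition only, not Fusion's exact scoring formula. Note how a squash factor equal to the inverse of the collection's maximum lexical score, as recommended above, rescales lexical scores into roughly the same 0-1 range as vector similarities:

```python
def hybrid_score(lexical_score: float, vector_similarity: float,
                 lexical_weight: float = 0.3, vector_weight: float = 0.7,
                 squash_factor: float = 1.0,
                 min_return_similarity: float = 0.0) -> float:
    """Illustrative weighted blend of lexical and vector relevance.
    Not Fusion's actual implementation: a sketch of how the stage's
    parameters relate to each other."""
    if vector_similarity < min_return_similarity:
        vector_similarity = 0.0  # below the cutoff, no vector contribution
    squashed = lexical_score * squash_factor  # rescale raw lexical score
    return lexical_weight * squashed + vector_weight * vector_similarity

# With a maximum lexical score of ~20 across queries, squash_factor = 1/20
# puts both signals on a comparable 0-1 scale before weighting.
score = hybrid_score(lexical_score=15.0, vector_similarity=0.8, squash_factor=1 / 20)
```

Setting the lexical weight to 0, as noted in step 3, removes the lexical term entirely, leaving a purely vector-driven score.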

Validate chunking in the Query Workbench

Once configured, go to the Query Workbench to run some queries and check that vectorization and chunking are working properly before deploying to your production environment. If you facet on the vector query field (in this example, body_chunk_vector_384v), you see that your indexed documents have vectors.
If you have a large dataset with thousands of documents, set the vector field to stored=false. Storing vectors in Solr for that many documents can result in memory issues. Refer to the Solr documentation on override properties for more information.
If you facet by _lw_chunk_root, you see body_chunk_ss. In this example, the chunk size is limited to two sentences, so this document has 29 chunks of two sentences each.