The Lucidworks AI Async Chunking API asynchronously separates large pieces of text into smaller pieces, called chunks, and then returns the chunks and their associated vectors. Currently, the maximum text size allowed for input is approximately 1 MB.

Chunking can produce a significant number of chunks, especially with overlapping chunks or small chunk sizes, so there are limits on how many chunks and vectors can be generated. These limits depend on factors such as the dimension size of the embedding model and whether vector quantization is used.

The Async Chunking API contains two requests:
  • POST Request. This request submits text for a chunking strategy and model. Upon submission, the API responds with the following information:
    • chunkingId. A unique UUID for the submitted chunking task, which can be used later to retrieve the results.
    • status. The current state of the chunking task.
  • GET Request. This request retrieves the results of a previously submitted chunking request. You must provide the unique chunkingId received from the POST response. The API then returns the results of the chunking request associated with that chunkingId.
For more information, see the API specification.

Chunking strategies (chunkers)

There are five chunking strategies (chunkers) available in the Async Chunking API. Each chunker splits and processes submitted text differently.

Prerequisites

To use this API, you need:
  • The unique APPLICATION_ID for your Lucidworks AI application. For more information, see credentials to use APIs.
  • A bearer token generated with a scope value of machinelearning.predict. For more information, see Authentication API.
  • The CHUNKER and MODEL_ID fields for the use case request. The path is: /ai/async-chunking/CHUNKER/MODEL_ID. A list of supported models is returned in the Lucidworks AI Use Case API.

Common parameters and fields

Some parameters in the /ai/async-chunking/CHUNKER/MODEL_ID request are common to all of the Async Chunking API requests, such as the modelConfig parameter. Also referred to as hyperparameters, these fields set certain controls on the response. Refer to the API spec for more information.

Vector quantization

To process large chunks of text efficiently, Lucidworks recommends entering the appropriate value in the "modelConfig": "vectorQuantizationMethod" field to ensure that as much of the text as possible is chunked, even for large inputs. Quantized vectors are less resource-intensive to store and compute, which decreases index and query processing time. Because each quantized vector is smaller, more quantized vectors fit in the same amount of memory than full-precision vectors. For example, compare quantized vectors such as [1,0,2], [2,3,1], [6,0,0], [0,0,2] with full-precision vectors such as [0.012341,0.23434,0.01334], [0.5434,0.02134,0.05434], [0.76534,0.0953,0.1334], [0.398,0.38574,0.01384]. In a 5 MB memory budget, you might store 5000 quantized vectors but only 500 full-precision vectors, because each full-precision component takes more memory to store. (The figures 5 MB, 5000, and 500 are illustrative only.)

The following table specifies the number of chunks returned for a single request, based on vector dimension and the vector quantization setting.
Vector Dimension Size | Maximum Chunks Returned (Quantized Vector = true) | Maximum Chunks Returned (Quantized Vector = false)
32   | 40000 | 11000
64   | 22500 | 5800
128  | 12000 | 3000
256  | 6500  | 1500
384  | 4500  | 1000
512  | 3250  | 750
768  | 2250  | 500
1024 | 1700  | 380
1536 | 250   | 250
2048 | 850   | 190

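The memory arithmetic behind quantization can be sketched as follows. This is an illustrative calculation only: it assumes quantized vectors are stored as int8 (1 byte per dimension) and full-precision vectors as float32 (4 bytes per dimension); the actual storage format used by Lucidworks AI may differ.

```python
def vectors_per_budget(budget_bytes: int, dims: int, bytes_per_dim: int) -> int:
    """Return how many vectors of the given dimension fit in a memory budget."""
    return budget_bytes // (dims * bytes_per_dim)

budget = 5 * 1024 * 1024  # a hypothetical 5 MB budget
dims = 512

quantized = vectors_per_budget(budget, dims, 1)  # int8: 1 byte per dimension
full = vectors_per_budget(budget, dims, 4)       # float32: 4 bytes per dimension

# Under these assumptions, four times as many quantized vectors fit
# in the same budget as full-precision vectors.
print(quantized, full)
```
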
useCaseConfig

The "useCaseConfig": "dataType": "string" parameter is common to all of the Async Chunking API chunkers in the /ai/async-chunking/CHUNKER/MODEL_ID request. If you do not enter a value, the default of query is used. This optional parameter enables model-specific handling in the Async Chunking API to help improve model accuracy. Choose the dataType value that best aligns with the text sent to the Async Chunking API. The string values to use are:
  • "dataType": "query" for query text.
  • "dataType": "passage" for text in fields searched at query time.
The syntax example is:
"useCaseConfig": {
  "dataType": "query"
}

Unique parameters and fields

chunkerConfig

The parameters to configure each chunker are as follows:

dynamic-newline chunker

The dynamic-newline chunker splits the provided text on all newline characters, then merges the split pieces that fall under the maxChunkSize limit. If no chunkerConfig is passed, the default values shown below are used.
  • "chunkerConfig": "maxChunkSize" - This integer field defines the maximum token limit for a chunker. The default is 512 tokens, which matches the maximum context size of the Lucidworks-hosted embedding models.
    "chunkerConfig": {
      "maxChunkSize": 512
    }

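As a rough sketch of the strategy described above (not the service's implementation), the following splits on newlines and greedily merges pieces while staying under maxChunkSize. Token counting is approximated here by whitespace-separated words; the hosted service counts tokens with the model's tokenizer.

```python
def dynamic_newline_chunks(text: str, max_chunk_size: int = 512) -> list[str]:
    """Split text on newlines, then greedily merge adjacent pieces while
    the merged chunk stays within max_chunk_size tokens (approximated
    as whitespace-separated words)."""
    pieces = [p for p in text.split("\n") if p.strip()]
    chunks: list[str] = []
    current = ""
    for piece in pieces:
        candidate = (current + "\n" + piece) if current else piece
        if len(candidate.split()) <= max_chunk_size:
            current = candidate  # still fits: keep merging
        else:
            if current:
                chunks.append(current)
            current = piece      # start a new chunk
    if current:
        chunks.append(current)
    return chunks

text = "Down came the rain.\nAnd washed the spider out.\nOut came the sun."
# With an 8-word budget no two pieces fit together, so each line is its own chunk.
print(dynamic_newline_chunks(text, max_chunk_size=8))
```

Raising the budget changes the result: with max_chunk_size=12, the first two lines merge into one chunk.
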
dynamic-sentence chunker

The dynamic-sentence chunker splits the provided text into sentences. Sentences are joined until they reach the maxChunkSize. If overlapSize is provided, adjacent chunks overlap by that many sentences. Example:
  • Chunk 1: Sentence 1, Sentence 2, Sentence 3
  • Chunk 2: Sentence 3, Sentence 4, Sentence 5
  • Chunk 3: Sentence 5, Sentence 6, Sentence 7
If no chunkerConfig is passed, the default values shown below are used.
  • "chunkerConfig": "maxChunkSize" - This integer field defines the maximum token limit for a chunker. The default is 512 tokens, which matches the maximum context size of the Lucidworks-hosted embedding models.
    {
      "chunkerConfig": {
        "maxChunkSize": 512
      }
    }
    
  • "chunkerConfig": "overlapSize" - This integer field sets the number of sentences that can overlap between consecutive chunks. The default is 1 sentence for most configurations.
    "chunkerConfig": {
      "overlapSize": 1
    }

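The overlap pattern in the Chunk 1/2/3 example above can be sketched as follows. This is an illustrative sketch only, with tokens approximated as whitespace-separated words; the service uses the model's tokenizer.

```python
def dynamic_sentence_chunks(sentences: list[str],
                            max_chunk_size: int = 512,
                            overlap_size: int = 1) -> list[list[str]]:
    """Join sentences until adding another would exceed max_chunk_size
    tokens (approximated as whitespace words), then start the next chunk
    with the last overlap_size sentences of the previous chunk."""
    chunks: list[list[str]] = []
    i = 0
    while i < len(sentences):
        chunk: list[str] = []
        tokens = 0
        j = i
        while j < len(sentences):
            n = len(sentences[j].split())
            if chunk and tokens + n > max_chunk_size:
                break  # chunk is full; stop before exceeding the budget
            chunk.append(sentences[j])
            tokens += n
            j += 1
        chunks.append(chunk)
        if j >= len(sentences):
            break
        # Step back by overlap_size sentences, but always make progress.
        i = j - overlap_size if j - overlap_size > i else j
    return chunks

# Seven 2-word sentences with a 6-token budget reproduce the example:
# [S1,S2,S3], [S3,S4,S5], [S5,S6,S7].
sents = [f"Sentence {k}." for k in range(1, 8)]
print(dynamic_sentence_chunks(sents, max_chunk_size=6, overlap_size=1))
```
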
regex-splitter chunker

The regex-splitter chunker splits the submitted text based on the specified regex (regular expression), following the conventions of the Python re package. If no chunkerConfig is passed, the default configuration is used. For more information about re operations, see https://docs.python.org/3/library/re.html.
  • "chunkerConfig": "regex" - This field sets the regular expression used to split the provided text. For example, \\n.
    "chunkerConfig": {
      "regex": "\\n"
    }

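A minimal illustration of the splitting convention using Python's re module, which the regex-splitter chunker follows. Note that the "\\n" in the JSON payload reaches the service as the two characters \n, which re interprets as a newline.

```python
import re

text = "Down came the rain.\nAnd washed the spider out.\nOut came the sun."

# "\\n" in Python source is a backslash followed by n, just like the JSON
# payload; re.split interprets it as a newline and drops empty pieces.
chunks = [c for c in re.split("\\n", text) if c]
print(chunks)
```
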
semantic chunker

The semantic chunker creates chunks based on semantic similarity. Using the model defined in the request URL, the semantic chunker splits text into sentences, encodes each sentence, and then compares the sentence vector to the vector of the chunk being built to determine whether they are similar enough to group together. After merging a semantically similar sentence into a chunk, the semantic chunker re-encodes the merged chunk to get the vector it compares with the next sentence vector. This chunker is the slowest of all of the chunkers, even if you set the approximate field to true. If no chunkerConfig is passed, the default values shown below are used.
  • "chunkerConfig": "maxChunkSize" - This integer field defines the maximum token limit for a chunker. The default is 512 tokens, which matches the maximum context size of the Lucidworks-hosted embedding models.
    "chunkerConfig": {
      "maxChunkSize": 512
    }
  • "chunkerConfig": "overlapSize" - This integer field sets the number of sentences that can overlap between consecutive chunks. The default is 1 sentence for most configurations.
    "chunkerConfig": {
      "overlapSize": 1
    }
  • "chunkerConfig": "cosineThreshold" - This decimal field controls how similar a sentence must be to a chunk (based on cosine similarity), in order for the sentence to be merged into the chunk. This value is a decimal between 0 and 1. The default threshold is 0.5.
    "chunkerConfig": {
      "cosineThreshold": 0.5
    }
    
  • "chunkerConfig": "approximate" - If this boolean field is set to true, the semantic chunker does not re-encode the merged chunk to get its vector to compare with the next sentence vector. This greatly decreases processing time with little or no loss in result quality. However, even with the approximate field set to true, the semantic chunker is the slowest of all the chunkers. If this field is set to false, semantic chunking is, on average, 5 times slower than when set to true, with very minimal or no precision increase.
    "chunkerConfig": {
      "approximate": true
    }
    

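The merge decision described above can be sketched with a toy example. This is an illustrative sketch only: the toy_embed function is a hypothetical stand-in for the embedding model named in the request URL, and real vectors have hundreds of dimensions rather than two.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def semantic_chunks(sentences, embed, cosine_threshold=0.5, approximate=True):
    """Merge each sentence into the chunk being built when its vector is
    similar enough to the chunk's vector. With approximate=True the chunk
    keeps its existing vector instead of re-encoding the merged text after
    every merge, which is what makes exact mode slower."""
    chunks = [[sentences[0]]]
    chunk_vec = embed(sentences[0])
    for sent in sentences[1:]:
        vec = embed(sent)
        if cosine(chunk_vec, vec) >= cosine_threshold:
            chunks[-1].append(sent)
            if not approximate:
                # Exact mode: re-encode the merged chunk (one extra
                # embedding call per merge).
                chunk_vec = embed(" ".join(chunks[-1]))
        else:
            chunks.append([sent])
            chunk_vec = vec
    return chunks

# Hypothetical 2-d "embeddings" keyed on a topic word, for illustration.
def toy_embed(s: str) -> list[float]:
    return [1.0, 0.0] if "spider" in s else [0.0, 1.0]

sents = ["The spider climbed up.", "The spider fell down.", "The sun came out."]
print(semantic_chunks(sents, toy_embed, cosine_threshold=0.5))
```

The two spider sentences merge into one chunk; the sun sentence falls below the threshold and starts a new chunk.
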
sentence chunker

The sentence chunker splits text into sentences. If no chunkerConfig is passed, the default values shown below are used.
  • "chunkerConfig": "chunkSize" - This integer field sets the maximum number of sentences per chunk. The default is 5.
    "chunkerConfig": {
      "chunkSize": 5
    }
  • "chunkerConfig": "overlapSize" - This integer field sets the number of sentences that can overlap between consecutive chunks. The default is 1 sentence for most configurations.
    "chunkerConfig": {
      "overlapSize": 1
    }

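The sentence chunker's fixed-size window can be sketched as follows (an illustrative sketch, not the service's implementation): each chunk holds chunkSize sentences and begins overlapSize sentences before the end of the previous chunk.

```python
def sentence_chunks(sentences: list[str],
                    chunk_size: int = 5,
                    overlap_size: int = 1) -> list[list[str]]:
    """Fixed-size sliding window over sentences: each chunk holds
    chunk_size sentences and overlaps the previous chunk by overlap_size."""
    step = chunk_size - overlap_size
    chunks: list[list[str]] = []
    i = 0
    while i < len(sentences):
        chunks.append(sentences[i:i + chunk_size])
        if i + chunk_size >= len(sentences):
            break  # the final chunk reached the end of the text
        i += step
    return chunks

# Nine sentences, chunkSize 5, overlapSize 1: the second chunk starts at S5.
print(sentence_chunks([f"S{k}" for k in range(1, 10)]))
```
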
POST request

The following is an example of the POST request used by every chunker. Fields and values unique to each chunker are detailed in Unique parameters and fields. The possible values of the status of a request are:
  • SUBMITTED. The POST request was successful, and the response returned the chunkingId and status used by the GET request.
  • ERROR. An error occurred while processing the chunking request.
  • READY. The results associated with the chunkingId are available and ready to be retrieved.
  • RETRIEVED. The results associated with the chunkingId were returned successfully by a GET request.
curl --request POST \
  --url https://APPLICATION_ID.applications.lucidworks.com/ai/async-chunking/{CHUNKER}/{MODEL_ID} \
  --header 'Authorization: Bearer ACCESS_TOKEN' \
  --header 'Content-Type: application/json' \
  --data '{
    "batch": [
      {
        "text": "The itsy bitsy spider climbed up the waterspout.\nDown came the rain.\nAnd washed the spider out.\nOut came the sun.\nAnd dried up all the rain.\nAnd the itsy bitsy spider climbed up the spout again."
      }
    ],
    "useCaseConfig": {
      "dataType": "query"
    },
    "modelConfig": {
      "vectorQuantizationMethod": "max-scale"
    }
  }'

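The request body above can also be assembled programmatically before sending. This is a sketch only: the helper name is hypothetical, and the field names are taken from the example request; check the API specification for the full schema.

```python
import json

def build_chunking_payload(texts, data_type="query", quantization=None):
    """Assemble the JSON body for a POST to
    /ai/async-chunking/CHUNKER/MODEL_ID (hypothetical helper; field names
    follow the example request above)."""
    payload = {
        "batch": [{"text": t} for t in texts],
        "useCaseConfig": {"dataType": data_type},
    }
    if quantization:
        payload["modelConfig"] = {"vectorQuantizationMethod": quantization}
    return payload

body = build_chunking_payload(
    ["Down came the rain.\nAnd washed the spider out."],
    quantization="max-scale",
)
print(json.dumps(body, indent=2))
```
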
GET request

To retrieve the chunked results, use the chunkingId from the POST response in a GET request. The following is an example of the GET request used by every chunker.
curl --request GET \
  --url https://APPLICATION_ID.applications.lucidworks.com/ai/async-chunking/{CHUNKING_ID} \
  --header 'Authorization: Bearer ACCESS_TOKEN' \
  --header 'Content-type: application/json'