import requests

url = "https://application_id.applications.lucidworks.com/ai/tokenization/{MODEL_ID}"

payload = {
    "batch": [{"text": "Mr. and Mrs. Dursley and O'Malley, of number four, Privet Drive, were proud to say that they were perfectly normal, thank you very much"}],
    "useCaseConfig": {"dataType": "query or passage"},  # set to "query" or "passage"
    "modelConfig": {
        "vectorQuantizationMethod": "max-scale",
        "dimReductionSize": 256
    }
}

headers = {"Content-Type": "application/json"}
response = requests.post(url, json=payload, headers=headers)
print(response.text)

The tokenization request for the pre-trained and custom embedding use cases sends text to the specified embedding model (modelId) and returns the tokens in the format supported by that embedding model.
The authentication and authorization access token.
Content-Type: "application/json"
modelId: The name of the pre-trained or custom embedding model. For example, "e5-small-v2".
useCaseConfig: This optional parameter enables model-specific handling in the Prediction API to help improve model accuracy. Use the dataType value that best aligns with the text sent to the Prediction API.

The two string values to use for embedding models are:

"dataType": "query" for the query. For query-to-query pairing, best practice is to use dataType=query on both API calls.
"dataType": "passage" for fields searched at query time.

For example, if questions and answers from an FAQ are indexed, the value for the questions is "dataType": "query" and the value for the answers is "dataType": "passage".
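Following the FAQ example above, the two request payloads might be built like this. This is a minimal sketch: the question and answer texts are made-up placeholders, and each payload would be sent with requests.post exactly as in the code sample at the top of this page.

```python
# Hypothetical FAQ content for illustration only.

# FAQ questions are query-like text, so they use dataType "query".
question_payload = {
    "batch": [{"text": "How do I reset my password?"}],
    "useCaseConfig": {"dataType": "query"},
}

# FAQ answers are the fields searched at query time, so they use dataType "passage".
answer_payload = {
    "batch": [{"text": "Open Settings, choose Account, and select Reset Password."}],
    "useCaseConfig": {"dataType": "passage"},
}

print(question_payload["useCaseConfig"]["dataType"])  # query
print(answer_payload["useCaseConfig"]["dataType"])    # passage
```

Each payload would then be POSTed to the tokenization endpoint with the same headers shown in the request example above.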
vectorQuantizationMethod: Quantization converts float vectors into integer vectors, allowing byte vector search using 8-bit integers. Float vectors are precise, but they are costly to compute and store, especially as they grow in dimensionality. Converting the vector floats into integers after inference produces byte vectors that consume less memory and are faster to compute with, at minimal loss in accuracy or quality.

The following options are available:

The min-max method creates tensors of embeddings and converts them to uint8 by normalizing them to the range [0, 255].
The max-scale method finds the maximum absolute value along each embedding, normalizes the embeddings by scaling them to the range [-127, 127], and returns the quantized embeddings as an 8-bit integer tensor.

For example, "max-scale".
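As a minimal NumPy sketch of the two methods as described above (not the service's actual implementation; the per-embedding normalization axis is an assumption):

```python
import numpy as np

def min_max_quantize(embeddings: np.ndarray) -> np.ndarray:
    """Sketch of min-max: normalize each embedding to [0, 255] as uint8."""
    mn = embeddings.min(axis=-1, keepdims=True)
    mx = embeddings.max(axis=-1, keepdims=True)
    spread = np.where(mx - mn == 0, 1.0, mx - mn)  # avoid division by zero
    return np.round((embeddings - mn) / spread * 255.0).astype(np.uint8)

def max_scale_quantize(embeddings: np.ndarray) -> np.ndarray:
    """Sketch of max-scale: scale each embedding by its maximum absolute
    value into [-127, 127] and return an 8-bit integer tensor."""
    max_abs = np.abs(embeddings).max(axis=-1, keepdims=True)
    max_abs = np.where(max_abs == 0, 1.0, max_abs)  # avoid division by zero
    return np.round(embeddings / max_abs * 127.0).astype(np.int8)

floats = np.array([[0.5, -1.0, 0.25]])
print(max_scale_quantize(floats))  # [[  64 -127   32]]
print(min_max_quantize(floats))
```

Either way, each float embedding becomes an 8-bit integer vector of the same length, ready for byte vector search.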
dimReductionSize: Vector dimension reduction makes the default vector size of a model smaller. The purpose of this reduction is to lessen the cost of storing large vectors while still retaining most of the quality of the larger model. For example, 256.
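The reduction itself happens server-side. Purely to illustrate the idea, one common reduction technique (truncate-and-renormalize, assumed here for illustration only; the service's actual method is not specified on this page) looks like:

```python
import numpy as np

def reduce_dims(vec: np.ndarray, size: int = 256) -> np.ndarray:
    """Illustrative only: keep the first `size` components and re-normalize.
    This is NOT necessarily how the service reduces dimensions."""
    reduced = vec[:size]
    norm = np.linalg.norm(reduced)
    return reduced / norm if norm else reduced

full = np.random.rand(384)        # e5-small-v2 produces 384-dimensional vectors
small = reduce_dims(full, 256)
print(small.shape)  # (256,)
```

The reduced vector costs roughly a third less storage here, at the price of discarding the trailing components.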
OK
The array of tokens derived from the text submitted in the request.
For example:
"generatedTokens": [
{
"tokens": [
"[CLS]",
"query",
":",
"mr",
".",
"and",
"mrs",
".",
"du",
"##rs",
"##ley",
"and",
"o",
"'",
"malley",
",",
"of",
"number",
"four",
",",
"pri",
"##vet",
"drive",
",",
"were",
"proud",
"to",
"say",
"that",
"they",
"were",
"perfectly",
"normal",
",",
"thank",
"you",
"very",
"much",
".",
"[SEP]"
],
The number of tokens created from the text input into the model. For example, 40.
The number of tokens generated to prompt the model to continue generating results. For example, 148.
The number of tokens used until the model completes. This value is always zero (0).
The sum of the prompt and completion tokens used in the model. For example, 175.