# Models API

The Lucidworks AI Models API is used to manage custom models.

## Prerequisites

To use this API, you need:

* The unique `CUSTOMER_ID` for your organization.

  <Note>The Customer ID is not displayed on the Integrations screen. It is the unique value that identifies your organization, and is also required for API usage. Lucidworks will provide that value to your organization. The Customer ID is generated when the workspace is created and is unique to that workspace. It cannot be reset or changed.</Note>

* A bearer token generated with a scope value of `machinelearning.model`. For more information, see [Authentication API](/docs/lw-platform/lw-platform/authentication-api).

* Additional identifiers, such as `MODEL_ID` and `DEPLOYMENT_ID`, for certain operations.

  You can get these values in Lucidworks Platform. Go to **Models** > **Custom Models**, then click your model and go to **Deployment Details**.

## Training configuration

General and ecommerce recurrent neural network (RNN) models are supported.

For detailed information about training parameters and configuration, click **View API specification**.

### Training data format

The `catalog` and `signals` training data must share a primary key, `pkid`, that appears in both of the following files:

* The `index` file, which contains the documents or products that are searched.
* The `query` file, which contains query data associated with the index documents.
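The shared-key requirement can be checked before training starts. The following is a minimal sketch, assuming the rows of the `index` and `query` files have already been loaded into lists of dictionaries. The `missing_pkids` helper and every column name other than `pkid` are hypothetical and are not part of the API.

```python
def missing_pkids(index_rows, query_rows, key="pkid"):
    """Return pkid values used in the query data that have no match in the index data."""
    index_ids = {row[key] for row in index_rows}
    return sorted({row[key] for row in query_rows} - index_ids)


# Hypothetical rows standing in for the contents of the index and query files.
index_rows = [
    {"pkid": "p1", "name": "trail running shoes"},
    {"pkid": "p2", "name": "waterproof hiking boots"},
]
query_rows = [
    {"pkid": "p1", "query": "running shoes"},
    {"pkid": "p3", "query": "rain jacket"},
]

print(missing_pkids(index_rows, query_rows))  # ['p3']
```

An empty list means every query row points at a document that exists in the index data.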

### Text processors

The supported text processors are:

* **Word** uses pre-trained English word tokenization and embeddings. The general RNN model defaults to this processor. The ecommerce RNN model also uses this processor and fine-tunes the embeddings during training.

  This processor converts text to lowercase and splits numbers into single digits. Processing attempts to match misspelled words and out-of-vocabulary (OOV) words. The resulting vocabulary contains a maximum of 100,000 words.

  <Note>For a language other than English, use the applicable byte pair encoding (BPE) processor.</Note>
* **Byte pair encoding (BPE)** uses pre-trained BPE tokenization and embeddings. Each available pre-trained BPE model has different versions. The versions use the same token vectors, but have different vocabulary sizes:
  * `bpe_*_small` embeddings have up to 10,000 vocabulary tokens
  * `bpe_*_large` embeddings have up to 100,000 vocabulary tokens
  * `bpe_multi` multilingual embeddings have up to 320,000 vocabulary tokens
* **Custom** token embeddings, either word or BPE, are trained on the data provided during model training. Use this option if your content contains domain-specific vocabulary, or to train a model for an unsupported language. Embeddings training is language agnostic, but Lucidworks recommends custom BPE training for non-Latin languages and multilingual scenarios.

  To train custom token embeddings, set TextProcessor to one of the following:

  * `word_custom`, which trains word embeddings with a vocabulary of up to 100,000 words.
  * `bpe_custom`, which trains BPE embeddings with a vocabulary of up to 10,000 tokens. This text processor learns a custom tokenization function over your data, so the default vocabulary size of 10,000 is sufficient in most cases.
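To make the Word processor's normalization concrete, here is a rough sketch of the two steps described above: lowercasing and splitting numbers into single digits. It is an illustration only, not the processor's actual implementation. The function name and the whitespace tokenization are assumptions, and the misspelling and OOV matching are not modeled.

```python
import re


def normalize_like_word_processor(text):
    """Lowercase the text and split every number into single-digit tokens."""
    lowered = text.lower()
    # Surround each digit with spaces so "14" becomes the tokens "1" and "4".
    spaced = re.sub(r"(\d)", r" \1 ", lowered)
    # Whitespace tokenization is an assumption made for this sketch.
    return spaced.split()


print(normalize_like_word_processor("iPhone 14 Pro"))  # ['iphone', '1', '4', 'pro']
```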

## Models endpoint

The `/models` endpoint operations perform the following:

* `GET` returns a list of pre-trained and custom models.
* `POST` creates a custom model and starts a training job. The custom model cannot be modified after it is created.

### GET /models example

The request requires your unique `CUSTOMER_ID`. For more information, see [Credentials](/docs/lw-platform/lw-platform/authentication-api#fetch-the-access-token).

<CodeGroup>
  ```bash wrap Request theme={"dark"}
  curl --request GET \
    --url https://api.lucidworks.com/customers/CUSTOMER_ID/ai/models \
    --header 'Content-Type: application/json' \
    --header 'Authorization: Bearer ACCESS_TOKEN'
  ```

  ```json wrap Response theme={"dark"}
  [
    {
      "id": "text-encoder",
      "category": "pre-trained (shared)",
      "modelType": "text-encoder",
      "description": "This is the model description.",
      "state": "AVAILABLE"
    },
    {
      "id": "multilinguallm",
      "category": "pre-trained (shared)",
      "modelType": "multilinguallm",
      "description": "This is the model description.",
      "state": "AVAILABLE"
    },
    {
      "id": "1af001c0-cabc-4430-b3b1-c1d8f632e87a",
      "name": "ecommerce custom model name",
      "modelType": "ecommerce-rnn",
      "category": "CUSTOM",
      "description": "Custom model tuned for ecommerce training",
      "region": "us-iowa",
      "trainingData": {
        "catalog": "gs://ml-platform-model-parameters-us-iowa/customer/data/index.parquet",
        "signals": "gs://ml-platform-model-parameters-us-iowa/customer/data/query.parquet"
      },
      "config": {
        "dataset_config": "mlp_ecommerce",
        "trainer_config": "mlp_ecommerce_rnn",
        "trainer_config/text_processor_config": "word_en",
        "trainer_config.encoder_config.rnn_names_list": [
          null
        ],
        "trainer_config.encoder_config.rnn_units_list": [
          null
        ],
        "trainer_config.trn_batch_size": 0,
        "trainer_config.num_epochs": 1,
        "trainer_config.monitor_patience": 8,
        "trainer_config.encoder_config.emb_spdp": 0.3,
        "trainer_config.encoder_config.emb_trainable": true
      },
      "state": "string",
      "trainingStarted": "2019-08-24T14:15:22Z",
      "trainingCompleted": "2019-08-24T14:15:22Z",
      "createdBy": "string",
      "deployments": [
        {}
      ]
    }
  ]
  ```
</CodeGroup>
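The same request can be assembled in Python, and the response can be filtered down to custom models. This is a sketch under stated assumptions: `build_models_request` and `custom_models` are hypothetical helper names, the request is built but not sent, and the sample list is a trimmed copy of the example response rather than live data.

```python
import urllib.request

API_ROOT = "https://api.lucidworks.com"


def build_models_request(customer_id, access_token):
    """Build (but do not send) a GET /models request for the given customer."""
    return urllib.request.Request(
        f"{API_ROOT}/customers/{customer_id}/ai/models",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {access_token}",
        },
        method="GET",
    )


def custom_models(models):
    """Keep only the entries whose category is CUSTOM."""
    return [model for model in models if model.get("category") == "CUSTOM"]


# Trimmed copy of the example response, used here instead of a live call.
sample_models = [
    {"id": "text-encoder", "category": "pre-trained (shared)", "state": "AVAILABLE"},
    {"id": "multilinguallm", "category": "pre-trained (shared)", "state": "AVAILABLE"},
    {"id": "1af001c0-cabc-4430-b3b1-c1d8f632e87a", "category": "CUSTOM"},
]

request = build_models_request("CUSTOMER_ID", "ACCESS_TOKEN")
print(request.get_method(), request.full_url)
print([model["id"] for model in custom_models(sample_models)])
```

To send the request, pass it to `urllib.request.urlopen` and parse the body with `json.load`.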

### POST /models example

The request requires your unique `CUSTOMER_ID`. For more information, see [Credentials](/docs/lw-platform/lw-platform/authentication-api#fetch-the-access-token).

<CodeGroup>
  ```bash wrap Request theme={"dark"}
  curl --request POST \
    --url https://api.lucidworks.com/customers/CUSTOMER_ID/ai/models \
    --header 'Content-Type: application/json' \
    --header 'Authorization: Bearer ACCESS_TOKEN' \
    --data '{
    "name": "ecommerce custom model name",
    "modelType": "ecommerce-rnn",
    "region": "us-iowa",
    "trainingData": {
      "catalog": "gs://ml-platform-model-parameters-us-iowa/customer/data/index.parquet",
      "signals": "gs://ml-platform-model-parameters-us-iowa/customer/data/query.parquet"
    },
    "config": {
      "dataset_config": "mlp_ecommerce",
      "trainer_config": "mlp_ecommerce_rnn",
      "trainer_config/text_processor_config": "word_en",
      "trainer_config.encoder_config.rnn_names_list": [
        "gru"
      ],
      "trainer_config.encoder_config.rnn_units_list": [
        128
      ],
      "trainer_config.trn_batch_size": 0,
      "trainer_config.num_epochs": 1,
      "trainer_config.monitor_patience": 8,
      "trainer_config.encoder_config.emb_spdp": 0.3,
      "trainer_config.encoder_config.emb_trainable": true
    },
    "trainingDataCredentials": {
      "serviceAccountKey": "string"
    }
  }'
  ```

  ```json wrap Response theme={"dark"}
  {
    "id": "fb148491-b39e-46d1-af33-44cd964d8ee0",
    "name": "ecommerce custom model name",
    "modelType": "ecommerce-rnn",
    "category": "CUSTOM",
    "description": "Custom model tuned for ecommerce training",
    "region": "us-iowa",
    "trainingData": {
      "catalog": "gs://ml-platform-model-parameters-us-iowa/customer/data/index.parquet",
      "signals": "gs://ml-platform-model-parameters-us-iowa/customer/data/query.parquet"
    },
    "config": {
      "dataset_config": "mlp_ecommerce",
      "trainer_config": "mlp_ecommerce_rnn",
      "trainer_config/text_processor_config": "word_en",
      "trainer_config.encoder_config.rnn_names_list": [
        "gru"
      ],
      "trainer_config.encoder_config.rnn_units_list": [
        128
      ],
      "trainer_config.trn_batch_size": 0,
      "trainer_config.num_epochs": 1,
      "trainer_config.monitor_patience": 8,
      "trainer_config.encoder_config.emb_spdp": 0.3,
      "trainer_config.encoder_config.emb_trainable": true
    },
    "state": "string",
    "trainingStarted": "string",
    "trainingCompleted": "string",
    "createdBy": "string"
  }
  ```
</CodeGroup>
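When the request body is generated by a script, it can help to build it as a dictionary and serialize it once. The sketch below mirrors the ecommerce example above. The function and parameter names are hypothetical, and it sends only a subset of the `config` keys from the example. Whether the API accepts a partial `config` is an assumption to confirm against the API specification.

```python
import json


def build_training_request(name, catalog_uri, signals_uri, service_account_key,
                           region="us-iowa", epochs=1):
    """Assemble a POST /models body that mirrors the ecommerce example."""
    return {
        "name": name,
        "modelType": "ecommerce-rnn",
        "region": region,
        "trainingData": {"catalog": catalog_uri, "signals": signals_uri},
        "config": {
            "dataset_config": "mlp_ecommerce",
            "trainer_config": "mlp_ecommerce_rnn",
            "trainer_config/text_processor_config": "word_en",
            "trainer_config.num_epochs": epochs,
        },
        "trainingDataCredentials": {"serviceAccountKey": service_account_key},
    }


body = build_training_request(
    name="ecommerce custom model name",
    catalog_uri="gs://ml-platform-model-parameters-us-iowa/customer/data/index.parquet",
    signals_uri="gs://ml-platform-model-parameters-us-iowa/customer/data/query.parquet",
    service_account_key="string",  # placeholder, as in the example request
    epochs=3,
)
print(json.dumps(body, indent=2))
```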

## Model ID endpoint

The `/modelId` endpoint operation performs the following:

* `GET` returns information about a specific model.

### GET /modelId example

The request requires your unique `CUSTOMER_ID` and the specific `MODEL_ID` to return. For more information about `CUSTOMER_ID`, see [Credentials](/docs/lw-platform/lw-platform/authentication-api#fetch-the-access-token).

<CodeGroup>
  ```bash Request theme={"dark"}
  curl --request GET \
    --url https://api.lucidworks.com/customers/CUSTOMER_ID/ai/models/MODEL_ID \
    --header 'Content-Type: application/json' \
    --header 'Authorization: Bearer ACCESS_TOKEN'
  ```

  ```json Response standard model theme={"dark"}
  {
    "id": "text-encoder",
    "modelType": "text-encoder",
    "description": "This is the model description.",
    "state": "AVAILABLE"
  }
  ```

  ```json Response custom model theme={"dark"}
  {
    "id": "441eb3be-7de6-470a-8141-e416a15c7db1",
    "name": "ecommerce custom model name",
    "modelType": "ecommerce-rnn",
    "category": "CUSTOM",
    "description": "Custom model tuned for ecommerce training",
    "region": "us-iowa",
    "vectorSize": 256,
    "trainingData": {
      "catalog": "gs://ml-platform-model-parameters-us-iowa/customer/data/index.parquet",
      "signals": "gs://ml-platform-model-parameters-us-iowa/customer/data/query.parquet"
    },
    "config": {
      "dataset_config": "mlp_ecommerce",
      "trainer_config": "mlp_ecommerce_rnn",
      "trainer_config.num_epochs": 1
    },
    "state": "AVAILABLE",
    "trainingStarted": "2023-06-14T15:28:40.201Z",
    "trainingCompleted": "2023-06-14T15:36:55.320Z",
    "trainingMetrics": {
      "summary": {
        "best_epoch": 1,
        "index_size": 3885,
        "vector_size": 256,
        "training_time": 45.730143308639526,
        "num_trn_queries": 17730,
        "num_val_queries": 1969,
        "num_unique_training_pairs": 41380
      },
      "epoch_metrics": {
        "hit": {
          "trn": {
            "1": [
              0.22955815134586086
            ],
            "3": [
              0.4154393092940579
            ],
            "5": [
              0.5073641442356526
            ],
            "10": [
              0.6140172676485526
            ]
          },
          "val": {
            "1": [
              0.21736922295581512
            ],
            "3": [
              0.4245810055865922
            ],
            "5": [
              0.510411376333164
            ],
            "10": [
              0.6069070594210259
            ]
          }
        }
      }
    },
    "deployments": [
      {
        "id": "441eb3be-7de6-470a-8141-e416a15c7db1",
        "region": "us-southcarolina",
        "state": "DEPLOYED"
      }
    ]
  }
  ```
</CodeGroup>
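The `trainingMetrics` object in the custom model response is nested, so a small helper can flatten it for reporting. The sketch below assumes that each list under `epoch_metrics.hit` holds one value per epoch and that the keys `1`, `3`, `5`, and `10` are top-k cutoffs. Neither assumption is confirmed by this page, and `final_hit_rates` is a hypothetical name.

```python
def final_hit_rates(model, split="val"):
    """Map each cutoff k to the last recorded hit value for the given split."""
    per_k = model["trainingMetrics"]["epoch_metrics"]["hit"][split]
    return {int(k): values[-1] for k, values in per_k.items()}


# Validation metrics copied from the example response above.
model = {
    "trainingMetrics": {
        "epoch_metrics": {
            "hit": {
                "val": {
                    "1": [0.21736922295581512],
                    "3": [0.4245810055865922],
                    "5": [0.510411376333164],
                    "10": [0.6069070594210259],
                }
            }
        }
    }
}

for k, rate in sorted(final_hit_rates(model).items()):
    print(f"hit@{k}: {rate:.3f}")
```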

## Deployments endpoint

The `/deployments` endpoint operations perform the following:

* `GET` returns a list of custom model deployments. Pre-trained models are not returned in the response because they are deployed in all available regions.
* `POST` deploys a custom model.
* `DELETE` deletes a custom model deployment.

### GET /deployments example

The request requires your unique `CUSTOMER_ID`. For more information, see [Credentials](/docs/lw-platform/lw-platform/authentication-api#fetch-the-access-token).

<CodeGroup>
  ```bash wrap Request theme={"dark"}
  curl --request GET \
    --url https://api.lucidworks.com/customers/CUSTOMER_ID/ai/deployments \
    --header 'Content-Type: application/json' \
    --header 'Authorization: Bearer ACCESS_TOKEN'
  ```

  ```json wrap Response theme={"dark"}
  [
    {
      "id": "1af001c0-cabc-4430-b3b1-c1d8f632e87a",
      "modelId": "441eb3be-7de6-470a-8141-e416a15c7db1",
      "region": "us-southcarolina",
      "config": {
        "parameter_1": "value_1",
        "parameter_2": "value_2"
      },
      "minReplicas": 1,
      "maxReplicas": 1,
      "state": "DEPLOYED",
      "deployedAt": "2019-08-24T14:15:22Z",
      "createdBy": "string"
    },
    {
      "id": "6a092bd4-5098-466c-94aa-40bf68294303",
      "modelId": "441eb3be-7de6-470a-8141-e416a15c7db1",
      "region": "us-southcarolina",
      "minReplicas": 2,
      "maxReplicas": 4,
      "state": "DEPLOYED",
      "deployedAt": "2019-08-24T14:15:22Z",
      "createdBy": "string"
    }
  ]
  ```
</CodeGroup>
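Because one model can have several deployments, it is often convenient to group the response by `modelId`. The following is a minimal sketch that uses a trimmed copy of the example response. `deployments_by_model` is a hypothetical helper, not part of the API.

```python
from collections import defaultdict


def deployments_by_model(deployments):
    """Group deployment IDs by the model they serve."""
    grouped = defaultdict(list)
    for deployment in deployments:
        grouped[deployment["modelId"]].append(deployment["id"])
    return dict(grouped)


# Trimmed copy of the example response above.
sample_deployments = [
    {
        "id": "1af001c0-cabc-4430-b3b1-c1d8f632e87a",
        "modelId": "441eb3be-7de6-470a-8141-e416a15c7db1",
        "state": "DEPLOYED",
    },
    {
        "id": "6a092bd4-5098-466c-94aa-40bf68294303",
        "modelId": "441eb3be-7de6-470a-8141-e416a15c7db1",
        "state": "DEPLOYED",
    },
]

for model_id, deployment_ids in deployments_by_model(sample_deployments).items():
    print(model_id, deployment_ids)
```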

### POST /deployments example

The request requires your unique `CUSTOMER_ID`. For more information, see [Credentials](/docs/lw-platform/lw-platform/authentication-api#fetch-the-access-token).

<CodeGroup>
  ```bash wrap Request theme={"dark"}
  curl --request POST \
    --url https://api.lucidworks.com/customers/CUSTOMER_ID/ai/deployments \
    --header 'Content-Type: application/json' \
    --header 'Authorization: Bearer ACCESS_TOKEN' \
    --data '{
    "modelId": "441eb3be-7de6-470a-8141-e416a15c7db1",
    "region": "us-southcarolina",
    "minReplicas": 2,
    "maxReplicas": 4,
    "config": {
      "parameter_1": "value_1",
      "parameter_2": "value_2"
    }
  }'
  ```

  ```json wrap Response theme={"dark"}
  {
    "id": "118109e5-7ec5-42bb-834d-e3cd41bba65f",
    "modelId": "441eb3be-7de6-470a-8141-e416a15c7db1",
    "region": "us-southcarolina",
    "config": {
      "parameter_1": "value_1",
      "parameter_2": "value_2"
    },
    "minReplicas": 2,
    "maxReplicas": 4,
    "state": "DEPLOYING",
    "deployedAt": "2019-08-24T14:15:22Z",
    "createdBy": "string"
  }
  ```
</CodeGroup>
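The example response reports a `state` of `DEPLOYING`, while the list responses show `DEPLOYED`, so a client may want to wait for that transition. The sketch below polls a caller-supplied function until the target state appears. The `wait_for_state` helper, its parameters, and the retry interval are all assumptions. In practice `fetch_state` would call `GET /deployments` and read the `state` of the new deployment. Other states the API may return, such as failures, are not handled here.

```python
import time


def wait_for_state(fetch_state, target="DEPLOYED", attempts=10,
                   delay_seconds=30.0, sleep=time.sleep):
    """Call fetch_state() until it returns target, or give up after `attempts` tries."""
    state = None
    for attempt in range(attempts):
        state = fetch_state()
        if state == target:
            return state
        if attempt < attempts - 1:
            sleep(delay_seconds)
    raise TimeoutError(f"last state was {state!r}, expected {target!r}")


# Fake state source standing in for repeated GET /deployments calls.
states = iter(["DEPLOYING", "DEPLOYING", "DEPLOYED"])
print(wait_for_state(lambda: next(states), delay_seconds=0))  # DEPLOYED
```

Injecting `fetch_state` and `sleep` keeps the loop independent of any HTTP client, which also makes it easy to exercise without network access.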

### DELETE /deployments example

The request requires your unique `CUSTOMER_ID` and the specific `DEPLOYMENT_ID` for the model. For more information about `CUSTOMER_ID`, see [Credentials](/docs/lw-platform/lw-platform/authentication-api#fetch-the-access-token).

<CodeGroup>
  ```bash wrap Request theme={"dark"}
  curl --request DELETE \
    --url https://api.lucidworks.com/customers/CUSTOMER_ID/ai/deployments/DEPLOYMENT_ID \
    --header 'Content-Type: application/json' \
    --header 'Authorization: Bearer ACCESS_TOKEN'
  ```

  ```json wrap Response theme={"dark"}
  {
    "id": "441eb3be-7de6-470a-8141-e416a15c7db1",
    "modelId": "1af001c0-cabc-4430-b3b1-c1d8f632e87a",
    "region": "us-southcarolina",
    "config": {
      "parameter_1": "value_1",
      "parameter_2": "value_2"
    },
    "minReplicas": 2,
    "maxReplicas": 4,
    "state": "DELETING",
    "deployedAt": "2019-08-24T14:15:22Z",
    "createdBy": "string"
  }
  ```
</CodeGroup>

## Model ID Deployments endpoint

The `/modelId/deployments` endpoint operation performs the following:

* `GET` returns a list of deployments for a specific custom model.

### GET /modelId/deployments example

The request requires your unique `CUSTOMER_ID` and the specific `MODEL_ID` to return. For more information about `CUSTOMER_ID`, see [Credentials](/docs/lw-platform/lw-platform/authentication-api#fetch-the-access-token).

<CodeGroup>
  ```bash wrap Request theme={"dark"}
  curl --request GET \
    --url https://api.lucidworks.com/customers/CUSTOMER_ID/ai/models/MODEL_ID/deployments \
    --header 'Content-Type: application/json' \
    --header 'Authorization: Bearer ACCESS_TOKEN'
  ```

  ```json wrap Response theme={"dark"}
  [
    {
      "id": "441eb3be-7de6-470a-8141-e416a15c7db1",
      "modelId": "6a092bd4-5098-466c-94aa-40bf6829430",
      "region": "us-southcarolina",
      "config": {
        "parameter_1": "value_1",
        "parameter_2": "value_2"
      },
      "minReplicas": 1,
      "maxReplicas": 1,
      "state": "DEPLOYED",
      "deployedAt": "2019-08-24T14:15:22Z",
      "createdBy": "string"
    },
    {
      "id": "118109e5-7ec5-42bb-834d-e3cd41bba65f",
      "modelId": "d439fd0d-1edf-4982-b00c-51c94a5c0490",
      "region": "us-southcarolina",
      "config": {
        "parameter_1": "value_1",
        "parameter_2": "value_2"
      },
      "minReplicas": 2,
      "maxReplicas": 4,
      "state": "DEPLOYED",
      "deployedAt": "2019-08-24T14:15:22Z",
      "createdBy": "string"
    }
  ]
  ```
</CodeGroup>
