> ## Documentation Index
> Fetch the complete documentation index at: https://doc.lucidworks.com/llms.txt
> Use this file to discover all available pages before exploring further.

# Index Stages API

export const LwTemplate = ({title = "Key questions to get you started", icon = "sparkles", cta = "Powered by Agent Studio", linkHref = "https://lucidworks.com/demo/?utm_source=docs&utm_medium=referral&utm_campaign=docs_cta_ai"}) => {
  const [isLoaded, setIsLoaded] = useState(false);
  useEffect(() => {
    const timer = setTimeout(() => {
      setIsLoaded(true);
    }, 500);
    return () => clearTimeout(timer);
  }, []);
  return <div className="lw-template-container">
      <Card title={title} icon={icon}>
        {isLoaded && <span dangerouslySetInnerHTML={{
    __html: `<lw-template id="a029c1a9-28be-427e-b0e1-5d918920246a"></lw-template
            >`
  }} />}
        <Link href={linkHref} className="agent-studio-link text-left text-gray-600 gap-2 dark:text-gray-400 text-sm font-medium flex flex-row items-center hover:text-primary dark:hover:text-primary-light group-hover:text-primary group-hover:dark:text-primary-light">Powered by Lucidworks Agent Studio</Link>
      </Card>
    </div>;
};

[localhost link]: http://localhost:3000/docs/4/fusion-server/reference/api/indexing/index-stages-api

[mintlify link]: https://doc.lucidworks.com/docs/4/fusion-server/reference/api/indexing/index-stages-api

[old doc.lw link]: https://doc.lucidworks.com/fusion/5.9/349

**API Objective:**
Each step of a pipeline.

The Index Stages API provides endpoints to:

* List index stage configuration properties
* Manage index stage instances
* Test processing on a set of queries

An index pipeline is comprised of index stages.
Each index stage has a name and a type.
The name identifies the stage instance, and the type identifies its class.
Every stage type has a number of properties, which can be configured for a particular index stage instance.
See the section [Index Pipeline Stages](/docs/4/fusion-server/concepts/indexing/datasources/index-pipeline-stages) for a taxonomy of index stage types.

<LwTemplate />

## Examples

*See all defined index pipeline stages, regardless of type:*

**REQUEST**

```bash wrap  theme={"dark"}
curl -u USERNAME:PASSWORD https://FUSION_HOST:8764/api/index-stages/instances
```

**RESPONSE**

```json wrap  theme={"dark"}
[{
  "type" : "tika-parser",
  "id" : "conn_tika",
  "includeImages" : true,
  "flattenCompound" : false,
  "addFailedDocs" : true,
  "addOriginalContent" : true,
  "skip" : false
},
{
  "type" : "index-logging",
  "id" : "detailed-logging",
  "detailed" : true,
  "skip" : false,
  "label" : "detailed-index-logging",
}]
```

*See details of an index-stage named 'conn\_tika':*

**REQUEST**

```bash wrap  theme={"dark"}
curl -u USERNAME:PASSWORD https://FUSION_HOST:8764/api/index-stages/instances/conn_tika
```

**RESPONSE**

```json wrap  theme={"dark"}
{
  "type" : "tika-parser",
  "id" : "conn_tika",
  "includeImages" : true,
  "flattenCompound" : false,
  "addFailedDocs" : true,
  "addOriginalContent" : true,
  "skip" : false
}
```

*Create an index stage:*

**REQUEST**

```bash wrap  theme={"dark"}
curl -u USERNAME:PASSWORD -X POST -H 'Content-type: application/json' -d '{"id": "storagesize-regex-extractor", "type":"regex-extractor", "rules": [{"source":["name"], "target":"storage_size_ss", "pattern":"(\\d{1,20}\\s{0,3}(GB|MB|TB|KB|mb|gb|tb|kb))", "annotateAs":"storage_size"}]}' https://FUSION_HOST:8764/api/index-stages/instances
```

**RESPONSE**

```json wrap  theme={"dark"}
{
  "type" : "regex-extractor",
  "id" : "storagesize-regex-extractor",
  "rules" : [ {
    "source" : [ "name" ],
    "target" : "storage_size_ss",
    "pattern" : "(\\d{1,20}\\s{0,3}(GB|MB|TB|KB|mb|gb|tb|kb))",
    "annotateAs" : "storage_size"
  } ],
  "skip" : false
}
```

*Delete an index stage:*

**REQUEST**

```bash wrap  theme={"dark"}
curl -u USERNAME:PASSWORD -X DELETE https://FUSION_HOST:8764/api/index-stages/instances/storagesize-regex-extractor
```

No response is returned. To check that the stage is no longer defined, list all index stage instances.

*Send a document through the index stage named 'conn\_tika':*

**REQUEST**

```bash wrap  theme={"dark"}
curl -u USERNAME:PASSWORD -X POST -H "Content-Type: application/json" -d '[{"id": "myDoc4","fields": [{"name":"title", "value": "Another little document document"},{"name":"body", "value": "This is a simple document."}]}]' https://FUSION_HOST:8764/api/index-stages/instances/conn_tika/docs/test
```

**RESPONSE**

```json wrap  expandable  theme={"dark"}
[ {
  "id" : "7b8a1d5b-9e42-40eb-8059-5804c4b4fc6b",
  "fields" : [ {
    "name" : "id",
    "value" : "myDoc4",
    "metadata" : { },
    "annotations" : [ ]
  }, {
    "name" : "parsing_time",
    "value" : [ "java.lang.Long", 0 ],
    "metadata" : { },
    "annotations" : [ ]
  }, {
    "name" : "parsing",
    "value" : "no_raw_data",
    "metadata" : {
      "creator" : "tika-parser"
    },
    "annotations" : [ ]
  }, {
    "name" : "fields",
    "value" : [ "java.util.ArrayList", [ {
      "name" : "title",
      "value" : "Another little document document"
    }, {
      "name" : "body",
      "value" : "This is a simple document."
    } ] ],
    "metadata" : { },
    "annotations" : [ ]
  } ],
  "metadata" : { },
  "commands" : [ ]
} ]
```

*View the configuration properties for index stage type "regex-extractor":*

**REQUEST**

```bash wrap  theme={"dark"}
curl -u USERNAME:PASSWORD https://FUSION_HOST:8764/api/index-stages/schema/regex-extractor
```

**RESPONSE**

```json wrap  theme={"dark"}
{
  "type" : "object",
  "title" : "Regex Field Extraction",
  "description" : "This stage allows you to extract entities using regular expressions",
  "properties" : {
    "rules" : {
      "type" : "array",
      "title" : "Regex Rules",
      "items" : {
        "type" : "object",
        "required" : [ "pattern" ],
        "properties" : {
          "source" : {
            "type" : "array",
            "title" : "Source Fields",
            "items" : {
              "type" : "string"
            }
          },
          "target" : {
            "type" : "string",
            "title" : "Target Field"
          },
          "pattern" : {
            "type" : "string",
            "title" : "Regex Pattern",
            "format" : "regex"
          },
          "annotateAs" : {
            "type" : "string",
            "title" : "Annotation Name"
          }
        }
      }
    }
  }
}
```
