
# V1 and V2 Connectors

There are two types of frameworks for Fusion connectors: V1 (also referred to as classic or built-in) and V2 (also known as plugin).

## V1 (Classic) connectors

V1 connectors are developed with Anda, a general-purpose crawler framework created by Lucidworks. Anda simplifies and streamlines crawler development, reducing the task of developing a new crawler to providing access to your data.

In Fusion 5, V1 connectors are included in the Fusion image. You can install or update a connector at any time through the UI (under Datasources).

<Accordion title="Install or update a connector - Fusion 5">
  When you create a new datasource that requires an uninstalled connector, Fusion releases 5.2 and later automatically download and install the connector using the [Datasources dropdown](#install-a-connector-using-the-datasources-dropdown). You can also update the connector using the [Blob Store UI](#install-or-update-a-connector-using-the-blob-store-ui) or via the [Connector API](#install-or-update-a-connector-using-the-connector-api).

  ## Install a connector using the Datasources dropdown

  1. In your Fusion app, navigate to **Indexing** > **Datasources**.
  2. Click **Add**.
  3. In the list of connectors, scroll down to the connectors marked **Not Installed** and select the one you want to install.\
     Fusion automatically downloads it and moves it to the list of installed connectors.

  After you install a connector, you can configure a new datasource.

  <Tip>You can view and download all current and previous V2 connector releases at [Download Connectors](/docs/fusion-connectors/downloads/v2-connectors-downloads).</Tip>

  ## Install or update a connector using the Blob Store UI

  1. Download the connector zip file from [Download V2 connectors](/docs/fusion-connectors/downloads/v2-connectors-downloads).
     <Note>Do not expand the archive; Fusion consumes it as-is.</Note>
  2. In your Fusion app, navigate to **System** > **Blobs**.
  3. Click **Add**.
  4. Select **Connector Plugin**.
       <img src="https://mintcdn.com/lucidworks/5yWZ-KtZuBe4Y_Fg/assets/images/4.0/blobs-add-connector.png?fit=max&auto=format&n=5yWZ-KtZuBe4Y_Fg&q=85&s=c021b250a83011d5f0af0dacd2d9a2fb" alt="Add a connector" width="2448" height="1042" data-path="assets/images/4.0/blobs-add-connector.png" />
     The "New Connector Plugin Upload" panel appears.
  5. Click **Choose File** and select the downloaded zip file from your file system.
       <img src="https://mintcdn.com/lucidworks/5yWZ-KtZuBe4Y_Fg/assets/images/4.0/blobs-connector-upload.png?fit=max&auto=format&n=5yWZ-KtZuBe4Y_Fg&q=85&s=8088bf62ac29e56ec5d67192a9a5b474" alt="Upload a connector" width="2454" height="1029" data-path="assets/images/4.0/blobs-connector-upload.png" />
  6. Click **Upload**.
     The new connector’s blob manifest appears.
       <img src="https://mintcdn.com/lucidworks/5yWZ-KtZuBe4Y_Fg/assets/images/4.0/blobs-edit.png?fit=max&auto=format&n=5yWZ-KtZuBe4Y_Fg&q=85&s=3e9922a16d5c702f2efa9cb8be63cb0e" alt="Uploaded connector" width="2454" height="1030" data-path="assets/images/4.0/blobs-edit.png" />
     From this screen you can also delete or replace the connector.

  <Warning>
    Wait several minutes for the connector to finish uploading to the blob store before installing the connector using the [Datasources dropdown](#install-a-connector-using-the-datasources-dropdown).
  </Warning>

  ## Install or update a connector using the Connector API

  1. Download the connector zip file from [Download V2 connectors](/docs/fusion-connectors/downloads/v2-connectors-downloads).

     <Note>Do not expand the archive; Fusion consumes it as-is.</Note>
  2. Upload the connector zip file to Fusion’s plugins.
     Specify a `pluginId` as in this example:
     ```
     curl -H 'content-type:application/zip' -u USERNAME:PASSWORD -X PUT 'https://FUSION_HOST:FUSION_PORT/api/connectors/plugins?id=lucidworks.{pluginId}' --data-binary @{plugin_path}.zip
     ```
     Fusion automatically publishes the event to the cluster, and the listeners perform the connector installation process on each node.
       <Tip>
         If the `pluginId` is identical to an existing one, the old connector will be uninstalled and the new connector will be installed in its place. To get the list of existing plugin IDs, run: `curl -u USERNAME:PASSWORD https://FUSION_HOST:FUSION_PORT/api/connectors/plugins`
       </Tip>
  3. Look in `https://FUSION_HOST:FUSION_PORT/apps/connectors/plugins/` to verify the new connector is installed.
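
  The upload call in step 2 can also be expressed in Java. The following is a minimal sketch, assuming Java 11 or later and its built-in `java.net.http` package. The class name, method name, host, and plugin ID are hypothetical placeholders; only the endpoint path, the `id=lucidworks.{pluginId}` query parameter, the `PUT` method, the zip content type, and basic authentication are taken from the `curl` example. The sketch only builds the request and does not send it.

  ```java
  import java.net.URI;
  import java.net.http.HttpRequest;
  import java.nio.charset.StandardCharsets;
  import java.util.Base64;

  class PluginUploadRequestSketch {

    // Builds (but does not send) a PUT request equivalent to the curl example.
    // All argument values are supplied by the caller; none are Fusion defaults.
    static HttpRequest buildUpload(String baseUrl, String pluginId,
                                   String user, String password, byte[] zipBytes) {
      // curl -u USERNAME:PASSWORD corresponds to an HTTP Basic Authorization header.
      String credentials = Base64.getEncoder()
          .encodeToString((user + ":" + password).getBytes(StandardCharsets.UTF_8));
      return HttpRequest.newBuilder()
          .uri(URI.create(baseUrl + "/api/connectors/plugins?id=lucidworks." + pluginId))
          .header("Content-Type", "application/zip")
          .header("Authorization", "Basic " + credentials)
          // The zip archive is sent unmodified as the request body.
          .PUT(HttpRequest.BodyPublishers.ofByteArray(zipBytes))
          .build();
    }

    public static void main(String[] args) {
      // Placeholder host, plugin ID, credentials, and payload for illustration only.
      HttpRequest request = buildUpload(
          "https://fusion.example.com:6764", "demo-plugin", "USERNAME", "PASSWORD", new byte[0]);
      System.out.println(request.method() + " " + request.uri());
    }
  }
  ```

  To send the request, pass it to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` and read the zip bytes from the downloaded file with `Files.readAllBytes`.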

  ## Reinstall a connector

  To reinstall a connector for any reason, first delete the connector, then use the preceding steps to install it again. This may take a few minutes to complete, depending on how quickly the pods are deleted and recreated.
</Accordion>

## V2 (Plugin) connectors

Fusion 4.2 supports V2 connectors, which utilize a Java SDK framework.
Fusion V2 connectors are installed via Datasources in the UI or by using the [Connector Plugins Repository API](/api-reference/connector-plugins-repository-api/list-all-plugins-in-repository).
In addition to the features and benefits provided by V1 connectors, V2 connectors offer:

* Updates and improvements delivered separately from Fusion releases. Update a V2 connector by installing the [latest plugin version](/docs/fusion-connectors/downloads/v2-connectors-downloads).
* [Security Access-control Lists (ACL)](/docs/fusion-connectors/connectors/ad-acl-ldap) which are separate from content.
* Improved scalability. Jobs can be scaled by simply adding instances of the connector. The fetching process supports distributed fetching, allowing many instances to contribute to the same job.
* The ability to develop a custom connector.

<Accordion title="Develop a Custom Connector">
  <a name="java-sdk-configuration" />

  ## Java SDK configuration

  To build a valid connector configuration, you must:

  * Define an interface.
  * Extend `ConnectorConfig`.
  * Apply a few annotations.
  * Define connector methods and annotations.

  All methods that are annotated with `@Property` are considered to be configuration properties.
  For example, `@Property() String name();` results in a String property called `name`.
  This property would then be present in the generated schema.

  Here is an example of the most basic configuration, along with required annotations:

  ```java theme={"dark"}
  @RootSchema(
      title = "My Connector",
      description = "My Connector description",
      category = "My Category"
  )
  public interface MyConfig extends ConnectorConfig<MyConfig.Properties> {
    @Property(
        title = "Properties",
        required = true
    )
    public Properties properties();
    /**
      * Connector specific settings
      */
    interface Properties extends FetcherProperties {
      @Property(
          title = "My custom property",
          description = "My custom property description"
      )
      public Integer myCustomProperty();
    }
  }
  ```

  The metadata defined by `@RootSchema` is used by Fusion when showing the list of available connectors.
  The `ConnectorConfig` base interface represents common, top-level settings required by all connectors.
  The type parameter of the `ConnectorConfig` interface indicates the interface to use for custom properties.

  Once a connector configuration has been defined, it can be associated with the `ConnectorPlugin` class.
  From that point, the framework takes care of providing the configuration instances to your connector.
  It also generates the schema and sends it to Fusion when the connector connects.
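
  The idea that annotated interface methods become schema properties can be illustrated with plain Java reflection. The following is a minimal sketch, not the SDK's implementation: the `Property` annotation, the `DemoConfig` interface, and the `describe` method are hypothetical stand-ins defined only for this example.

  ```java
  import java.lang.annotation.ElementType;
  import java.lang.annotation.Retention;
  import java.lang.annotation.RetentionPolicy;
  import java.lang.annotation.Target;
  import java.lang.reflect.Method;
  import java.util.Map;
  import java.util.TreeMap;

  class PropertyScanSketch {

    // Hypothetical stand-in for the SDK's @Property annotation.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Property {
      String title() default "";
    }

    // Hypothetical configuration interface used only for this sketch.
    interface DemoConfig {
      @Property(title = "Name")
      String name();

      @Property(title = "Batch size")
      Integer batchSize();

      // Not annotated, so it is not treated as a configuration property.
      String helper();
    }

    // Maps each @Property method name to the simple name of its return type.
    static Map<String, String> describe(Class<?> configType) {
      Map<String, String> properties = new TreeMap<>();
      for (Method method : configType.getMethods()) {
        if (method.isAnnotationPresent(Property.class)) {
          properties.put(method.getName(), method.getReturnType().getSimpleName());
        }
      }
      return properties;
    }

    public static void main(String[] args) {
      // Prints {batchSize=Integer, name=String}
      System.out.println(describe(DemoConfig.class));
    }
  }
  ```

  The unannotated `helper` method is skipped, which mirrors the rule that only methods annotated with `@Property` are considered configuration properties.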

  Schema metadata can be applied to properties using additional annotations. For example, applying limits to the min/max length of a string, or describing the types of items in an array.

  Nested schema metadata can also be applied to a single field by using "stacked" schema based annotations:

  ```java theme={"dark"}
  interface MySetConfig extends Model {
    @SchemaAnnotations.Property(title = "My Set")
    @SchemaAnnotations.ArraySchema(defaultValue = "[\"a\"]")
    @SchemaAnnotations.StringSchema(defaultValue = "some-set-value", minLength = 1, maxLength = 1)
    Set<String> mySet();
  }
  ```
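
  To make the effect of length limits concrete, the following sketch checks a value against `minLength` and `maxLength` metadata read from an annotation. It is an illustration under stated assumptions: the `StringSchema` annotation here is a hypothetical stand-in that borrows only the attribute names from the example above, and the `isValid` method is not part of the SDK. How and when Fusion enforces these limits is not shown here.

  ```java
  import java.lang.annotation.ElementType;
  import java.lang.annotation.Retention;
  import java.lang.annotation.RetentionPolicy;
  import java.lang.annotation.Target;
  import java.lang.reflect.Method;

  class StringLimitSketch {

    // Hypothetical stand-in annotation carrying string length limits.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface StringSchema {
      int minLength() default 0;
      int maxLength() default Integer.MAX_VALUE;
    }

    // Hypothetical configuration interface used only for this sketch.
    interface DemoConfig {
      @StringSchema(minLength = 1, maxLength = 8)
      String label();
    }

    // Returns true when the value satisfies the limits declared on the property.
    static boolean isValid(Class<?> configType, String propertyName, String value) {
      Method method;
      try {
        method = configType.getMethod(propertyName);
      } catch (NoSuchMethodException e) {
        throw new IllegalArgumentException("Unknown property: " + propertyName, e);
      }
      StringSchema limits = method.getAnnotation(StringSchema.class);
      if (limits == null) {
        return true; // No limits declared, so any value is accepted.
      }
      int length = value.length();
      return length >= limits.minLength() && length <= limits.maxLength();
    }

    public static void main(String[] args) {
      System.out.println(isValid(DemoConfig.class, "label", "web")); // within 1..8
      System.out.println(isValid(DemoConfig.class, "label", ""));    // shorter than minLength
    }
  }
  ```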

  ## Plugin client

  The Fusion connector plugin client provides a wrapper for the Fusion Java plugin-sdk so that plugins do not need to directly talk with gRPC code.
  Instead, they can use high-level interfaces and base classes, like Connector and Fetcher.

  The plugin client also provides a standalone "runner" that can host a plugin that was built from the Fusion Java Connector SDK.
  It does this by loading the plugin zip file, then calling on the wrapper to provide the framework interactions.

  ### Standalone Connector Plugin Application

  The second goal of the plugin-client is to allow Java SDK plugins to run remotely.
  The instructions for deploying a connector using this method are provided below.

  #### Locating the UberJar

  The uberjar is located at the following path in the Fusion file system:

  ```bash wrap theme={"dark"}
  $FUSION_HOME/apps/connectors/connectors-rpc/client/connector-plugin-client-<version>-uberjar.jar
  ```

  where `$FUSION_HOME` is your Fusion installation directory and `<version>` is your Fusion version number.

  #### Starting the Host

  To start the host app, you need a Fusion SDK-based connector, built into the standard packaging format as a `.zip` file. This `zip` must contain only one connector plugin.

  Here is an example of how to start up using the web connector:

  ```bash wrap theme={"dark"}
  java -jar $FUSION_HOME/apps/connectors/connectors-rpc/client/connector-plugin-client-<version>-uberjar.jar fusion-connectors/build/plugins/connector-web-4.0.0-SNAPSHOT.zip
  ```

  To run the client with remote debugging enabled:

  ```bash wrap theme={"dark"}
  java -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5010 -jar $FUSION_HOME/apps/connectors/connectors-rpc/client/connector-plugin-client-<version>-uberjar.jar fusion-connectors/build/plugins/connector-web-4.0.0-SNAPSHOT.zip
  ```
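
  The two commands above differ only in the JDWP agent option. As a minimal sketch, the same command line can be assembled in Java, for example to launch the host from a test harness. The class and method names are hypothetical, and the paths in `main` are placeholders rather than real file locations.

  ```java
  import java.util.ArrayList;
  import java.util.List;

  class HostCommandSketch {

    // Assembles the java command for the standalone host.
    // Pass null as debugPort to omit the remote debugging agent.
    static List<String> hostCommand(String uberjarPath, String pluginZipPath, Integer debugPort) {
      List<String> command = new ArrayList<>();
      command.add("java");
      if (debugPort != null) {
        // Same JDWP options as the remote debugging example above.
        command.add("-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=" + debugPort);
      }
      command.add("-jar");
      command.add(uberjarPath);
      command.add(pluginZipPath);
      return command;
    }

    public static void main(String[] args) {
      // Placeholder paths; substitute the real uberjar and plugin zip locations.
      List<String> command = hostCommand("connector-plugin-client-uberjar.jar", "connector-web.zip", 5010);
      System.out.println(String.join(" ", command));
      // To launch the host: new ProcessBuilder(command).inheritIO().start();
    }
  }
  ```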

</Accordion>

### Remote V2 connectors

All V2 connectors can be hosted within Fusion or [run remotely](/docs/fusion-connectors/developers/remote-v2-connectors) unless otherwise stated in the specific connector documentation.

* **Hosted connectors** run within the Fusion cluster and are managed by Fusion.
* **Remote connectors** run on your own infrastructure and communicate securely with Fusion using gRPC over HTTP/2. This enables indexing of data sources behind firewalls or in private networks.
* Remote connectors are ideal for data behind firewalls, security policies restricting cloud access, or compliance requirements mandating on-premises data processing.

### gRPC framework

V2 connectors use Google’s gRPC framework as the underlying client/server technology. This offers:

* Increased flexibility in the way services and their methods are defined.
* HTTP/2 based transport.
* Efficient serialization format for data handling (protocol buffers).
* Bi-directional/multiplexed streams.
* As of Fusion 5.6.1, V2 connectors using a gRPC backend can be [run remotely](/docs/fusion-connectors/developers/remote-v2-connectors).

## Learn more

<Accordion title="Add Tesseract Optical Character Recognition to Fusion Connectors">
  ## Tesseract Optical Character Recognition (OCR) solution

  The Tesseract OCR is an open source solution that can be added to interact with Fusion connectors in releases 5.2 and later. The example in this topic represents a classic REST service that interfaces with V1 connectors including functions such as file upload and web crawl.

  <Check>To set up OCR for V2 connectors, you must repeat this process for each individual Docker image related to the connector.</Check>

  ## Prerequisites

  The following must be established before adding the Tesseract OCR solution:

  * A local environment for installing and managing Fusion 5 that includes Google Cloud Tools and other required components.
  * A running Docker daemon on macOS and a Docker account for hub.docker.com.
  * Fusion 5 installed and deployed.

  ## Add Tesseract OCR solution

  1. Create a `Dockerfile` with the following contents:
     ```dockerfile theme={"dark"}
     FROM lucidworks/classic-rest-service:5.2.1
     USER root
     # Refresh the package index first so the install can resolve the package.
     RUN apt-get update && apt-get install -y tesseract-ocr
     USER 8764
     ```
     The file:
     * Directs Docker to use an existing image with the `<repo>/<image>:<tag>` format as the basis for the new image.
     * Switches to the `root` user to perform the Tesseract install.
     * Switches back to user `8764` because the classic REST service pod in Kubernetes is not permitted to run as `root`.
  2. Build the new Docker image from the directory that contains your `Dockerfile`. Enter values that reflect your image and directory. For example: `docker build -t jdoe/lucidworks/classic-rest-service-ocr:1.0.1 .`\
     In Fusion 5, the dependency check in Fusion must be included in any custom operation. You must add the dependency image where the custom connector image is stored (at the same level and in the same repository). The sample commands are:
     ```bash theme={"dark"}
     docker pull lucidworks/check-fusion-dependency:v1.2.0
     docker tag lucidworks/check-fusion-dependency:v1.2.0 jdoe/check-fusion-dependency:v1.2.0
     docker push jdoe/check-fusion-dependency:v1.2.0
     ```
     Access the Docker hub to view the image-related information such as name, tag, digest, and operating system.
  3. Open the `fusion_values.yaml` file and replace the existing connector image with the custom version. For example:
     ```yaml theme={"dark"}
     classic-rest-service:
       image:
         repository: jdoe
         name: classic-rest-service-ocr
         tag: 1.0.0
       nodeSelector:
         cloud.google.com/gke-nodepool: default-pool
     ```
  4. Execute the standard process to upgrade (rebuild) the Fusion cluster.
  5. Access the Tesseract pod using ssh and run `tesseract -v` to verify Tesseract is installed and working correctly. The result is similar to the following:
     ```bash theme={"dark"}
     <<K9s-Shell>> Pod: jdoe-poc/jdoe-classic-rest-service-0 | Container: classic-rest-service
     fusion@jdoe-poc-classic-rest-service-0:/$ tesseract -v
     tesseract 4.0.0
     leptonica-1.76.0
         libgif 5.1.4 : libjpeg 6b (libjpeg-turbo 1.5.2) : libpng 1.6.36 : libtiff 4.1.0 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.0
     Found AVX2
     Found AVX
     Found SSE
     ```
  6. Access each Fusion parser used for a datasource that performs OCR and select the following items:
     * **Apache Tika**
     * **Include images**
  7. Scan one of the following files to test the OCR function:
     * A `.pdf` file, which may contain an underlying `.tiff` file
     * A `.jpeg` file
     * A `.gif` file
  8. Verify the parser correctly extracts the information, which includes the `body_t` field.
</Accordion>
