Fusion is composed of microservices that drive features and functionality within a deployment. The services running in your deployment depend on your Fusion version and the features you have enabled.

Get deployment details

You can view your deployment details using kubectl.
  1. First, verify that you have access to your cluster, replacing the example values with your own. This example is for a Fusion instance deployed on GKE.
    gcloud container clusters get-credentials EXAMPLE-CLUSTER --region EXAMPLE-REGION --project EXAMPLE-PROJECT
    
  2. Get the ports and services:
    kubectl get svc -n EXAMPLE-NAME
    
  3. Get the StatefulSets:
    kubectl get statefulsets -n EXAMPLE-NAME
    
  4. Get the deployments:
    kubectl get deploy -n EXAMPLE-NAME
    
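After running these commands, it can also help to check pod status across the namespace. The label selector in the second command is a hypothetical example; the labels applied to Fusion pods depend on your chart version, so list the pods first and adjust the selector to match.
    kubectl get pods -n EXAMPLE-NAME
    # Hypothetical selector; verify the actual labels with "kubectl get pods --show-labels".
    kubectl get pods -n EXAMPLE-NAME -l app.kubernetes.io/part-of=fusion
    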

Overview

The table below lists the Fusion microservices deployed by the Helm chart. It also includes the Kubernetes services that manage traffic to the microservices.
Fusion is a complex distributed application composed of many stateful and stateless services designed to support demanding search-oriented workloads at high scale.
For Docker image versions associated with microservices, see the list of Docker images and versions for each Fusion release.

Microservices

Below is a list of microservices used in Fusion 5.9.x.
Microservice | Required for Fusion | Protocol | Deployment or StatefulSet | Node Pool Assignment | Autoscaling Supported | Description
admin | Yes | REST/HTTP | Deployment | system | Not required. Minimum of 1, but 2 pods are recommended for high availability. | Exposes endpoints for admin tasks, such as creating applications and running jobs.
admin-ui | No | Web | Deployment | system | Not required. One pod is enough for most clusters. | Serves static Web assets for the admin UI.
apps-manager | Yes | REST/HTTP | Deployment | analytics or system | Not required. One pod is enough for most clusters. | Tracks Fusion entitlement license and consumption.
async-parser | No | REST/HTTP | Deployment | system or analytics | Yes (CPU or custom metric). | Manages parsers and sends documents to be parsed asynchronously.
auth-ui | No | Web | Deployment | system | Not required. One pod is enough for most clusters. | Serves static Web assets for the login form.
connector-plugin-<connector_plugin> | No | HTTP/TCP | Deployment | analytics or system | Yes (CPU or custom metric). | Deployment for each connector plugin type. Note: There is a base deployment, connector-plugin, with 0 replicas. It is used as a deployment template for each connector plugin type and should not be deleted or scaled.
connectors | No | REST/HTTP | Deployment | analytics or system | Not required. One pod is enough for most clusters. | Routes REST API requests to connectors-classic and connectors-rpc.
connectors-backend | No | gRPC | Deployment | analytics or system | Yes (CPU or custom metric). | gRPC service for managing SDK-based connector plugins.
connectors-classic | No | REST/HTTP | StatefulSet | analytics or system | Not supported. | REST service for supporting non-RPC connector plugins. This microservice was previously named classic-rest-service.
{fusion-namespace}-argo | No | HTTP | Deployment | system | Yes (CPU or custom metric). | Orchestrates parallel jobs on Kubernetes.
{fusion-namespace}-argo-argo-ui | No | Web | Deployment | system | Not required. One pod is enough for most clusters. | Stores logs and prior Argo workflow runs.
{fusion-namespace}-kafka | Yes | HTTP | StatefulSet | system | Required. One pod is enough for most clusters. | Contains incoming data for Solr.
{fusion-namespace}-kafka-headless | Yes | HTTP | StatefulSet | system | Required. One pod is enough for most clusters. | Contains incoming data for Solr.
{fusion-namespace}-ml-model-service-ambassador | No | Web | Deployment | system | Not required. Minimum of 1, but 2 pods are recommended for high availability. | Load balancing and proxy for Seldon Core deployments.
{fusion-namespace}-ml-model-service-mysql | No | Web | Deployment | system | Not required. Minimum of 1, but 2 pods are recommended for high availability. | Handles metadata for the Milvus service.
{fusion-namespace}-solr-headless | Yes | HTTP | StatefulSet | At least 3 nodes in search, 2 in analytics, and 2 in system | Yes (CPU or custom metric). | Search engine.
{fusion-namespace}-solr-svc | Yes | HTTP | StatefulSet | At least 3 nodes in search, 2 in analytics, and 2 in system | Yes (CPU or custom metric). | Search engine.
{fusion-namespace}-zookeeper | Yes | TCP | StatefulSet | system | Not required. You need to run 1, 3, or 5 ZooKeeper pods to keep quorum. Do not use HPA for scaling ZooKeeper. | Stores centralized configuration and performs distributed coordination tasks.
{fusion-namespace}-zookeeper-headless | Yes | TCP | StatefulSet | system | No. You need to run 1, 3, or 5 ZooKeeper pods to keep quorum. Do not use HPA for scaling ZooKeeper. | Stores centralized configuration and performs distributed coordination tasks.
indexing | Yes | REST/HTTP | Deployment | search or analytics, depending on write volume | Yes (CPU or custom metric). | Processes indexing requests.
insights | No | Web | Deployment | system | Not required. One pod is enough for most clusters. | Serves the App Insights UI.
job-config | Yes | REST/HTTP | Deployment | system | Yes, but not usually required. One pod is enough for most clusters. | Manages job configurations and histories. Added in Fusion 5.9.11.
job-launcher | No | REST/HTTP | Deployment | analytics | Not required. One pod is enough for most clusters. | Configures and launches the Spark driver pod for running Spark jobs.
job-rest-server | No | REST/HTTP | Deployment | analytics | Not required. One pod is enough for most clusters. | Performs admin tasks for creating and running Spark jobs.
kuberay-operator | No | HTTP | Deployment | - | Not required. One pod is enough for most clusters. | Manages Ray deployments and jobs (in Fusion 5.9.12 and later).
milvus | No | REST/HTTP | Deployment | analytics or system | Not required. One pod is enough for most clusters. | Dense vector search engine for ML models.
ml-model-grpc | No | REST/HTTP and gRPC | Deployment | search | Yes (CPU or custom metric). | Exposes gRPC endpoints for generating predictions from ML models.
ml-model-service | No | REST/HTTP and gRPC | Deployment | search | Yes (CPU or custom metric). | Exposes gRPC endpoints for generating predictions from ML models.
pm-ui | No | Web | Deployment | system | Not required. One pod is enough for most clusters. | Serves static Web assets for the Predictive Merchandiser app.
proxy / api-gateway | Yes | HTTP | Deployment | search | Not required. Minimum of 1, but 2 pods are recommended for high availability. | Performs authentication, authorization, and traffic routing.
query | Yes | REST/HTTP | Deployment | search | Yes (CPU or custom metric). | Processes query requests.
rules-ui | No | Web | Deployment | system | Not required. One pod is enough for most clusters. | Serves static Web assets for the Rules UI.
seldon-webhook-service | No | Web | Deployment | system | Not required. One pod is enough for most clusters. | Maintains Seldon Core deployments for ML model serving.
templating | No | Web | Deployment | system | Not required. One pod is enough for most clusters. | Retrieves and renders Predictive Merchandiser templates.
webapps | No | REST/HTTP | Deployment | system | Not required. One pod is enough for most clusters. | Serves App Studio-based Web apps.
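For the services marked as supporting autoscaling, scaling is handled with Kubernetes HorizontalPodAutoscalers (HPA). As a rough sketch of what CPU-based autoscaling looks like at the Kubernetes level, an HPA targeting the query deployment might resemble the manifest below. The resource name query-hpa, the replica counts, and the CPU threshold are illustrative assumptions, not values from the Fusion chart.
    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: query-hpa              # hypothetical name
      namespace: EXAMPLE-NAME
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: query                # assumes the query Deployment keeps this name in your release
      minReplicas: 2
      maxReplicas: 6
      metrics:
        - type: Resource
          resource:
            name: cpu
            target:
              type: Utilization
              averageUtilization: 70
    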

Ports used by Fusion

The table below lists the pod ports used for intra-cluster communication.
Service | Port
admin | 8765
admin-ui | 8080
apps-manager | 9025
async-parsing | 9005
argo-argo-ui | 2746
auth-ui | 8080
connector-plugin | 9020, 5701
connectors | 9010
connectors-backend | 8771
connectors-classic | 9000
{fusion-namespace}-argo-argo-ui | 2746
{fusion-namespace}-kafka | 9092, 9093
{fusion-namespace}-kafka-headless | 9092, 9093
{fusion-namespace}-ml-model-service-ambassador | 80, 443
{fusion-namespace}-ml-model-service-mysql | 3306
{fusion-namespace}-reverse-search-headless | 8983
{fusion-namespace}-reverse-search-svc | 8983
{fusion-namespace}-solr-headless | 8983
{fusion-namespace}-solr-svc | 8983
{fusion-namespace}-zookeeper | 2181, 2281
{fusion-namespace}-zookeeper-headless | 2181, 3888, 2888, 2281
indexing | 8765
insights | 8080
job-launcher | 8083
job-rest-server | 8081
milvus | 19530, 19121
ml-model-grpc | 6565
ml-model-service | 8086
pm-ui | 8080
proxy | 6764
query | 8787
rules-ui | 8080
seldon-webhook-service | 443
templating | 5250
webapps | 8780
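These are pod ports for traffic inside the cluster, but they can also be useful for ad hoc debugging from a workstation. For example, assuming the API gateway service is exposed as proxy in your namespace (verify the exact service name with kubectl get svc, since it may carry a release prefix), you can forward its port locally:
    kubectl port-forward service/proxy 6764:6764 -n EXAMPLE-NAME
    # Illustrative check from another terminal; Fusion authentication still applies.
    curl http://localhost:6764/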

Standardized component logging using structured JSON format

In Fusion 5.9.14 and later, some Fusion services can log in structured JSON format instead of plaintext (Log4j-style) output. This improves compatibility with monitoring tools and log aggregation systems by making logs easier to parse, filter, and analyze. JSON logging is supported for the following services:
  • admin
  • apps-manager
  • connectors
  • connectors-backend
  • connectors-classic
  • distributed-compute (job-launcher, job-rest-server)
  • indexing
  • job-config
  • ml-model-service (including kuberay-operator, seldon-webhook-service)
  • proxy / api-gateway
  • query
  • solr
  • templating
  • webapps
JSON logging is off by default. To enable it, set jsonOutput: true globally or for specific services in the values.yaml configuration file; see the sketch after the list below.
This update does not affect log formats in the following services:
  • admin-ui
  • auth-ui
  • insights
  • pm-ui
  • rules-ui
  • argo
  • kafka
  • spark
  • zookeeper
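As a minimal sketch of the values.yaml change, assuming the chart accepts jsonOutput at the global level and under individual service blocks (confirm the exact keys against your chart version before applying):
    # Enable structured JSON logging for all supported services.
    global:
      jsonOutput: true
    
    # Or enable it only for specific services (hypothetical per-service blocks).
    query:
      jsonOutput: true
    indexing:
      jsonOutput: true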

Transport Layer Security (TLS)

Starting with Fusion 5.9.2, you can enable Transport Layer Security (TLS) for Fusion microservices.
This feature is only available in Fusion releases 5.9.x starting with 5.9.2.
This article describes how to deploy Fusion with Transport Layer Security (TLS) enabled for Fusion microservices. When enabled, Fusion generates a TLS certificate for each pod when the pod starts. This allows Fusion to use the Kubernetes endpoints API to reach each pod by its IP address and perform load balancing, circuit breaking, and retries in the Fusion microservices. To facilitate the TLS operations, Fusion uses Jetstack's cert-manager add-on to provision a certificate for each pod. This certificate contains the pod's IP address.
It is not possible to update an existing cluster to enable or disable TLS. These instructions apply to new deployments only.

Install Jetstack cert-manager

  1. Add the Jetstack helm repo.
    helm repo add jetstack https://charts.jetstack.io
    
  2. Update the local cache.
    helm repo update
    
  3. Create the CRDs required for Jetstack. For Jetstack v1.12.4:
    kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.12.4/cert-manager.crds.yaml
    
    For Jetstack v1.13.1:
    kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.1/cert-manager.crds.yaml
    
  4. Create the namespace for cert-manager.
    kubectl create namespace "cert-manager"
    
  5. Install cert-manager into the namespace. For Jetstack v1.12.4:
    helm upgrade --install --namespace "cert-manager" cert-manager jetstack/cert-manager --version 1.12.4 --set 'extraArgs[0]=--enable-certificate-owner-ref=true'
    
    For Jetstack v1.13.1:
    helm upgrade --install --namespace "cert-manager" cert-manager jetstack/cert-manager --version 1.13.1 --set 'extraArgs[0]=--enable-certificate-owner-ref=true'
    
You must only complete this process once per Fusion cluster. All namespaces in the cluster are affected by this process.
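Before preparing the Fusion namespace, it is worth confirming that the cert-manager pods are running and the CRDs were created. These are standard kubectl checks, not Fusion-specific commands:
    kubectl get pods -n cert-manager
    kubectl get crds | grep cert-manager.io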

Prepare the namespace for Fusion

  1. Create the namespace to install Fusion into.
    kubectl create namespace ${KUBE_NAMESPACE}
    
  2. Create the Root CA certificate for the namespace that will be used to sign all certificates in the namespace.
    cat  <<EOF | cfssl genkey -initca - | cfssljson -bare ca
    {
        "hosts": [
        ],
        "key": {
            "algo": "rsa",
            "size": 4096
        },
        "names": [
            {
                "C":  "US",
                "L":  "San Francisco",
                "O":  "Lucidworks",
                "OU": "Engineering",
                "ST": "California"
            }
        ]
    }
    EOF
    
    kubectl --namespace "${KUBE_NAMESPACE}" create secret generic cert-manager-ca --from-literal=tls.crt="$(cat ca.pem)" --from-literal=tls.key="$(cat ca-key.pem)"
    
  3. Create a cert-manager issuer to sign CSRs in the namespace. The same manifest works for both Jetstack v1.12.4 and v1.13.1, because the Issuer resource uses the cert-manager.io/v1 API in both versions:
    cat  > ca-issuer.yaml <<EOF
    apiVersion: cert-manager.io/v1
    kind: Issuer
    metadata:
      name: ${KUBE_NAMESPACE}-ca-issuer
    spec:
      ca:
        secretName: cert-manager-ca
    EOF
    kubectl --namespace "${KUBE_NAMESPACE}" apply -f ca-issuer.yaml
    
    
  4. Install Fusion with the following parameters:
    helm install... --set global.tlsEnabled=true --set global.tlsIssuerRef=${KUBE_NAMESPACE}-ca-issuer --set global.zkPort=2281 --set global.kafkaPort=9092 --set kafka.auth.clientProtocol=tls --set global.zkReplicaCount=3
    
Be sure to include the flags --set global.zkReplicaCount=3 and --set kafka.auth.clientProtocol=tls or Kafka can enter a crash loop state.
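Putting the flags together, a complete install command might look like the sketch below. The release and chart placeholders follow the EXAMPLE- convention used earlier, and the values file is an assumption; substitute the chart source and values you normally deploy with. The final command assumes Fusion creates cert-manager Certificate resources for the per-pod certificates, so treat it as an optional sanity check.
    # Illustrative only: substitute your own release name, chart reference, and values file.
    helm upgrade --install EXAMPLE-RELEASE EXAMPLE-CHART \
      --namespace "${KUBE_NAMESPACE}" \
      --values example-values.yaml \
      --set global.tlsEnabled=true \
      --set global.tlsIssuerRef=${KUBE_NAMESPACE}-ca-issuer \
      --set global.zkPort=2281 \
      --set global.kafkaPort=9092 \
      --set kafka.auth.clientProtocol=tls \
      --set global.zkReplicaCount=3
    
    # Optional: confirm that certificates are being issued in the namespace.
    kubectl get certificates -n "${KUBE_NAMESPACE}"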