Fusion Microservices

Overview

The table below lists the Fusion microservices deployed by our Helm chart. Fusion is a complex distributed application composed of many stateful and stateless services designed to support demanding search-oriented workloads at high scale.

| Microservice | Protocol | Deployment or StatefulSet | Node Pool Assignment | Autoscaling Supported | Description |
| --- | --- | --- | --- | --- | --- |
| admin | REST/HTTP | Deployment | system | Not required. Minimum of 1 pod; 2 pods are recommended for HA. | Exposes endpoints for admin tasks, such as creating applications and running jobs. |
| admin-ui | Web | Deployment | system | Not required; 1 pod should be sufficient for most clusters. | Serves static Web assets for the admin UI. |
| argo | HTTP | Deployment | system | Yes (CPU or custom metric) | Orchestrates parallel jobs on Kubernetes. |
| argo-ui | Web | Deployment | system | Not required; 1 pod should be sufficient for most clusters. | Stores logs and prior Argo workflow runs. |
| auth-ui | Web | Deployment | system | Not required; 1 pod should be sufficient for most clusters. | Serves static Web assets for the login form. |
| config-sync | HTTP | Deployment | system | Not required; 1 pod should be sufficient for most clusters. | Synchronizes config between GitHub and Fusion. |
| connectors-classic | REST/HTTP | StatefulSet | analytics or system | Yes (CPU or custom metric) | REST service for supporting non-RPC connector plugins. |
| connectors-rest | REST/HTTP | Deployment | analytics or system | Not required; 1 pod should be sufficient for most clusters. | Routes REST API requests to connectors-classic and connectors-rpc. |
| connectors-rpc | gRPC | Deployment | analytics or system | Yes (CPU or custom metric) | gRPC service for managing SDK-based connector plugins. |
| devops-ui | Web | Deployment | system | Not required; 1 pod should be sufficient for most clusters. | Serves static Web assets for the DevOps UI. |
| indexing | REST/HTTP | Deployment | search or analytics, depending on write volume | Yes (CPU or custom metric) | Processes indexing requests. |
| insights | Web | Deployment | system | Not required; 1 pod should be sufficient for most clusters. | Serves the App Insights UI. |
| job-launcher | REST/HTTP | Deployment | analytics | Not required; 1 pod should be sufficient for most clusters. | Configures and launches the Spark driver pod for running Spark jobs. |
| job-rest-server | REST/HTTP | Deployment | analytics | Not required; 1 pod should be sufficient for most clusters. | Performs admin tasks for creating and running Spark jobs. |
| jupyter | HTTP | Deployment | analytics | Not required; 1 pod should be sufficient for most clusters. | Jupyter notebook for ad hoc analytics and visualization. |
| logstash | HTTP | StatefulSet | system | Not required. Minimum of 1 pod; 2 pods are recommended for HA. | Collects logs from the other microservices and either indexes them into system_logs or ships them to an external service such as Elastic. |
| milvus | REST/HTTP | Deployment | analytics or system | Not required; 1 pod should be sufficient for most clusters. | Dense vector search engine for ML models. |
| milvus-mysql | REST/HTTP | Deployment | analytics or system | Not required; 1 pod should be sufficient for most clusters. | Handles metadata for the Milvus service. |
| ml-model-service | REST/HTTP and gRPC | Deployment | search | Yes (CPU or custom metric) | Exposes gRPC endpoints for generating predictions from ML models. |
| pm-ui | Web | Deployment | system | Not required; 1 pod should be sufficient for most clusters. | Serves static Web assets for the Predictive Merchandiser app. |
| proxy / api-gateway | HTTP | Deployment | search | Not required. Minimum of 1 pod; 2 pods are recommended for HA. | Performs authentication, authorization, and traffic routing. |
| pulsar-bookkeeper | HTTP | StatefulSet | search | At least 3 nodes for HA; run 3 or 5 to ensure a quorum. | Write-ahead log (WAL) used for persistent message storage. |
| pulsar-broker | HTTP and TCP | Deployment | search | At least 3 nodes for HA. | Provides a REST API for administration and a dispatcher that handles all message transfers. |
| query | REST/HTTP | Deployment | search | Yes (CPU or custom metric) | Processes query requests. |
| rules-ui | Web | Deployment | system | Not required; 1 pod should be sufficient for most clusters. | Serves static Web assets for the Rules UI. |
| seldon-ambassador | Web | Deployment | system | Not required. Minimum of 1 pod; 2 pods are recommended for HA. | Load balancing and proxy for Seldon Core deployments. |
| seldon-core | REST/gRPC | Deployment | system | Yes (CPU or custom metric) | Serves models built in any model building framework. |
| seldon-webhook-service | Web | Deployment | system | Not required; 1 pod should be sufficient for most clusters. | Maintains Seldon Core deployments for ML model serving. |
| solr | HTTP | StatefulSet | At least 3 nodes in search, 2 in analytics, and 2 in system | Yes (CPU or custom metric) | Search engine. |
| spark-driver | n/a | Single pod per job | analytics or dedicated node pool for Spark jobs | 1 per job | Launched by the job-launcher to run a Spark job. |
| spark-executor | n/a | One or more pods launched by the Spark driver for executing job tasks | analytics or dedicated node pool for Spark jobs | Depends on job configuration; controlled by the spark.executor.instances setting | Executes tasks for a Spark job. |
| sql-service | REST/HTTP and JDBC | Deployment | analytics | Not required; 1 pod should be sufficient for most clusters. | Performs admin tasks for creating and managing SQL catalog assets. Exposes a JDBC endpoint for the SQL service. |
| templating | Web | Deployment | system | Not required; 1 pod should be sufficient for most clusters. | Retrieves and renders Predictive Merchandiser templates. |
| webapps | REST/HTTP | Deployment | system | Not required; 1 pod should be sufficient for most clusters. | Serves App Studio-based Web apps. |
| zookeeper | TCP | StatefulSet | system | No. Run 1, 3, or 5 ZooKeeper pods to ensure a quorum; do not use HPA to scale ZooKeeper. | Stores centralized configuration and performs distributed coordination tasks. |
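
The odd pod counts recommended above for zookeeper and pulsar-bookkeeper follow from majority-quorum arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes a simple strict-majority quorum rather than any Fusion-specific setting. It shows that adding a fourth pod to a three-pod ensemble raises the quorum size without letting the ensemble survive any additional failures.

```python
def quorum_size(ensemble_size: int) -> int:
    """Return the smallest strict majority of an ensemble (assumed quorum rule)."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Return how many pods can fail while a quorum is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


# Compare odd and even ensemble sizes side by side.
for n in (1, 2, 3, 4, 5):
    print(f"{n} pod(s): quorum={quorum_size(n)}, "
          f"tolerates {tolerated_failures(n)} failure(s)")
```

Under this rule, 3 and 4 pods both tolerate a single failure, and 5 pods tolerate two, which is why an even count adds cost without adding resilience.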

Transport Layer Security (TLS)

When enabled, Fusion generates a TLS certificate for each pod when the pod starts. This allows Fusion to use the Kubernetes endpoints API to reach each pod by its IP address and perform load balancing, circuit breaking, and retries in the Fusion microservices.
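
To make the client-side behavior described above more concrete, here is a minimal sketch of round-robin load balancing with retries and a basic circuit breaker over a list of pod IP addresses. It is not Fusion's implementation. The `PodBalancer` class, its parameters, the `fake_send` function, and the hard-coded IP addresses are all hypothetical; in a real cluster the addresses would come from the Kubernetes endpoints API and `send` would issue a TLS request to the pod.

```python
import itertools
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class PodBalancer:
    """Round-robin over pod IPs with retries and a basic circuit breaker."""
    pod_ips: List[str]
    failure_threshold: int = 3       # consecutive failures before a pod is skipped
    cooldown_seconds: float = 30.0   # how long a tripped pod stays skipped
    _failures: Dict[str, int] = field(default_factory=dict)
    _opened_at: Dict[str, float] = field(default_factory=dict)

    def __post_init__(self) -> None:
        self._cycle = itertools.cycle(self.pod_ips)

    def _is_open(self, ip: str) -> bool:
        """Return True while the circuit for this pod is open (pod is skipped)."""
        opened = self._opened_at.get(ip)
        if opened is None:
            return False
        if time.monotonic() - opened >= self.cooldown_seconds:
            # Cooldown elapsed: close the circuit and allow a trial request.
            del self._opened_at[ip]
            self._failures[ip] = 0
            return False
        return True

    def _next_available(self) -> Optional[str]:
        """Pick the next pod in rotation whose circuit is closed, if any."""
        for _ in range(len(self.pod_ips)):
            ip = next(self._cycle)
            if not self._is_open(ip):
                return ip
        return None

    def call(self, send: Callable[[str], str], max_attempts: int = 3) -> str:
        """Send a request, retrying on other pods when one fails."""
        last_error: Optional[Exception] = None
        for _ in range(max_attempts):
            ip = self._next_available()
            if ip is None:
                break
            try:
                result = send(ip)
            except ConnectionError as exc:
                last_error = exc
                self._failures[ip] = self._failures.get(ip, 0) + 1
                if self._failures[ip] >= self.failure_threshold:
                    self._opened_at[ip] = time.monotonic()  # trip the breaker
                continue
            self._failures[ip] = 0
            return result
        raise RuntimeError("no healthy pod responded") from last_error


def fake_send(ip: str) -> str:
    """Hypothetical transport: one pod is always unreachable."""
    if ip == "10.0.0.2":
        raise ConnectionError(f"{ip} unreachable")
    return f"ok from {ip}"


balancer = PodBalancer(["10.0.0.1", "10.0.0.2", "10.0.0.3"], failure_threshold=1)
responses = [balancer.call(fake_send) for _ in range(4)]
print(responses)
```

In this run the second call fails on 10.0.0.2, retries on 10.0.0.3, and trips the breaker, so later calls skip the unreachable pod until the cooldown expires.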