KEDA (Kubernetes Event-Driven Autoscaling) provides flexible, event-driven autoscaling for Fusion workloads as an alternative to the default Kubernetes Horizontal Pod Autoscaler (HPA). Traditional HPA-based autoscaling relies primarily on CPU and memory utilization. KEDA extends this by enabling you to scale workloads based on events, schedules, and business metrics, allowing you to align infrastructure capacity more closely with actual demand.

Benefits

KEDA provides these advantages for Fusion workloads:
  • Event-driven autoscaling - Scale in response to external signals such as Prometheus metrics, queue depth, or pipeline execution load.
  • Scheduled scaling - Automatically scale workloads based on predictable traffic patterns, such as business hours or peak usage windows.
  • Scale-to-zero capability - Reduce infrastructure costs by scaling workloads down to zero during off-hours or periods of inactivity.
  • Improved operational efficiency - Align infrastructure capacity with real business demand instead of relying solely on CPU or memory thresholds.
[Image: KEDA autoscaling over a 24-hour period]

Supported services

The following Fusion services support KEDA autoscaling:
  • api-gateway
  • query-pipeline
  • fusion-indexing

Before you begin

Before configuring KEDA for Fusion services, ensure you have:
  • KEDA version 2.18.3 or later installed in your Kubernetes cluster
  • Access to the Fusion Helm charts
  • Ability to provide a custom values file for your Fusion deployment
If KEDA isn’t already installed in your cluster, install it using Helm:
# Add the KEDA Helm repository
helm repo add kedacore https://kedacore.github.io/charts

# Update your local Helm chart repository cache
helm repo update

# Install KEDA
helm install keda kedacore/keda --version 2.18.3 --namespace keda --create-namespace
For more information, see the official KEDA documentation.

How KEDA autoscaling works

Fusion Helm charts support two mutually exclusive autoscaling mechanisms per workload:
  1. Kubernetes HPA (default) - Scales based on CPU and memory metrics.
  2. KEDA ScaledObject - Scales based on events, schedules, and custom metrics.
Only one autoscaling mechanism can be active per workload at a time. You can’t use both HPA and KEDA for the same service.
However, you can mix autoscaling approaches across different services. For example, use HPA for api-gateway and KEDA for query-pipeline.
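A values file that mixes the two approaches might look like the following sketch (the service blocks and trigger values are illustrative; adjust them to your deployment):

```yaml
# HPA for api-gateway, KEDA for query-pipeline
api-gateway:
  autoscaling:
    enabled: true
    hpa:
      enabled: true    # Keep the default Kubernetes HPA for this service
    keda:
      enabled: false

query-pipeline:
  autoscaling:
    enabled: true
    hpa:
      enabled: false   # Disable HPA so KEDA can manage this service
    keda:
      enabled: true
      triggers:
        - type: cpu            # At least one trigger is required
          metricType: Utilization
          metadata:
            value: "70"        # Illustrative threshold
```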

Autoscaling parameters

Each Fusion service has three key autoscaling parameters:
Parameter | Description | Default
--------- | ----------- | -------
autoscaling.enabled | Master switch to enable autoscaling | false
autoscaling.hpa.enabled | Enables HPA autoscaling | true
autoscaling.keda.enabled | Enables KEDA autoscaling | false

Default behavior

When autoscaling.enabled is true and you don’t modify other settings, Fusion creates an HPA by default. This preserves backward compatibility with existing deployments.
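Concretely, a values fragment like the following produces a standard HPA, because hpa.enabled defaults to true and keda.enabled defaults to false:

```yaml
api-gateway:
  autoscaling:
    enabled: true   # No hpa/keda overrides, so a standard Kubernetes HPA is created
```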

Mutual exclusion

The Helm chart enforces these rules:
  • If autoscaling.enabled is false → No autoscaling (neither HPA nor KEDA)
  • If hpa.enabled is true and keda.enabled is false → HPA autoscaling
  • If hpa.enabled is false and keda.enabled is true → KEDA autoscaling
  • If both hpa.enabled and keda.enabled are true → No autoscaling (conflict resolution)
When KEDA creates a ScaledObject with certain trigger types, it may create its own HPA prefixed with keda-hpa-. This is expected behavior and indicates that KEDA is managing autoscaling correctly.

Enable KEDA

You can enable KEDA for the api-gateway, query-pipeline, or fusion-indexing services by editing the Fusion values file before deploying or upgrading Fusion, as described in the following sections.

Update the Fusion values file

The configuration is the same for each Fusion service. This example enables KEDA for api-gateway:
api-gateway:
  autoscaling:
    enabled: true
    hpa:
      enabled: false  # Disable HPA
    keda:
      enabled: true   # Enable KEDA

Deploy or upgrade Fusion

After preparing your custom values file (for example, fusion-values.yaml), deploy or upgrade Fusion:
helm upgrade --install <RELEASE_NAME> <FUSION_CHART_PATH> -f fusion-values.yaml
Replace <RELEASE_NAME> and <FUSION_CHART_PATH> with your specific values.
To find your existing Fusion release name, run helm list -n <namespace>. The chart path can be a repository reference (for example, lucidworks/fusion) or a local path to the chart directory.

Verify the configuration

After deployment, verify that KEDA is active for the configured services.
Check for the ScaledObject:
kubectl get scaledobject -n <FUSION_NAMESPACE> api-gateway
The api-gateway ScaledObject should appear in the output.
Verify that no conflicting HPA exists:
kubectl get hpa -n <FUSION_NAMESPACE> api-gateway
No HPA should exist for api-gateway, except those prefixed with keda-hpa- (which are managed by KEDA).
Check for the ScaledObject:
kubectl get scaledobject -n <FUSION_NAMESPACE> query-pipeline
The query-pipeline ScaledObject should appear in the output.
Verify that no conflicting HPA exists:
kubectl get hpa -n <FUSION_NAMESPACE> query-pipeline
No HPA should exist for query-pipeline, except those prefixed with keda-hpa- (which are managed by KEDA).
Check for the ScaledObject:
kubectl get scaledobject -n <FUSION_NAMESPACE> fusion-indexing
The fusion-indexing ScaledObject should appear in the output.
Verify that no conflicting HPA exists:
kubectl get hpa -n <FUSION_NAMESPACE> fusion-indexing
No HPA should exist for fusion-indexing, except those prefixed with keda-hpa- (which are managed by KEDA).
KEDA may create an HPA prefixed with keda-hpa- followed by your release name and service name. For example: keda-hpa-fusion-query-pipeline. This is expected when KEDA uses certain trigger types.

Custom configuration parameters

You can customize the scaling behavior, metadata, and the triggers KEDA monitors using the parameters in the table below. For the api-gateway, query-pipeline, or fusion-indexing services, replace [service-name] with the name of the service.
Parameter | Description | Default
--------- | ----------- | -------
[service-name].autoscaling.keda.labels | Labels to add to the ScaledObject. | {}
[service-name].autoscaling.keda.annotations | Annotations to add to the ScaledObject. | {}
[service-name].autoscaling.keda.pollingInterval | Interval in seconds at which KEDA checks triggers. | 30
[service-name].autoscaling.keda.cooldownPeriod | Time in seconds KEDA waits after the last trigger reports active before scaling down. | 300
[service-name].autoscaling.keda.initialCooldownPeriod | Time in seconds KEDA waits after the ScaledObject is created before the first scale-down. | 0
[service-name].autoscaling.keda.idleReplicaCount | Replicas to maintain when all triggers are inactive (enables scale-to-zero). | Disabled
[service-name].autoscaling.keda.minReplicas | Minimum number of replicas. | 1
[service-name].autoscaling.keda.maxReplicas | Maximum number of replicas. | 5
[service-name].autoscaling.keda.advanced | Advanced HPA behavior configuration. | {}
[service-name].autoscaling.keda.fallback | Fallback configuration applied if KEDA encounters errors. | {}
[service-name].autoscaling.keda.triggers | List of KEDA triggers (required). | []
At least one trigger is required when KEDA is enabled. Without triggers, KEDA can’t determine when to scale your workload.
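As a sketch of how some of these parameters combine, the fragment below uses idleReplicaCount together with a cron trigger to scale to zero outside business hours; all values are illustrative:

```yaml
query-pipeline:
  autoscaling:
    enabled: true
    hpa:
      enabled: false
    keda:
      enabled: true
      minReplicas: 1
      maxReplicas: 5
      idleReplicaCount: 0      # Scale to zero when no trigger is active
      triggers:
        - type: cron
          metadata:
            timezone: "America/Chicago"
            start: "0 7 * * 1-5"    # Active window begins 7:00 AM weekdays
            end: "0 18 * * 1-5"     # Active window ends 6:00 PM weekdays
            desiredReplicas: "4"    # Replicas held during the window
```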
This example demonstrates a custom KEDA configuration for query-pipeline with these components:
  • Cron trigger - Scales to 6 replicas during business hours (Monday-Friday, 7:00 AM - 6:00 PM Central Time).
  • CPU trigger - Scales up to 15 replicas when CPU utilization exceeds 60%.
query-pipeline:
  autoscaling:
    enabled: true
    hpa:
      enabled: false
    keda:
      enabled: true
      pollingInterval: 40  # Check triggers every 40 seconds
      minReplicas: 2
      maxReplicas: 15
      triggers:
        # Scheduled scaling
        - type: cron
          metadata:
            timezone: "America/Chicago"
            start: "0 7 * * 1-5"      # 7:00 AM weekdays
            end: "0 18 * * 1-5"       # 6:00 PM weekdays
            desiredReplicas: "6"
        # Reactive scaling
        - type: cpu
          metricType: Utilization
          metadata:
            value: "60"  # Scale when CPU exceeds 60%
For more information about available KEDA triggers, see the KEDA scalers documentation.
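As another sketch, a Prometheus-based trigger (one of the event sources mentioned under Benefits) could look like the following; the serverAddress and query are placeholders that depend entirely on your own monitoring setup:

```yaml
query-pipeline:
  autoscaling:
    enabled: true
    hpa:
      enabled: false
    keda:
      enabled: true
      triggers:
        - type: prometheus
          metadata:
            # Placeholder address for an in-cluster Prometheus server
            serverAddress: http://prometheus-server.monitoring.svc:9090
            # Placeholder query: request rate for the service over 2 minutes
            query: sum(rate(http_requests_total{service="query-pipeline"}[2m]))
            threshold: "100"   # Scale out when the query result exceeds 100
```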

Troubleshooting

This section provides troubleshooting steps for some common issues with KEDA configuration.
ScaledObject isn't created:
  1. Verify that autoscaling.enabled is true.
  2. Verify that autoscaling.keda.enabled is true.
  3. Verify that autoscaling.hpa.enabled is false.
  4. Check that KEDA is installed: kubectl get pods -n keda.
  5. Review Helm deployment logs for errors.
Both an HPA and a ScaledObject exist for the same service:
  1. Check your configuration for conflicting settings.
  2. Ensure that hpa.enabled and keda.enabled aren't both set to true.
  3. Delete the unwanted resource manually if necessary.
Workload doesn't scale as expected:
  1. Verify that triggers are configured correctly.
  2. Check the KEDA operator logs: kubectl logs -n keda -l app=keda-operator
  3. Describe the ScaledObject to see its status: kubectl describe scaledobject <name> -n <namespace>
  4. Verify that the trigger source (metrics endpoint, queue, and so on) is accessible.