Creating Aggregation Jobs

Aggregations are created automatically whenever you enable signals or recommendations. This topic explains how to create or modify aggregations individually. You can do this using the Fusion UI or the Jobs API.

As of Fusion 3.1, the Signals Aggregator API is deprecated in favor of the Jobs API. This changes the API endpoint from /aggregator to /jobs. Aggregation jobs are a subtype of Spark jobs.
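The endpoint change can be illustrated with a short sketch. The host, port, base path, and job ID below are placeholders, not Fusion defaults; only the /aggregator-to-/jobs rename comes from the text above.

```python
# Sketch of the endpoint change described above. The base URL is a
# hypothetical placeholder; only the /aggregator -> /jobs change is
# taken from the documentation text.
BASE = "https://localhost:8764/api"  # hypothetical Fusion API base URL

# Deprecated (pre-3.1): aggregation definitions lived under /aggregator.
old_endpoint = f"{BASE}/aggregator/aggregations/my-aggregation"

# Fusion 3.1+: aggregation jobs are a subtype of Spark jobs under /jobs.
new_endpoint = f"{BASE}/jobs/my-aggregation"

print(old_endpoint)
print(new_endpoint)
```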

Creating an aggregation job using the Fusion UI

An aggregation is a type of job. Aggregation jobs can be created or modified at Search > Jobs in the Fusion UI.

  1. Navigate to Search > Jobs.

  2. Click Add.

  3. Select Aggregation.

    The New Job Configuration panel appears.

  4. Enter an arbitrary Spark job ID.

  5. Enter the name of the signals collection to be aggregated.

    Be sure to specify the signals collection (usually <primarycollectionname>_signals), not the primary collection (<primarycollectionname>).

  6. Under Aggregation Settings, click include.

  7. Configure the aggregation parameters as needed.

    See Aggregation configuration parameters below for descriptions.

  8. Click Save.

    The new aggregation job appears in the jobs list. Now you can run it or schedule it.
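The same job can be created with the Jobs API instead of the UI. The sketch below mirrors the steps above as a JSON payload; the exact field names expected by the API are assumptions for illustration, not taken from the Fusion reference, and only the job ID and the _signals collection requirement come from the steps above.

```python
import json

# Hypothetical payload mirroring the UI steps above. The field names
# ("type", "inputCollection") are illustrative assumptions; the job ID
# (step 4) and the _signals input collection (step 5) follow the text.
job = {
    "id": "my-aggregation",                 # step 4: arbitrary Spark job ID
    "type": "aggregation",                  # aggregation jobs are a Spark job subtype
    "inputCollection": "products_signals",  # step 5: the signals collection,
                                            # not the primary collection
}

payload = json.dumps(job)
print(payload)
# The payload would then be POSTed to the Jobs API (/jobs endpoint);
# the full path and authentication details vary by deployment.
```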

Aggregation configuration parameters


An array of strings specifying the fields to group on.


The signal types to select. If not set, all signal types are selected.


The query to select the desired signals. If not set, the match-all query *:* (or its equivalent) is used.


The criteria to sort on within a group. If not set, results are sorted by ID in ascending order.


The time range within which to select signals.


The pipeline to use to process the output. If not set, the _system pipeline is used.


The pipeline to use for processing roll-up results. By default, this is the same index pipeline used to process the aggregation results.


The aggregator to use when rolling up. If not set, the roll-up uses the same aggregator as the aggregation itself.


The collection to which the aggregates are written. This property is required if the selected output/roll-up pipeline requires it (the default pipeline does). The special value - disables output.


The aggregator implementation to use: either one of the symbolic names (simple, click, em) or the fully-qualified class name of a class extending EventAggregator. If not set, simple is used.


If true, the processed source signals are removed after aggregation. The default is false.


If true, only aggregate signals received since the last time the job ran successfully. If a record of such a previous run exists, it overrides the starting time of the time range set in the timeRange property.


Whether to roll up the current results with all previous results for this aggregation ID, which are available in outputCollection.
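Two small configuration fragments illustrate the output options described above. The outputCollection property name and its special - value come from the text; treating the configuration as a plain JSON object, and the collection name used, are assumptions.

```python
# Configuration fragments based on the parameter descriptions above.
# "outputCollection" and the special "-" value come from the text;
# the collection name "products_aggr" is a hypothetical example.

# Write aggregates to an explicit collection:
with_output = {"outputCollection": "products_aggr"}

# Disable output entirely via the special "-" value:
no_output = {"outputCollection": "-"}

def output_disabled(cfg):
    """Return True when the special '-' value suppresses output."""
    return cfg.get("outputCollection") == "-"

print(output_disabled(with_output), output_disabled(no_output))
```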


A list of functions defining how events are aggregated into results. Each aggregation function has these properties:

  • type

    The function type, which determines how events are aggregated into results.

  • sourceFields

    The fields that the function reads from.

  • targetField

    The field that the function writes to.

  • mapper

    When true, the function is used in the map phase only.

  • parameters

    Other parameters specific to individual functions.
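An aggregation function entry can be sketched as a small object using the five properties listed above. The property names (type, sourceFields, targetField, mapper, parameters) come from the text; the concrete values (a "count" function over the id field) are illustrative assumptions.

```python
# Example aggregation-function entry using the five properties listed
# above. The values ("count", "id", "count_d") are assumptions chosen
# for illustration, not documented defaults.
aggregates = [
    {
        "type": "count",           # how events are aggregated into results
        "sourceFields": ["id"],    # fields the function reads from
        "targetField": "count_d",  # field the function writes to
        "mapper": False,           # not restricted to the map phase
        "parameters": {},          # no function-specific parameters
    },
]

print(len(aggregates))
```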


A list of numeric fields in the results for which to compute overall statistics.


Other aggregation parameters, such as start/aggregate/finish scripts and cache size.
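Pulling the pieces together, a complete aggregation configuration might look like the sketch below. The timeRange and outputCollection names, the simple symbolic aggregator name, and the aggregation-function properties come from the text above; the top-level "aggregates" key, the time-range value format, and all concrete values are assumptions for illustration.

```python
# A full aggregation-configuration sketch. Property names confirmed by
# the text: timeRange, outputCollection, the "simple" aggregator, and
# the per-function properties. The "aggregates" key name, the value
# format "[* TO NOW]", and all concrete values are assumptions.
config = {
    "id": "my-aggregation",
    "timeRange": "[* TO NOW]",            # value format assumed
    "outputCollection": "products_aggr",  # hypothetical collection name
    "aggregator": "simple",               # symbolic name (simple, click, em)
    "aggregates": [
        {
            "type": "count",
            "sourceFields": ["id"],
            "targetField": "count_d",
            "mapper": False,
            "parameters": {},
        },
    ],
}

print(sorted(config))
```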