Legacy Aggregations

Note
This aggregation approach is still available, but it is deprecated and will be removed in a future release. It is now referred to as "legacy aggregations."

Signals are most useful when they are aggregated into a set of summaries that can be used to enrich the search experience through recommendations and boosting.

Aggregation jobs are a subtype of Spark jobs.

When signals are enabled for a "primary" collection, a <primarycollectionname>_signals collection and a <primarycollectionname>_signals_aggr collection are created automatically.
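The naming convention above can be sketched in a few lines of Python. The helper name signal_collection_names is hypothetical and only illustrates how the two collection names derive from the primary collection name. It is not part of any product API.

```python
def signal_collection_names(primary: str) -> dict:
    """Return the names of the collections created alongside a primary collection.

    Hypothetical helper: it only mirrors the naming convention described
    in the text and does not call any service.
    """
    return {
        "signals": f"{primary}_signals",          # raw signal events
        "aggregated": f"{primary}_signals_aggr",  # aggregated signal summaries
    }


names = signal_collection_names("products")
print(names["signals"])
print(names["aggregated"])
```

For a primary collection named "products", this yields products_signals and products_signals_aggr.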

Aggregation Pipelines

Aggregated events are indexed through a default index pipeline named "aggr_rollup". This pipeline contains a single stage, a Solr Indexer stage, which indexes the aggregated events.

You can create your own custom index pipeline to process aggregated events differently if you choose.

Aggregation Functions

The Aggregator Functions section documents the available aggregation functions.

Custom aggregation functions can be defined via a JavaScript stage.

Aggregation Job Configuration

Set groupingFields to just user_id_s and, optionally, set the "sort" parameter to timestamp_tdt asc so that the sessionization process works most efficiently. Sorting by timestamp requires more work on the Solr side, however, so the sort may be omitted, with the possible side effect that additional partial documents are created.
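As a minimal sketch of the trade-off described here, the following Python builds the two relevant settings as a plain dictionary. Only the groupingFields and sort names and their values come from the text. The function name, the dictionary layout, and the choice to represent groupingFields as a list are assumptions made for illustration, not the actual job configuration schema.

```python
def build_sessionization_params(sort_by_timestamp: bool = True) -> dict:
    """Assemble the grouping and optional sort settings for a sessionization job.

    Hypothetical helper: the real job configuration may use a different
    structure; only the field and parameter names come from the text.
    """
    # Group on the user ID field only, as recommended.
    params = {"groupingFields": ["user_id_s"]}
    if sort_by_timestamp:
        # Optional: more efficient sessionization at the cost of extra
        # sorting work on the Solr side.
        params["sort"] = "timestamp_tdt asc"
    return params


with_sort = build_sessionization_params()
without_sort = build_sessionization_params(sort_by_timestamp=False)
print(with_sort)
print(without_sort)
```

Omitting the sort reduces the load on Solr but, as noted above, may lead to additional partial documents.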