|This aggregation approach is still available, though it is deprecated and will be removed in a future release. We now refer to this aggregation approach as "legacy aggregations."|
Aggregation jobs are a subtype of Spark jobs.
When signals are enabled for a "primary" collection, a
<primarycollectionname>_signals collection and a
<primarycollectionname>_signals_aggr collection are created automatically.
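The naming convention above can be sketched as a small helper. This is only an illustration of the suffix pattern; the function name `signal_collection_names` and the example collection name `products` are hypothetical and are not part of any product API.

```python
def signal_collection_names(primary_collection):
    """Return the names of the collections created alongside a primary collection.

    Hypothetical helper: it only applies the documented suffixes
    (_signals and _signals_aggr) to the primary collection name.
    """
    return {
        "signals": f"{primary_collection}_signals",
        "aggregated": f"{primary_collection}_signals_aggr",
    }


if __name__ == "__main__":
    # "products" is an example primary collection name, chosen for illustration.
    print(signal_collection_names("products"))
```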
Aggregated events are indexed using a default pipeline named "aggr_rollup". This pipeline contains a single stage, a Solr Indexer stage, which indexes the aggregated events.
You can create your own custom index pipeline to process aggregated events differently if you choose.
The section Aggregator Functions documents the available set of aggregation functions.
For sessionization, set groupingFields to just
user_id_s, and optionally set the "sort" parameter to
timestamp_tdt asc so that the sessionization process works most efficiently. Sorting by timestamp requires more work on the Solr side, however, so the sort may be omitted, with the possible side effect that additional partial documents are created.
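The trade-off between sorted and unsorted input can be illustrated with a minimal sketch. This is not the product's sessionization code. It is a simplified stand-in that groups events by `user_id_s` and starts a new session whenever consecutive events for a user are more than a fixed gap apart. The `SESSION_GAP` value, the `sessionize` function, and the sample events are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical inactivity gap, chosen for illustration only.
SESSION_GAP = timedelta(minutes=30)


def sessionize(events, sort_by_time=True):
    """Group events by user_id_s and split each user's events into sessions.

    A new session starts whenever the absolute time difference between an
    event and the previous event processed for the same user exceeds
    SESSION_GAP. With sort_by_time=False, events are processed in arrival
    order, which can fragment one logical session into several pieces.
    """
    by_user = defaultdict(list)
    for event in events:
        by_user[event["user_id_s"]].append(event)

    sessions = []
    for user_id, user_events in by_user.items():
        if sort_by_time:
            user_events = sorted(user_events, key=lambda e: e["timestamp_tdt"])
        current = [user_events[0]]
        for event in user_events[1:]:
            gap = abs(event["timestamp_tdt"] - current[-1]["timestamp_tdt"])
            if gap > SESSION_GAP:
                sessions.append((user_id, current))
                current = [event]
            else:
                current.append(event)
        sessions.append((user_id, current))
    return sessions


if __name__ == "__main__":
    # Sample events for one user, deliberately out of timestamp order.
    sample = [
        {"user_id_s": "u1", "timestamp_tdt": datetime(2020, 1, 1, 10, 0)},
        {"user_id_s": "u1", "timestamp_tdt": datetime(2020, 1, 1, 12, 0)},
        {"user_id_s": "u1", "timestamp_tdt": datetime(2020, 1, 1, 10, 5)},
        {"user_id_s": "u1", "timestamp_tdt": datetime(2020, 1, 1, 12, 10)},
    ]
    print(len(sessionize(sample, sort_by_time=True)))
    print(len(sessionize(sample, sort_by_time=False)))
```

In this sketch, sorted input yields one group per logical session, while unsorted input splits the same events into more, smaller groups. That is a rough analogue of the additional partial documents mentioned above.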