# Built-in SQL Aggregation Jobs


## Built-in SQL aggregation jobs

When you enable signals for a collection, Fusion automatically creates the necessary `_signals` and `_signals_aggr` collections, plus several [Parameterized SQL Aggregation jobs](/docs/4/fusion-ai/reference/jobs/parameterized-sql-aggregation) for signal processing and aggregation. The procedure is described under **Enable or Disable Signals**:

<Accordion title="Enable or Disable Signals">
  You can enable and disable signals using the [Fusion UI](#using-the-ui) or the [REST API](#using-the-collection-features-api).

  <Tip>When you disable signals, the [aggregation jobs](/docs/4/fusion-ai/concepts/signals-and-aggregations/aggregations/sql-aggregations) are deleted, but the `_signals` and `_signals_aggr` collections are not, so your legacy signal data remains intact.</Tip>

  ## Using the UI

  When you create a collection using the Fusion UI, signals are enabled and a signals collection is created by default. You can also enable and disable signals for existing collections using the Collections Manager.

  **Enable signals for a collection**

  1. In the Fusion workspace, navigate to **Collections** > **Collections Manager**.
  2. Hover over the primary collection for which you want to enable signals.
  3. Click <img className="inline-image" src="https://mintcdn.com/lucidworks/5yWZ-KtZuBe4Y_Fg/assets/images/4.0/icons/configure.png?fit=max&auto=format&n=5yWZ-KtZuBe4Y_Fg&q=85&s=5e31868577432815d5efac37dfac8532" width="56" height="54" data-path="assets/images/4.0/icons/configure.png" /> **Configure** to open the drop-down menu.

       <img src="https://mintcdn.com/lucidworks/qCaM85k6rX7hs1DP/assets/images/4.0/signals-enable.png?fit=max&auto=format&n=qCaM85k6rX7hs1DP&q=85&s=d75eef5c5768c4ec35a52310afc5238e" alt="Enable Signals" width="2459" height="967" data-path="assets/images/4.0/signals-enable.png" />
  4. Click **Enable Signals**.\
     The **Enable Signals** window appears, with a list of collections and jobs that are created when you enable signals.

       <img src="https://mintcdn.com/lucidworks/qCaM85k6rX7hs1DP/assets/images/4.0/signals-enable2.png?fit=max&auto=format&n=qCaM85k6rX7hs1DP&q=85&s=4f661b9fab49e6e1cfa0786e439c20e3" alt="Enable Signals" width="2560" height="1336" data-path="assets/images/4.0/signals-enable2.png" />
  5. Click **Enable Signals**.

  **Disable signals for a collection**

  1. In the Fusion workspace, navigate to **Collections** > **Collections Manager**.
  2. Hover over the primary collection for which you want to disable signals.
  3. Click <img className="inline-image" src="https://mintcdn.com/lucidworks/5yWZ-KtZuBe4Y_Fg/assets/images/4.0/icons/configure.png?fit=max&auto=format&n=5yWZ-KtZuBe4Y_Fg&q=85&s=5e31868577432815d5efac37dfac8532" width="56" height="54" data-path="assets/images/4.0/icons/configure.png" /> **Configure** to open the drop-down menu.
  4. Click **Disable Signals**.\
     The **Disable Signals** window appears, with a list of jobs that are deleted when you disable signals.
  5. Click **Disable Signals**.\
     Your `_signals` and `_signals_aggr` collections remain intact so that you can access your legacy signals data.

  ## Using the Collection Features API

  Using the API, the [`/collections/{collection}/features/{feature}`](/api-reference/collections/get-collection-features) endpoint enables or disables signals for any collection:

  **Check whether signals are enabled for a collection**

  ```bash wrap theme={"dark"}
  curl -u USERNAME:PASSWORD http://localhost:{api-port}/api/collections/COLLECTION_NAME/features/signals
  ```

  **Enable signals for a collection**

  ```bash wrap theme={"dark"}
  curl -u USERNAME:PASSWORD -X PUT -H "Content-type: application/json" -d '{"enabled" : true}' http://localhost:{api-port}/api/collections/COLLECTION_NAME/features/signals
  ```

  **Disable signals for a collection**

  ```bash wrap theme={"dark"}
  curl -u USERNAME:PASSWORD -X PUT -H "Content-type: application/json" -d '{"enabled" : false}' http://localhost:{api-port}/api/collections/COLLECTION_NAME/features/signals
  ```
</Accordion>
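
The Collection Features API calls shown in the accordion can also be scripted. The following is a minimal Python sketch, not an official client: the helper name `signals_feature_request`, the collection name `products`, and port `8765` are illustrative assumptions, so substitute your own values. It only builds the request objects; sending them is left commented out.

```python
import base64
import json
import urllib.request


def signals_feature_request(base_url, collection, username, password, enabled=None):
    """Build a request for the signals feature endpoint (hypothetical helper).

    enabled=None builds the GET status check; True or False builds the PUT
    that enables or disables signals.
    """
    url = "{}/api/collections/{}/features/signals".format(base_url.rstrip("/"), collection)
    token = base64.b64encode("{}:{}".format(username, password).encode("utf-8")).decode("ascii")
    headers = {"Authorization": "Basic " + token}
    if enabled is None:
        return urllib.request.Request(url, headers=headers, method="GET")
    headers["Content-Type"] = "application/json"
    body = json.dumps({"enabled": enabled}).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="PUT")


# Port 8765 and the collection name "products" are placeholders.
check = signals_feature_request("http://localhost:8765", "products", "USERNAME", "PASSWORD")
enable = signals_feature_request("http://localhost:8765", "products", "USERNAME", "PASSWORD", enabled=True)
print(check.get_method(), check.full_url)
print(enable.get_method(), enable.data)

# To actually send a request against a running Fusion instance:
# with urllib.request.urlopen(enable) as response:
#     print(response.status)
```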

**Signals aggregation jobs**

| Job                                                                                                       | Default input collection  | Default output collection      | Default schedule |
| --------------------------------------------------------------------------------------------------------- | ------------------------- | ------------------------------ | ---------------- |
| [`COLLECTION_NAME_click_signals_aggregation`](#collection-name-click-signals-aggregation)                 | `COLLECTION_NAME_signals` | `COLLECTION_NAME_signals_aggr` | Every 15 minutes |
| [`COLLECTION_NAME_session_rollup`](#collection-name-session-rollup)                                       | `COLLECTION_NAME_signals` | `COLLECTION_NAME_signals`      | Every 15 minutes |
| [`COLLECTION_NAME_user_item_preferences_aggregation`](#collection-name-user-item-preferences-aggregation) | `COLLECTION_NAME_signals` | `COLLECTION_NAME_signals_aggr` | Once per day     |
| [`COLLECTION_NAME_user_query_history_aggregation`](#collection-name-user-query-history-aggregation)       | `COLLECTION_NAME_signals` | `COLLECTION_NAME_signals_aggr` | Once per day     |

When signals are enabled, you can view these jobs at **Collections** > **Jobs**. Each one is explained in more detail below.

<a name="collection-name-click-signals-aggregation" />

### COLLECTION\_NAME\_click\_signals\_aggregation

The `COLLECTION_NAME_click_signals_aggregation` job computes a time-decayed weight for each document, query, and filters group in the signals collection. Fusion computes the weight for each group using an exponential time decay on the signal count (30-day half-life) and a weighted sum based on the signal type. This approach gives more weight to a signal that represents a user purchasing an item than to one that represents a user just clicking on an item.
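
To make the idea concrete, here is a rough Python sketch of a half-life decay combined with per-type weights. It is an illustration under stated assumptions, not Fusion's actual SQL: the formula `count * type_weight * 0.5 ** (age_days / 30)`, the field names (`doc_id`, `query`, `filters`, `type`, `count`, `timestamp`), and the weight values are all hypothetical.

```python
from datetime import datetime, timedelta, timezone

HALF_LIFE_DAYS = 30.0  # the 30-day half-life mentioned in the text


def decayed_weight(count, signal_time, ref_time, type_weight, half_life_days=HALF_LIFE_DAYS):
    """Assumed formula: the contribution halves every half_life_days."""
    age_days = (ref_time - signal_time).total_seconds() / 86400.0
    return count * type_weight * 0.5 ** (age_days / half_life_days)


def aggregate(signals, type_weights, ref_time):
    """Sum decayed, type-weighted counts per (doc_id, query, filters) group."""
    totals = {}
    for s in signals:
        if s["type"] not in type_weights:
            continue  # signal types without a configured weight are skipped
        key = (s["doc_id"], s["query"], s["filters"])
        totals[key] = totals.get(key, 0.0) + decayed_weight(
            s["count"], s["timestamp"], ref_time, type_weights[s["type"]]
        )
    return totals


ref = datetime(2024, 1, 31, tzinfo=timezone.utc)
weights = {"click": 1.0, "cart": 10.0, "purchase": 25.0}  # hypothetical weights
signals = [
    {"doc_id": "d1", "query": "tv", "filters": "", "type": "click",
     "count": 4, "timestamp": ref},
    {"doc_id": "d1", "query": "tv", "filters": "", "type": "purchase",
     "count": 1, "timestamp": ref - timedelta(days=30)},
]
result = aggregate(signals, weights, ref)
print(result)
```

In this sketch, the 30-day-old purchase contributes half of its weight (12.5), which still outweighs the four fresh clicks (4.0).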

You can customize the signal types and weights for this job by changing the `signalTypeWeights` SQL parameter in the Fusion Admin UI.

<img src="https://mintcdn.com/lucidworks/qCaM85k6rX7hs1DP/assets/images/4.0/signaltypeweights.png?fit=max&auto=format&n=qCaM85k6rX7hs1DP&q=85&s=360a9921e6ceb19b3560ac2b11ee2bdb" alt="signalTypeWeights" width="1186" height="281" data-path="assets/images/4.0/signaltypeweights.png" />

When the SQL aggregation job runs, Fusion translates the `signalTypeWeights` parameter into a `WHERE IN` clause to filter signals by the specified types (click, cart, purchase), and also passes the parameter into the `weighted_sum` SQL function. Notice that Fusion only displays the SQL parameters and not the actual SQL for this job. This is to simplify the configuration because, in most cases, you only need to change the parameters and not worry about the actual SQL. However, if you need to change the SQL for this job, you can edit it under the **Advanced** toggle on the form.
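
The translation described above can be approximated as follows. This is a sketch only: the function names, the `type` field name, and the weight values are assumptions, and Fusion's own translation may differ in detail. No SQL escaping is attempted.

```python
def parse_signal_type_weights(spec):
    """Parse a 'type:weight,type:weight' string into an ordered dict."""
    weights = {}
    for pair in spec.split(","):
        name, _, value = pair.partition(":")
        weights[name.strip()] = float(value)
    return weights


def where_in_clause(weights, field="type"):
    """Build a WHERE IN filter over the configured signal types.

    The field name 'type' is an assumption; names are not escaped.
    """
    quoted = ", ".join("'{}'".format(name) for name in weights)
    return "WHERE {} IN ({})".format(field, quoted)


# The weight values here are hypothetical.
weights = parse_signal_type_weights("click:1.0,cart:10.0,purchase:25.0")
clause = where_in_clause(weights)
print(weights)
print(clause)
```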

<Tip>
  You can configure the `COLLECTION_NAME_click_signals_aggregation` job to use a Parquet file as the source of raw signals instead of a Fusion signals collection, as follows.
</Tip>

1. Use the [Catalog API](/docs/4/fusion-server/reference/api/catalog-api) to set up a "catalog project" in Fusion:

   ```bash
   # Sample request: create a catalog project named "fusion_test"
   curl -u USERNAME:PASSWORD -X POST -H "Content-type:application/json" --data-binary '{
     "name": "fusion_test",
     "assetType": "project",
     "description": "test",
     "cacheOnLoad": false
   }' http://localhost:{api-port}/api/catalog
   ```

2. Create a table asset in the project created in the previous step:

   ```bash
   # Sample request: register a Parquet file as table "doc_test" in project "fusion_test"
   curl -u USERNAME:PASSWORD -X POST -H "Content-type:application/json" --data-binary '{
     "name": "doc_test",
     "assetType": "table",
     "projectId": "fusion_test",
     "description": "for documentation",
     "tags": ["fusion"],
     "format": "parquet",
     "cacheOnLoad": false,
     "options" : [ "path -> <path to your .parquet file>"]
   }' http://localhost:{api-port}/api/catalog/fusion_test/assets
   ```

<Note>
  The Parquet file referenced above must contain every field that the SQL script of the `COLLECTION_NAME_click_signals_aggregation` job selects or uses.
</Note>

3. In the `COLLECTION_NAME_click_signals_aggregation` job, change the source from `${collection}_signals` to `catalog:${project_name}.${asset_name}` (for example, `catalog:fusion_test.doc_test` for the sample requests above).

   <img src="https://mintcdn.com/lucidworks/NR6PWuMFSzL-y-FO/assets/images/4.2/catalog_input.png?fit=max&auto=format&n=NR6PWuMFSzL-y-FO&q=85&s=6d2472af20225bf96c22c59516e1cffb" alt="catalog input" width="1938" height="1188" data-path="assets/images/4.2/catalog_input.png" />
4. Start the job.
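
The payloads and the source string from the steps above can also be generated programmatically. The sketch below mirrors the sample requests. The helper names are hypothetical, and the path `/data/signals.parquet` is a placeholder, not a real location.

```python
import json


def catalog_project(name, description):
    """Payload for creating a catalog project (mirrors step 1)."""
    return {"name": name, "assetType": "project",
            "description": description, "cacheOnLoad": False}


def parquet_table(name, project_id, parquet_path, description="", tags=None):
    """Payload for registering a Parquet file as a table asset (mirrors step 2)."""
    return {
        "name": name,
        "assetType": "table",
        "projectId": project_id,
        "description": description,
        "tags": tags or [],
        "format": "parquet",
        "cacheOnLoad": False,
        "options": ["path -> " + parquet_path],
    }


def catalog_source(project_name, asset_name):
    """Value for the job's source field (mirrors step 3)."""
    return "catalog:{}.{}".format(project_name, asset_name)


project = catalog_project("fusion_test", "test")
# "/data/signals.parquet" is a placeholder path.
table = parquet_table("doc_test", "fusion_test", "/data/signals.parquet",
                      description="for documentation", tags=["fusion"])
source = catalog_source(project["name"], table["name"])
print(json.dumps(project, indent=2))
print(json.dumps(table, indent=2))
print(source)
```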

<a name="collection-name-session-rollup" />

### COLLECTION\_NAME\_session\_rollup

The `COLLECTION_NAME_session_rollup` job aggregates related user activity into a session signal that contains activity count, duration, and keywords (based on user search terms). The Fusion App Insights application uses this job to show reports about user sessions. Use the `elapsedSecsSinceLastActivity` and `elapsedSecsSinceSessionStart` parameters to determine when a user session is considered to be complete. You can edit the SQL using the **Advanced** toggle.

The `COLLECTION_NAME_session_rollup` job uses the `COLLECTION_NAME_signals` collection as both its input and its output collection. Unlike other aggregation jobs, which write aggregated documents to the `COLLECTION_NAME_signals_aggr` collection, the `COLLECTION_NAME_session_rollup` job creates session signals and saves them back to the `COLLECTION_NAME_signals` collection.
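
As a rough illustration of how the two elapsed-time parameters might interact, the sketch below treats a session as complete when either threshold is reached. This either/or rule is an assumption made for illustration, not taken from the job's SQL, and the threshold values are hypothetical rather than Fusion defaults.

```python
def session_is_complete(session_start, last_activity, now,
                        elapsed_secs_since_last_activity,
                        elapsed_secs_since_session_start):
    """Return True when a session should be rolled up (assumed semantics).

    All times are epoch seconds. A session counts as complete when it has
    been idle long enough OR has been open long enough.
    """
    idle_secs = now - last_activity
    age_secs = now - session_start
    return (idle_secs >= elapsed_secs_since_last_activity
            or age_secs >= elapsed_secs_since_session_start)


# Hypothetical thresholds: 30 minutes idle, 4 hours total.
IDLE_LIMIT = 1800
AGE_LIMIT = 14400

still_active = session_is_complete(0, 100, 200, IDLE_LIMIT, AGE_LIMIT)
gone_idle = session_is_complete(0, 100, 100 + IDLE_LIMIT, IDLE_LIMIT, AGE_LIMIT)
too_long = session_is_complete(0, 14390, 14400, IDLE_LIMIT, AGE_LIMIT)
print(still_active, gone_idle, too_long)
```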

<a name="collection-name-user-item-preferences-aggregation" />

### COLLECTION\_NAME\_user\_item\_preferences\_aggregation

The `COLLECTION_NAME_user_item_preferences_aggregation` job computes an aggregated weight for each user/item combination found in the signals collection. The weight for each group is computed using an exponential time decay on the signal count (30-day half-life) and a weighted sum based on the signal type.

<Note>
  This job is a prerequisite for the [ALS recommender job](/docs/4/fusion-ai/reference/jobs/als-recommender).
</Note>

**Job configuration tips:**

* In the job configuration panel, click **Advanced** to see all of the available options.
* When aggregating signals for the first time, uncheck the **Aggregate and Merge with Existing** checkbox. In production, once the jobs run automatically, you can check this box. If you want to discard older signals, leave the box unchecked so that the new aggregation results completely replace the old ones.
* If the original signal data has missing fields, edit the SQL query to fill in missing values for fields such as `count_i` (the number of times a user interacted with an item in a session).
* The aggregation job can sometimes run faster if you uncheck the **Job Skip Check Enabled** box. Do this when first loading the signals.
* Use the `signalTypeWeights` SQL parameter to set the correct signal types and weights for your dataset. Its value is a comma-delimited list of signal types and their stakeholder-defined level of importance. Think of this numeric value as a weight that tells which type of signal is most important for determining a user’s interest in an item. An example of how to weight the signal types is shown below:

  ```
  signal_type_1:1.0, signal_type_2:3.0, signal_type_3:20.0
  ```

  [Rank your signal types](/docs/4/fusion-ai/concepts/signals-and-aggregations/signals/overview) to determine which types should be added. Add only the signal types that are significant. Signal types that are not added to the list will not be included in the aggregation job, and for some signal types this is fine.

  Keep the weights within a few orders of magnitude of each other; the spread of values should not be wide. For instance, `click:1.0, cart:100000.0` is too wide a spread. The values `click:1.0` and `cart:50.0` would be a reasonable setting, indicating that the signal type `cart` is 50 times more important for measuring a user’s interest in an item.
* The Time Range field value is used in a weight decay function that reduces the importance of signals the older they are. This time range is in days and the default is 30 days. If you want to increase this time because the time duration of your signals is greater than 30 days, edit the SQL query to reflect the desired number of days. The SQL query is visible when you click **Advanced** in the job configuration panel. Modify the following line in the SQL query, changing "30 days" to your desired timeframe:

  ```
  time_decay(count_i, timestamp_tdt, "30 days", ref_time, weight_d) AS typed_weight_d
  ```
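
One way to sanity-check the spread of a candidate `signalTypeWeights` setting before saving it is to compare the largest and smallest weights. The helper below is a hypothetical sketch. It reports the spread in orders of magnitude and leaves the judgment of what is too wide to you.

```python
import math


def spread_in_orders_of_magnitude(weights):
    """Return log10(max weight / min weight) for a dict of positive weights."""
    values = list(weights.values())
    if not values or min(values) <= 0:
        raise ValueError("weights must be a non-empty dict of positive numbers")
    return math.log10(max(values) / min(values))


reasonable = spread_in_orders_of_magnitude({"click": 1.0, "cart": 50.0})
too_wide = spread_in_orders_of_magnitude({"click": 1.0, "cart": 100000.0})
print(round(reasonable, 2), round(too_wide, 2))
```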

If recommendations are enabled for your collection, then the ALS recommender job is automatically created with the name `COLLECTION_NAME_item_recommendations` and scheduled to run after this job completes. Consequently, you should only run this aggregation once or twice a day, because training a recommender model is a complex, long-running job that requires significant resources from your Fusion cluster.

<a name="collection-name-user-query-history-aggregation" />

### COLLECTION\_NAME\_user\_query\_history\_aggregation

The `COLLECTION_NAME_user_query_history_aggregation` job computes an aggregated weight for each user/query combination found in the signals collection. The weight for each group is computed using an exponential time decay on the signal count (30-day half-life) and a weighted sum based on the signal type. Use the `signalTypeWeights` parameter to set the correct signal types and weights for your dataset. You can use the results of this job to boost queries for a user based on their past query activity.
