Skip to main content

Documentation Index

Fetch the complete documentation index at: https://doc.lucidworks.com/llms.txt

Use this file to discover all available pages before exploring further.

In addition to the SQL functions provided by Spark, Fusion provides several additional functions to simplify common aggregation tasks. To recap, a UDAF aggregates multiple rows for the same group by key and a UDF performs some operation on a single row. weighted_sum The weighted_sum UDAF takes a weight, type, and type-weight mapping to produce an aggregated weight. For example, consider the following SQL snippet:
    SELECT query,
          doc_id,
          filters,
          weighted_sum(typed_weight_d, type, 'click:1.0,cart:10.0') AS weight_d
     FROM signal_type_groups
 GROUP BY query, doc_id, filters
When applied to the rows in the table below, the weighted_sum function produces a final weight_d of 12.0 (21.0 + 110.0). The UDAF is passed rows grouped by query, doc_id, and filters.
querytypedoc_idfilterstyped_weight_d
iPadclick1gear2
iPadcart1gear1
significantTerms
The significantTerms UDF allows the user to query data for significant terms within the results. This identifies terms that appear an unusually frequently in a specific result set relative to the index. It can be used to identify anomalies in data.
This UDF can be used on tokenized text and multi-value and single-value string fields. Example:
select significantTerms(fieldX), score from collectionName where fieldy = 'my-query' limit 10
timediff The timediff UDF computes the difference, in milliseconds, between to timestamps in the same row. From the session_rollup job, the timediff function computes the difference between the current time and the last activity in a session. click_pos The click_pos UDF computes either a reciprocal rank or a raw click position (using a 0-based index) of a document in a page of results. This UDF is used to compute the mean reciprocal rank (MRR) for experiments. For example, given the following list of documents and a doc ID, the click_pos UDF will return 2:
docs: a,b,c,d
doc ID: c
concat_text The concat_text UDF combines multivalued text fields coming from Solr into a field with a single value delimited by spaces. This UDF is useful when a field returned from Solr uses the _txt suffix, which indicates a multivalued text field.