Fusion 5.12

    Trending Recommender Jobs

    The Trending Recommender job analyzes signals to measure customer engagement over time. Use this job to identify spikes in popularity for specific items or queries, then display those items to your users or analyze the trends for business purposes. You can configure any time window, such as daily, weekly, or monthly.
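
    For example, to surface items trending over the most recent week against a 30-day baseline, you would set the job's targetTimeRange to 7 and refTimeRange to 30 (both parameters are described in the reference below); a monthly view would simply use a larger target window.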

    Input

    signals (the COLLECTION_NAME_signals collection by default)

    Output

    Trending items or queries

    Signals fields:

    query (required when identifying trending queries instead of trending items)
    count_i (required)
    type (required)
    timestamp_tdt (required)
    user_id
    doc_id (required)
    session_id
    fusion_query_id

    For detailed steps to configure this job, see Identify Trending Documents or Products.

    Trending Recommender

    id - string (required)

    The ID for this Spark job. Used in the API to reference this job. Allowed characters: a-z, A-Z, dash (-) and underscore (_). Maximum length: 63 characters.

    <= 63 characters

    Match pattern: [a-zA-Z][_\-a-zA-Z0-9]*[a-zA-Z0-9]?
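
    For example, weekly-trending-items is a valid ID, while 1-trending is not, because the ID must begin with a letter.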

    sparkConfig - array[object]

    Spark configuration settings.

    object attributes:
        key - string (required). Display name: Parameter Name
        value - string. Display name: Parameter Value
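
    For instance, to give the job's Spark executors more memory, you could add a standard Spark property as a key/value pair (the value shown is only illustrative):

        "sparkConfig": [
          { "key": "spark.executor.memory", "value": "4g" }
        ]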

    trainingCollection - string (required)

    Solr Collection containing labeled training data

    >= 1 characters

    fieldToVectorize - string

    Fields to extract from Solr (not used for other formats)

    >= 1 characters

    dataFormat - string (required)

    Spark-compatible format that contains the training data, such as 'solr', 'parquet', or 'orc'

    >= 1 characters

    Default: solr

    trainingDataFrameConfigOptions - object

    Additional spark dataframe loading configuration options

    trainingDataFilterQuery - string

    Solr query to use when loading training data if using Solr

    Default: *:*

    sparkSQL - string

    Use this field to create a Spark SQL query for filtering your input data. The input data will be registered as spark_input.

    Default: SELECT * from spark_input
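
    As a sketch, the following query would drop any signals that have no user ID before the job analyzes them; adjust the field name to match your signals schema:

        "sparkSQL": "SELECT * FROM spark_input WHERE user_id IS NOT NULL"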

    trainingDataSamplingFraction - number

    Fraction of the training data to use

    <= 1

    exclusiveMaximum: false

    Default: 1

    randomSeed - integer

    For any deterministic pseudorandom number generation

    Default: 1234

    outputCollection - string

    Solr Collection to store model-labeled data to

    dataOutputFormat - string

    Spark-compatible output format, such as 'solr' or 'parquet'

    >= 1 characters

    Default: solr

    sourceFields - string

    Solr fields to load (comma-delimited). Leave empty to allow the job to select the required fields to load at runtime.

    partitionCols - string

    If writing to non-Solr sources, this field will accept a comma-delimited list of column names for partitioning the dataframe before writing to the external output
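
    For example, if the output format is parquet, entering a value such as year,month (hypothetical column names) partitions the written files first by year and then by month.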

    writeOptions - array[object]

    Options used when writing output to Solr or other sources

    object attributes:
        key - string (required). Display name: Parameter Name
        value - string. Display name: Parameter Value

    readOptions - array[object]

    Options used when reading input from Solr or other sources.

    object attributes:
        key - string (required). Display name: Parameter Name
        value - string. Display name: Parameter Value

    refTimeRange - integer (required)

    Number of reference days: number of days to use as baseline to find trends (calculated from today)

    targetTimeRange - integer (required)

    Number of target days: number of days to use as target to find trends (calculated from today)

    numWeeksRef - number

    If using filter queries for reference and target time ranges, enter the value of (reference days / target days) here (if not using filter queries, this will be calculated automatically)
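
    For example, with a 28-day reference range and a 7-day target range, enter 28 / 7 = 4.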

    sparkPartitions - integer

    Spark will re-partition the input to have this number of partitions. Increase for greater parallelism

    Default: 200

    countField - string (required)

    Field containing the number of times an event (e.g. click) occurs for a particular query; count_i in the raw signal collection or aggr_count_i in the aggregated signal collection.

    >= 1 characters

    Default: aggr_count_i

    referenceTimeFilterQuery - string

    Add a Spark SQL filter query here for greater control of time filtering

    targetFilterTimeQuery - string

    Add a Spark SQL filter query here for greater control of time filtering
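
    As a sketch, assuming signals carry their event time in timestamp_tdt, the following pair compares the four weeks before the most recent week (reference) against the most recent week (target); Spark SQL's date_sub function subtracts a number of days from a date. With these windows you would also set numWeeksRef to 28 / 7 = 4:

        "referenceTimeFilterQuery": "timestamp_tdt >= date_sub(current_date(), 35) AND timestamp_tdt < date_sub(current_date(), 7)",
        "targetFilterTimeQuery": "timestamp_tdt >= date_sub(current_date(), 7)"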

    typeField - string (required)

    Enter type field (default is type)

    Default: aggr_type_s

    timeField - string (required)

    Enter time field (default is timestamp_tdt)

    Default: timestamp_tdt

    docIdField - string (required)

    Enter document id field (default is doc_id)

    Default: doc_id_s

    types - string (required)

    Enter a comma-separated list of event types to filter on

    Default: click,add

    recsCount - integer (required)

    Maximum number of recs to generate (or -1 for no limit)

    Default: 500

    type - string (required)

    Default: trending-recommender

    Allowed values: trending-recommender
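
    Putting the pieces together, a job definition that looks for items trending over the most recent week relative to a 30-day baseline might look like the following sketch. It reads the raw signals collection, so the count, type, and document ID fields use the raw-signal names described above rather than the aggregated-signal defaults; the job ID and collection names are placeholders to adapt to your own data:

        {
          "id": "trending-products-weekly",
          "type": "trending-recommender",
          "trainingCollection": "COLLECTION_NAME_signals",
          "dataFormat": "solr",
          "refTimeRange": 30,
          "targetTimeRange": 7,
          "countField": "count_i",
          "typeField": "type",
          "timeField": "timestamp_tdt",
          "docIdField": "doc_id",
          "types": "click,add",
          "recsCount": 500,
          "outputCollection": "COLLECTION_NAME_trending_items"
        }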