Fields in the configuration file for this job:

- id. Required field.
- signalsCollection. Required field.
- sparkConfig. Options to use in this job, entered as parameter name:parameter value.
- searchLogsPipeline.
- joinKeySignals.
- searchLogsAddOpts. Options to use when loading the search logs collection, entered as property name:property value.
- signalsAddOpts. Options to use when loading the signals collection, entered as property name:property value.
- filterQueries. An array[string] of filter queries to apply when selecting top queries from the query signals in the signals collection.
- topQueriesLimit.
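To show how these field names fit together, here is a minimal sketch of a job configuration, assuming a JSON configuration file. Only the field names come from the list above; the job's purpose (joining search logs with query signals to select top queries) is inferred from those names, and every value, collection name, pipeline name, and option below is an illustrative assumption, not something stated in this document.

    {
      "id": "example-signals-job",
      "signalsCollection": "my_app_signals",
      "searchLogsPipeline": "my_app",
      "joinKeySignals": "query_id",
      "searchLogsAddOpts": { "rows": "10000" },
      "signalsAddOpts": { "rows": "10000" },
      "filterQueries": [ "type:response" ],
      "topQueriesLimit": 100,
      "sparkConfig": { "spark.executor.memory": "2g" }
    }

Note that filterQueries is an array[string], so more than one filter query can be supplied when selecting top queries.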
Fields in the configuration file for this job:

- id. Required field.
- outputCollection. Required field.
- inputCollection. Required field.
- rankingExperimentConfig.inputCollection. Optional field.
- rankingExperimentConfig.experimentId. Optional field.
- rankingExperimentConfig.experimentObjectiveName. Optional field.
- rankingExperimentConfig.defaultProfile. Optional field.
- sparkConfig. Options to use in this job, entered as parameter name:parameter value.
- rankingPositionK.
- metricsPerQuery. If set to true, the job calculates the ranking metrics per query in the ground truth dataset, and saves the metrics data to the Output collection designated for this job.
- groundTruthConfig.filterQueries.
- groundTruthConfig.queryField.
- groundTruthConfig.docIdField.
- groundTruthConfig.weightField.
- rankingExperimentConfig.queryPipelines.
- rankingExperimentConfig.docIdField.
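A sketch of how these fields might look together, again assuming a JSON configuration file. The dotted names above are shown here as nested groundTruthConfig and rankingExperimentConfig objects; that nesting, like every value below, is an illustrative assumption rather than something stated in this document.

    {
      "id": "example-ranking-metrics-job",
      "inputCollection": "ground_truth",
      "outputCollection": "ranking_metrics",
      "rankingPositionK": 10,
      "metricsPerQuery": true,
      "groundTruthConfig": {
        "queryField": "query",
        "docIdField": "doc_id",
        "weightField": "weight",
        "filterQueries": [ "*:*" ]
      },
      "rankingExperimentConfig": {
        "inputCollection": "products",
        "queryPipelines": [ "default", "experimental" ],
        "docIdField": "id",
        "experimentId": "experiment-001",
        "experimentObjectiveName": "ndcg",
        "defaultProfile": "products"
      },
      "sparkConfig": { "spark.sql.shuffle.partitions": "8" }
    }

Because metricsPerQuery is set to true in this sketch, the job would calculate ranking metrics per query in the ground truth dataset and save the metrics data to the designated output collection, as described above.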
Fields in the configuration file for this job:

- id. Required field.
- trainingCollection. Required field.
- outputCollection. Required field.
- dataFormat. The supported formats are solr, parquet, and orc. Required field.
- fieldToVectorize. Required field.
- clusterIdField. Required field.
- freqTermField. Optional field.
- clusterLabelField. Optional field.
- maxDF. Values <1.0 indicate a percentage, 1.0 is 100 percent, and >1.0 indicates the exact number. Optional field.
- minDF. Values <1.0 indicate a percentage, 1.0 is 100 percent, and >1.0 indicates the exact number. Optional field.
- numKeywordsPerLabel. Optional field.
- analyzerConfig. Optional field.
- sparkSQL. For example, SELECT * from spark_input; the input data is registered as spark_input.
- dataOutputFormat. The supported formats are solr and parquet.
- partitionCols.
- readOptions. Options to use when reading input from Solr or other sources, entered as parameter name:parameter value.
- writeOptions. Options to use when writing output to Solr or other sources, entered as parameter name:parameter value.
- trainingDataFrameConfigOptions.
- trainingDataSamplingFraction.
- randomSeed.
- sourceFields.
- modelId. If no value is provided, the Spark Job ID is used.
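The field names above (clusterIdField, clusterLabelField, numKeywordsPerLabel) suggest a cluster-labeling style job; under that assumption, here is an illustrative configuration sketch in JSON. All values, collection names, and options are invented for illustration and are not taken from this document.

    {
      "id": "example-cluster-labeling-job",
      "trainingCollection": "docs_clustered",
      "outputCollection": "docs_cluster_labels",
      "dataFormat": "solr",
      "fieldToVectorize": "body_t",
      "clusterIdField": "cluster_id",
      "clusterLabelField": "cluster_label",
      "freqTermField": "freq_terms",
      "numKeywordsPerLabel": 5,
      "maxDF": 0.75,
      "minDF": 5.0,
      "sparkSQL": "SELECT * from spark_input",
      "dataOutputFormat": "solr",
      "readOptions": { "rows": "10000" },
      "writeOptions": { "commit_within": "5000" },
      "trainingDataSamplingFraction": 1.0,
      "randomSeed": 8180,
      "modelId": "example-cluster-labeling-model"
    }

In this sketch maxDF is 0.75, which is read as a percentage because it is <1.0, while minDF is 5.0, which is read as an exact document count because it is >1.0, following the rules given above.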
Fields in the configuration file for this job:

- id. Required field.
- trainingCollection. Required field.
- outputCollection. Required field.
- dataFormat. The supported formats are solr, parquet, and orc. Required field.
- outputOutliersOnly. If selected (set to true), only outliers are saved in the job's output collection. If not selected (set to false), the entire dataset is saved in the job's output collection. Optional field.
- fieldToVectorize. Fields can be entered as field1:weight1,field2:weight2, etc. Required field.
- uidField. Required field.
- outlierGroupIdField. Optional field.
- outlierGroupLabelField. Optional field.
- freqTermField. Optional field.
- distToCenterField. Optional field.
- maxDF. Values <1.0 indicate a percentage, 1.0 is 100 percent, and >1.0 indicates the exact number. Optional field.
- minDF. Values <1.0 indicate a percentage, 1.0 is 100 percent, and >1.0 indicates the exact number. Optional field.
- numKeywordsPerLabel. Optional field.
- analyzerConfig. Optional field.
- sparkSQL. For example, SELECT * from spark_input; the input data is registered as spark_input.
- dataOutputFormat. The supported formats are solr and parquet.
- partitionCols.
- trainingDataFilterQuery.
- readOptions. Options to use when reading input from Solr or other sources, entered as parameter name:parameter value.
- writeOptions. Options to use when writing output to Solr or other sources, entered as parameter name:parameter value.
- trainingDataFrameConfigOptions.
- trainingDataSamplingFraction.
- randomSeed.
- sourceFields.
- outlierK.
- outlierThreshold. Values <1.0 indicate a percentage, 1.0 is 100 percent, and >1.0 indicates the exact number.
- norm. A value of -1 turns off normalization.
- modelId. If no value is provided, the Spark Job ID is used.
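The outlier-specific fields above (outlierK, outlierThreshold, outputOutliersOnly) suggest an outlier-detection style job; under that assumption, the sketch below shows one way the configuration might be filled in, again as JSON. Every value is an illustrative assumption.

    {
      "id": "example-outlier-detection-job",
      "trainingCollection": "support_tickets",
      "outputCollection": "support_ticket_outliers",
      "dataFormat": "solr",
      "outputOutliersOnly": true,
      "fieldToVectorize": "subject_t:2.0,body_t:1.0",
      "uidField": "id",
      "outlierGroupIdField": "outlier_group_id",
      "outlierGroupLabelField": "outlier_group_label",
      "distToCenterField": "dist_to_center",
      "maxDF": 0.75,
      "minDF": 5.0,
      "sparkSQL": "SELECT * from spark_input",
      "dataOutputFormat": "solr",
      "trainingDataFilterQuery": "*:*",
      "trainingDataSamplingFraction": 1.0,
      "randomSeed": 8180,
      "outlierK": 10,
      "outlierThreshold": 0.02,
      "norm": 2,
      "modelId": "example-outlier-detection-model"
    }

Here fieldToVectorize uses the field1:weight1,field2:weight2 form described above, outlierThreshold is 0.02 and so is treated as a percentage because it is <1.0, and norm could instead be set to -1 to turn off normalization. Because outputOutliersOnly is true, only the outliers would be saved in the job's output collection.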