This is the id field in the configuration file. Required field.
This is the trainingCollection field in the configuration file. Required field.
This is the outputCollection field in the configuration file. Required field.
Values include solr, parquet, and orc. This is the dataFormat field in the configuration file. Required field.
If selected (set to true), only outliers are saved in the job’s output collection. If not selected (set to false), the entire dataset is saved in the job’s output collection. This is the outputOutliersOnly field in the configuration file. Optional field.
Multiple weighted fields can be written as field1:weight1,field2:weight2, etc. This is the fieldToVectorize field in the configuration file. Required field.
This is the uidField field in the configuration file. Required field.
This is the outlierGroupIdField field in the configuration file. Optional field.
This is the outlierGroupLabelField field in the configuration file. Optional field.
This is the freqTermField field in the configuration file. Optional field.
This is the distToCenterField field in the configuration file. Optional field.
Values <1.0 indicate a percentage, 1.0 is 100 percent, and values >1.0 indicate the exact number. This is the maxDF field in the configuration file. Optional field.
Values <1.0 indicate a percentage, 1.0 is 100 percent, and values >1.0 indicate the exact number. This is the minDF field in the configuration file. Optional field.
This is the numKeywordsPerLabel field in the configuration file. Optional field.
This is the analyzerConfig field in the configuration file. Optional field.
The query SELECT * from spark_input selects all of the input data, which is registered as spark_input. This is the sparkSQL field in the configuration file.
Values include solr and parquet. This is the dataOutputFormat field in the configuration file.
This is the partitionCols field in the configuration file.
This is the trainingDataFilterQuery field in the configuration file.
Options, in parameter name:parameter value form, to use when reading input from Solr or other sources. This is the readOptions field in the configuration file.
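The rule given for maxDF and minDF (values below 1.0 are a percentage, 1.0 is 100 percent, values above 1.0 are an exact number) can be sketched in Python as follows. This is only an illustration of the rule as worded above, not the job's actual implementation: the function name resolve_df_threshold, the num_docs parameter, and the rounding of fractional results are all assumptions.

```python
def resolve_df_threshold(value: float, num_docs: int) -> int:
    """Convert a maxDF/minDF-style setting into a document count.

    Hypothetical helper: the name, signature, and rounding are assumptions.
    """
    if value < 0:
        raise ValueError("threshold must be non-negative")
    if value <= 1.0:
        # Values up to 1.0 are read as a fraction of all documents
        # (1.0 meaning 100 percent). Rounding is an assumption.
        return round(value * num_docs)
    # Values above 1.0 are read as an exact number of documents.
    return int(value)


if __name__ == "__main__":
    total = 200  # hypothetical corpus size
    for setting in (0.5, 1.0, 25.0):
        print(setting, "->", resolve_df_threshold(setting, total))
```

With a hypothetical corpus of 200 documents, a setting of 0.5 resolves to half the corpus, while a setting of 25.0 is taken as 25 documents regardless of corpus size.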
Options, in parameter name:parameter value form, to use when writing output to Solr or other sources. This is the writeOptions field in the configuration file.
This is the trainingDataFrameConfigOptions field in the configuration file.
This is the trainingDataSamplingFraction field in the configuration file.
This is the randomSeed field in the configuration file.
This is the sourceFields field in the configuration file.
This is the outlierK field in the configuration file.
Values <1.0 indicate a percentage, 1.0 is 100 percent, and values >1.0 indicate the exact number. This is the outlierThreshold field in the configuration file.
A value of -1 turns off normalization. This is the norm field in the configuration file.
If no value is given, the Spark Job ID is used. This is the modelId field in the configuration file.
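As a rough illustration of how the fields marked "Required field" fit together, the sketch below builds a configuration as a Python dictionary, checks that the required names are present, and prints it as JSON. Every value (job ID, collection names, Solr field names, weights) is hypothetical, the helper missing_required is not part of any product API, and a real configuration may need fields that this section does not list.

```python
import json

# Field names marked "Required field" in this section.
REQUIRED_FIELDS = (
    "id",
    "trainingCollection",
    "outputCollection",
    "dataFormat",
    "fieldToVectorize",
    "uidField",
)


def missing_required(config: dict) -> list:
    """Return the required field names absent from config (hypothetical helper)."""
    return [name for name in REQUIRED_FIELDS if name not in config]


# All values below are made up for illustration.
config = {
    "id": "outlier-demo",
    "trainingCollection": "docs",
    "outputCollection": "docs_outliers",
    "dataFormat": "solr",
    "fieldToVectorize": "title_t:2.0,body_t:1.0",  # field1:weight1,field2:weight2
    "uidField": "id",
    "outputOutliersOnly": True,  # optional: save only outliers
    "maxDF": 0.75,               # optional: below 1.0, so a percentage
    "minDF": 5.0,                # optional: above 1.0, so an exact number
}

problems = missing_required(config)
if problems:
    raise SystemExit("missing required fields: " + ", ".join(problems))
print(json.dumps(config, indent=2))
```

Checking for required names before submitting a configuration catches the most common omission early; it does not validate the values themselves.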