Improvements
- Phrase Extraction job improvements
  - The job now trims low-confidence phrases based on likelihood.
  - The job adds a `review` tag to the result to facilitate the review process. The tag is based on the phrase's beginning and ending parts of speech (POS) and its likelihood.
  - The output now connects phrase tokens with an underscore (`_`) to make a single token per phrase, so that complete phrases can be used as facets.
  - New metadata fields:
    - `phrases_count` shows how many times the phrase appears in the documents.
    - `word_num` shows how many words are in the phrase.
  - This release includes optimizations to the default Lucene analyzer configuration.
  - Output names are updated for clarity, such as `llr` to `likelihood` and `ngram` to `phrases`.
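
To make the Phrase Extraction output changes concrete, here is a minimal Python sketch of the described behavior. It is not the job's actual implementation. The field names `phrases`, `likelihood`, `phrases_count`, and `word_num` come from the notes above; the function name `summarize_phrases`, the `min_likelihood` threshold, and the input shape (one `(tokens, likelihood)` pair per occurrence) are assumptions made for illustration.

```python
def summarize_phrases(candidates, min_likelihood):
    """Illustrative only: trim, join, and count phrase candidates.

    candidates: iterable of (tokens, likelihood) pairs, one per occurrence
                (hypothetical input shape, not the job's real schema).
    min_likelihood: hypothetical cutoff for trimming low-confidence phrases.
    """
    rows = {}
    for tokens, likelihood in candidates:
        if likelihood < min_likelihood:
            continue  # trim low-confidence phrases based on likelihood
        phrase = "_".join(tokens)  # connect tokens into a single token per phrase
        row = rows.setdefault(
            phrase,
            {
                "phrases": phrase,
                "likelihood": likelihood,
                "phrases_count": 0,
                "word_num": len(tokens),  # number of words in the phrase
            },
        )
        row["phrases_count"] += 1  # count occurrences of the phrase
    return list(rows.values())


if __name__ == "__main__":
    sample = [
        (["machine", "learning"], 0.9),
        (["machine", "learning"], 0.9),
        (["of", "the"], 0.1),
    ]
    for row in summarize_phrases(sample, min_likelihood=0.5):
        print(row)
```

Because each phrase becomes one underscore-joined token, a value such as `machine_learning` can be used directly as a facet value.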
- The Head/Tail Analysis job now processes large data sets twice as fast.
- The Ranking Metrics job now correctly accepts values for the `queryPipelines` property in the Fusion UI.
- The Solr Query pipeline stage has two new properties to configure signals:
  - `responseSignalsEnabled`: Disable this option to prevent the stage from generating a response signal containing metadata about the response from Solr. Response signals are used by App Insights and experiments. In auto-complete pipelines, disable this option to avoid generating a response signal for each keystroke.
  - `excludeResponseSignalMatchRules`: If `responseSignalsEnabled` is “true”, you can use this property to prevent the stage from generating a response signal based on specific parameters in the query. For example, you can enable response signals in general but disable them for auto-complete queries.
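
The interaction between the two signal properties can be sketched as a small decision function. This is a hedged illustration in Python, not Fusion's implementation. The rule format for `excludeResponseSignalMatchRules` is not described in these notes, so the `{"key": ..., "value": ...}` rule shape, the function name, and the `mode` parameter in the usage example are all hypothetical.

```python
def should_emit_response_signal(query_params, response_signals_enabled, exclude_rules):
    """Illustrative only: decide whether a response signal would be generated.

    query_params: dict of query parameter names to values.
    response_signals_enabled: mirrors the responseSignalsEnabled property.
    exclude_rules: list of {"key": ..., "value": ...} dicts, a hypothetical
                   stand-in for excludeResponseSignalMatchRules.
    """
    if not response_signals_enabled:
        return False  # signals are disabled for the whole stage
    for rule in exclude_rules:
        # A matching rule suppresses the signal for this query only.
        if query_params.get(rule["key"]) == rule["value"]:
            return False
    return True


if __name__ == "__main__":
    rules = [{"key": "mode", "value": "autocomplete"}]  # hypothetical rule
    print(should_emit_response_signal({"q": "laptop"}, True, rules))
    print(should_emit_response_signal({"q": "lap", "mode": "autocomplete"}, True, rules))
```

With this arrangement, response signals stay enabled for ordinary searches while queries that match an exclusion rule, such as per-keystroke auto-complete requests, do not generate one.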

Other changes
- The Spark driver now cleans up `$FUSION_HOME/var/spark/Spark-workDir-*` directories and shaded jars correctly on Windows to prevent excessive disk consumption.
- For installations that were upgraded from 3.1.x to 4.1.0, upgrading to 4.1.1 resolves an issue that prevented successful signal aggregations with the Parameterized SQL Aggregation job.