Fusion AI 4.2.0 Release Notes
Release date: 28 February 2019
Component versions:
Solr 7.5
ZooKeeper 3.4.13
Spark 2.3.1
Jetty 9.4.11.v20180605
Ignite 2.3.0
More information about support dates can be found at Lucidworks Fusion Product Lifecycle.
New features
-
Smarter relevancy with query rewriting
Query rewriting helps ineffective queries return more relevant results by automatically correcting misspelled terms, expanding queries to include synonyms, boosting known phrases, and applying your own business rules.
4.2.0 ties together existing query rewriting features and adds new features to make query rewriting easier to configure:
-
Built-in support for business rules
Business rules provide versatile, fine-grained control over query and response rewriting. You can create, edit, deploy, and organize rules using the new Query Rewriting UI. New rules-based query stages are automatically updated based on the changes you make in the rules editor.
-
Now you can view, create, modify, and publish multiple types of query rewritings by navigating to Relevance > Query Rewriting in the Fusion UI. A new Simulator lets you preview search results to test all enabled query rewriting strategies, including unpublished query rewrites.
-
New Synonym detection job
The new Synonym and Similar Queries Detection job takes input from signals data, the Token and Phrase Spell Correction job, the Phrase Extraction job, and keyword lists in the blob store to automatically detect and store synonyms for use in query rewriting.
-
New query_rewrite_staging and query_rewrite collections are dedicated to AI-generated content that can be used to rewrite queries:
-
Head/Tail analysis job results
-
Phrase Extraction job results
-
New query pipeline stages apply the rules and results from the query_rewrite collection.
-
A new rules_simulator query profile allows you to experiment with rules and other query rewrites in the _query_rewrite_staging collection using the Simulator before deploying them to the _query_rewrite collection.
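The rewriting strategies described above (spelling correction, synonym expansion, and phrase boosting) can be pictured with a minimal Python sketch. This is not Fusion's implementation: the lookup tables, the function name rewrite_query, and the Solr-style output syntax are all assumptions made for illustration, and business rules are left out.

```python
# Hypothetical lookup tables. In Fusion, comparable data would come from
# job output and rules; these names and values are invented for illustration.
SPELLING = {"ipone": "iphone"}
SYNONYMS = {"laptop": ["notebook"]}
PHRASES = {"mac book"}  # two-word phrases only, to keep the sketch short


def rewrite_query(query: str) -> str:
    """Correct misspellings, then boost known phrases and expand synonyms."""
    # Step 1: replace known misspellings token by token.
    tokens = [SPELLING.get(t, t) for t in query.lower().split()]

    rewritten = []
    i = 0
    while i < len(tokens):
        pair = " ".join(tokens[i:i + 2])
        if pair in PHRASES:
            # Step 2: quote a known phrase and attach a boost factor.
            rewritten.append(f'"{pair}"^2')
            i += 2
        elif tokens[i] in SYNONYMS:
            # Step 3: expand a term into an OR group with its synonyms.
            group = [tokens[i]] + SYNONYMS[tokens[i]]
            rewritten.append("(" + " OR ".join(group) + ")")
            i += 1
        else:
            rewritten.append(tokens[i])
            i += 1
    return " ".join(rewritten)
```

With these tables, rewrite_query("cheap laptop") yields "cheap (laptop OR notebook)", and rewrite_query("ipone case") yields "iphone case".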
-
Refine the final search results with response rewriting
Similar to query rewriting, response rewriting can apply machine learning, business rules, or other criteria to Solr’s response, refining the final set of search results before Fusion sends them to the search application. Response rewriting can be performed using rules and a set of new query pipeline stages that fall into two categories:
-
Distribute clicks more evenly among the top N results
These stages act on the whole set of documents in the response:
-
"De-bias" results by shuffling the top N results randomly.
-
"De-bias" results by swapping the search results at any two positions, such as positions 1 and 2, positions 3 and 4, and so on.
-
Manipulate specific search result items
These stages act on individual documents in the response:
-
Response Document Exclusion stage
Drop all documents that match all of the specified rules.
-
Response Document Field Redaction stage
Remove fields that match a regular expression from a document.
-
Modify Response with Rules stage
Apply rules to the response.
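As a rough illustration of the response rewriting operations described above, the following Python sketch treats a response as a list of dictionaries. It is a sketch under stated assumptions, not the stages' actual code: the function names are hypothetical, rules are modeled as plain predicates, and redaction is assumed to match on field names.

```python
import random
import re


def shuffle_top_n(docs, n, rng=random):
    """Return a copy of docs with the first n documents in random order."""
    head = list(docs[:n])
    rng.shuffle(head)
    return head + list(docs[n:])


def swap_positions(docs, i, j):
    """Return a copy of docs with 1-based positions i and j swapped."""
    out = list(docs)
    out[i - 1], out[j - 1] = out[j - 1], out[i - 1]
    return out


def exclude_documents(docs, rules):
    """Drop every document that matches all of the given rules.

    Each rule is a hypothetical predicate taking a document (dict).
    With no rules, nothing is dropped.
    """
    if not rules:
        return list(docs)
    return [d for d in docs if not all(rule(d) for rule in rules)]


def redact_fields(doc, pattern):
    """Return a copy of doc without fields whose names match the regex.

    Matching on field names (not values) is an assumption of this sketch.
    """
    regex = re.compile(pattern)
    return {k: v for k, v in doc.items() if not regex.search(k)}
```

For example, swap_positions(docs, 1, 2) exchanges the first two results, and redact_fields(doc, r"_internal$") removes any field whose name ends in _internal.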
-
More Natural Language Processing (NLP) power in pipelines
This release includes a new NLP Annotator index stage and query stage that leverage the popular Spark NLP library from John Snow Labs, introducing these new NLP features in addition to the existing Named Entity Recognition (NER) functionality:
-
Sentence detection
-
Part-of-Speech (POS) tagging
See Natural Language Processing for an overview of Fusion AI’s NLP capabilities.
-
See the Fusion Server 4.2.0 release notes and the App Studio 4.2.0 release notes for other changes.
Improvements
-
Experiments can now be configured as multi-armed bandits by selecting the new Automatically Adjust Weights Between Variants option when setting up the experiment.
-
A Part-of-Speech (POS) model is now available in the blob store by default, as en-pos-maxent.bin, for use by the Phrase Extraction job.
-
When a new app is created, these jobs are now automatically created and scheduled to run daily, beginning 15 minutes after app creation, in the following order:
-
Token and Phrase Spell Correction
-
Phrase Extraction
This job runs if the Token and Phrase Spell Correction job succeeds.
-
Synonym and Similar Queries Detection
This job runs if the Phrase Extraction job succeeds.
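The run-if-the-previous-job-succeeds ordering described above can be sketched in Python as a simple chain. This only illustrates the dependency logic. It does not use Fusion's scheduler or job API, and the run_chain helper and the stand-in job callables are hypothetical.

```python
from typing import Callable, List, Tuple

Job = Tuple[str, Callable[[], bool]]  # (job name, callable returning success)


def run_chain(jobs: List[Job]) -> List[str]:
    """Run jobs in order and stop at the first one that fails.

    Returns the names of the jobs that completed successfully.
    """
    completed = []
    for name, run in jobs:
        if not run():
            break  # later jobs depend on this one, so they are skipped
        completed.append(name)
    return completed


# Stand-in callables; real jobs would do actual work and report success.
daily_jobs: List[Job] = [
    ("Token and Phrase Spell Correction", lambda: True),
    ("Phrase Extraction", lambda: True),
    ("Synonym and Similar Queries Detection", lambda: True),
]
```

If the Phrase Extraction stand-in returned False, run_chain would return only the first job name, and Synonym and Similar Queries Detection would not run.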
Known issues
-
In Underperforming Query Rewriting and Misspelling Detection, new or modified query rewrites cannot be saved when any of their values include trailing or leading whitespace. Remove any trailing or leading whitespace to save the query rewrite.
-
The Query Rewriting UI’s search feature joins multiple search terms using OR instead of AND. For example, searching for a rule called "Test 1" returns "Test 1", "Test 2", and "1 Rule".
-
The Query Rewriting UI’s search feature may return a "Problem with underlying storage" error if your query is *:* followed by an additional query term. This is an invalid query; use only *:* to search for all query rewrites.
-
In Underperforming Query Rewriting, job-generated query improvements do not preserve the original query’s uppercase and lowercase characters. For example, an underperforming query containing "brandX" may be rewritten to contain "brandx". You may need to manually modify the query improvements to preserve the correct cases.
-
After selecting multiple business rules where some rules have tags, adding more tags to the selected rules deletes their existing tags. To work around this, add tags to individual rules instead of adding them in bulk.
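Two of the issues above lend themselves to a small Python illustration: the search feature joining terms with OR instead of AND, and the need to remove leading or trailing whitespace before saving. The sketch below is a simplified model, not the UI's actual search or save logic. The function names and the dictionary shape of a query rewrite are assumptions.

```python
def search(names, query, operator="OR"):
    """Match names against query terms, joined with OR or AND.

    A name matches a term when the term equals one of its
    whitespace-separated words (case-insensitive).
    """
    terms = query.lower().split()
    combine = any if operator == "OR" else all
    return [n for n in names if combine(t in n.lower().split() for t in terms)]


def trim_values(rewrite):
    """Return a copy of a query rewrite (dict) with string values stripped."""
    return {k: v.strip() if isinstance(v, str) else v for k, v in rewrite.items()}


rules = ["Test 1", "Test 2", "1 Rule", "Other"]
or_hits = search(rules, "Test 1")          # current behavior (OR)
and_hits = search(rules, "Test 1", "AND")  # behavior a user might expect
```

Here or_hits contains "Test 1", "Test 2", and "1 Rule", matching the example in the known issue, while and_hits contains only "Test 1".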
See also the Fusion Server 4.2.0 known issues.