Fusion Server 5.3.0 Release Notes

Release date: 18 November 2020

Component versions:

Solr 8.6.3

ZooKeeper 3.5.7

Spark 2.4.5

Solr is updated from version 8.4.1 to 8.6.3.

New Features

Fusion

Data Models

Data models simplify getting started with Fusion by providing pre-configured objects that reduce the effort spent on basic setup tasks. They also help keep documents consistent across datasources and aligned with each object’s type.

Access data models in the Fusion UI by navigating to Indexing > Data Models.

Data Models in Fusion UI

Some connectors include built-in data models as a standard component. Others require you to manually create data models.

Audit Logs

Audit Logs are added to the DevOps Center’s Log Viewer. Audit logs provide you with a resource for tracking actions within Fusion, including the date, time, user responsible, and more.

Audit logs

Subscriptions

Subscriptions in Fusion allow you to create and configure subscriptions using Apache Pulsar.

In Fusion 5.3.0, Subscriptions are included in the Fusion UI under Indexing > Subscriptions.

Subscriptions Panel

See Subscriptions UI for more information.

Fusion AI

New features for Smart Answers

Milvus integration

Fusion 5.3 extends support for semantic search using vectors and embeddings by integrating with Milvus, a highly scalable embeddings engine. The integration streamlines deep-learning approaches to question/answer solutions such as Smart Answers, similarity-based recommendations, and regular search.

A number of new components are introduced to manage and utilize Milvus:

New deep learning models

Fusion 5.3 refreshes the deep learning models used for training and inference in semantic search-based Smart Answers. The following models are new in this release:

  • bpe_en_300d_10K

  • bpe_en_300d_200K

  • bpe_ja_300d_100K

  • bpe_ko_300d_100K

  • bpe_zh_300d_50K

  • bpe_multi_300d_320K

The bpe_{language}_{dim_size}_{vocab_size} models are general pre-trained BPEmb embeddings available for several languages, including Chinese, Japanese, and Korean (CJK), as well as a multilingual model. They are also useful when the vocabulary is very large or when the data may contain many misspellings.

  • distilbert_en

  • distilbert_multi

These are distilled, performance-optimized versions of BERT models designed to be used at scale. They are available for English and for multilingual applications.

  • biobert_v1.1

This is a BERT model that was pre-trained on large-scale biomedical corpora, which makes it more suitable for biomedical domain applications.

Answer Extraction

To enhance how Smart Answers customers interact with result sets composed of large documents, Fusion 5.3 adds Answer Extraction, which lets you extract a paragraph, sentence, or phrase that answers a question.

When a large document is presented as a result to a query, Answer Extraction extracts the sentences out of the document that are most similar to the query content. To configure this feature, you train a model that gets deployed at the end of the Smart Answers query pipeline stage, after the resulting set of large documents is returned from Solr for final ranking. The model outputs the sentences from each document that are the most similar to the query.

Answer Extraction workflow
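
As a rough illustration of selecting the sentences most similar to a query, the following Python sketch ranks sentences by the cosine similarity of simple word-count vectors. It is only a toy stand-in: Fusion uses a trained model for this step, and the function names, the naive sentence splitter, and the sample text are assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counter objects)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def extract_answers(query, document, top_k=1):
    """Return the top_k sentences of `document` most similar to `query`."""
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    query_vec = Counter(tokenize(query))
    scored = [(cosine(query_vec, Counter(tokenize(s))), s) for s in sentences]
    # Highest similarity first; the sort is stable, so ties keep document order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in scored[:top_k]]


# Hypothetical document and query, for illustration only.
doc = (
    "Fusion indexes documents from many sources. "
    "To reset your password, open the account settings page. "
    "Support is available on weekdays."
)
best = extract_answers("how do I reset my password", doc)
print(best)
```

A trained embedding model would replace the word-count vectors, but the final step has the same shape: score each sentence against the query and keep the best ones.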

The Answer Extraction model is now available from the Lucidworks official Docker to be deployed as a Seldon model. See Extract Short Answers from Longer Documents for detailed configuration steps.

New Seldon model: spaCy

The spaCy NER and POS model that formerly shipped with Fusion is now available only from the Lucidworks official Docker to be deployed as a Seldon model.

The new Seldon model is compatible with Fusion 5.1+ and existing NLP Annotator stages.

See these instructions for deploying the new model.

The new Trending Recommender job analyzes signals to measure customer engagement over time. Use this job to identify spikes in popularity for specific documents or products, then display those items to your users or analyze the trends for business purposes. You can configure any time window, such as daily, weekly, or monthly.

For detailed steps to configure this job, see Identify Trending Documents or Products.
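
The idea of spotting a spike within a configurable time window can be sketched in Python as follows. This is not the job's actual algorithm: the `trending` function, its `history` and `min_ratio` parameters, and the baseline floor of 1.0 are all assumptions chosen for illustration.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta


def trending(signals, window, now, history=3, min_ratio=2.0):
    """Return (doc_id, ratio) pairs whose latest-window count spiked.

    signals: iterable of (doc_id, timestamp) engagement events.
    window: timedelta giving the window size (a day, a week, ...).
    now: end of the most recent window.
    history: number of earlier windows averaged into the baseline.
    min_ratio: minimum latest/baseline ratio to count as trending.
    """
    counts = defaultdict(Counter)  # doc_id -> {window index: event count}
    for doc_id, ts in signals:
        if ts > now:
            continue  # ignore events after the end of the latest window
        # Index 0 is the latest window, 1 the one before it, and so on.
        counts[doc_id][(now - ts) // window] += 1

    results = []
    for doc_id, per_window in counts.items():
        latest = per_window[0]
        average = sum(per_window[i] for i in range(1, history + 1)) / history
        baseline = max(average, 1.0)  # assumed floor; avoids division by zero
        ratio = latest / baseline
        if ratio >= min_ratio:
            results.append((doc_id, ratio))
    return sorted(results, key=lambda pair: pair[1], reverse=True)


# Hypothetical signals: "a" spikes today, "b" is steady.
now = datetime(2020, 11, 18)
day = timedelta(days=1)
signals = (
    [("a", now - timedelta(hours=2))] * 6
    + [("a", now - day - timedelta(hours=2)), ("a", now - 2 * day - timedelta(hours=2))]
    + [("b", now - n * day - timedelta(hours=2)) for n in range(4) for _ in range(3)]
)
spikes = trending(signals, window=day, now=now)
print(spikes)
```

Passing `timedelta(weeks=1)` as `window` switches the same comparison to weekly windows.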

Build Training Data job

The new Build Training Data job constructs the training data required for query-time classification, that is, predicting the categories most likely to satisfy a query.

Query-time classification workflow

For detailed configuration steps, see Classify New Queries.
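
Conceptually, training data for query-time classification pairs each query with the categories users engaged with. The Python sketch below shows one way such pairs could be aggregated; the input shape, the `min_count` threshold, and the normalization step are assumptions for illustration and do not describe the job's real configuration.

```python
from collections import Counter


def build_training_data(signals, min_count=2):
    """Aggregate (query, category) signals into labelled training rows.

    signals: iterable of (query, category) pairs, one per engagement event.
    Returns sorted (query, category, weight) tuples for pairs seen at
    least min_count times.
    """
    pair_counts = Counter()
    for query, category in signals:
        normalized = " ".join(query.lower().split())  # fold case and whitespace
        pair_counts[(normalized, category)] += 1
    return sorted(
        (query, category, count)
        for (query, category), count in pair_counts.items()
        if count >= min_count
    )


# Hypothetical signals, for illustration only.
signals = [
    ("Red Shoes", "footwear"),
    ("red  shoes", "footwear"),
    ("red shoes", "apparel"),
    ("laptop", "computers"),
    ("Laptop", "computers"),
    ("laptop", "computers"),
]
rows = build_training_data(signals)
print(rows)
```

Dropping rare pairs (here, the single "apparel" event) keeps noisy clicks out of the training set; the right threshold depends on signal volume.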

Connectors

Remote connector support

Remote connector support returns in Fusion 5.3.0. See Use a Remote Connector with Pulsar Proxy for more information.

Windows Share SMB 2/3 (V2)

The Windows Share connector can access content in a Windows Share or in a Server Message Block (SMB 2 and 3)/Common Internet File System (CIFS) filesystem. It is available for Fusion Server 5.3 and later.

For more information, see the Windows Share SMB 2/3 connector reference documentation.

Google Drive (V2)

The Google Drive connector is used to index the documents in a Google Drive account.

For more information, see the Google Drive reference documentation.

JDBC (V2)

For more information, see the JDBC connector reference documentation.

Amazon AWS S3 (V2)

The S3 connector crawls items in a single bucket. You must specify the bucket name and AWS region in which that bucket is located.

You may crawl specific items in a bucket. If no items are specified, the entire bucket will be crawled.

For more information, see the Amazon AWS S3 connector reference documentation.

Predictive Merchandiser

JSON Blob rule type

A new rule type, JSON Blob, is added. This rule type allows you to pass arbitrary JSON blobs to your frontend when a rule fires:

JSON Blob Rule Type

Product Detail Page template

A new template, Product Detail Page, is added. This template allows you to configure what details and zones are displayed when a user views a product’s details. To configure this template, navigate to Templates and edit Product Detail Page. You can also configure this template visually in the Merchandiser screen by hovering over a product, clicking the Product Detail Page button, clicking the Start Task button, and clicking the Edit Template button.

Improvements

Fusion

  • The Field Mapping index pipeline stage is redesigned to be easier to use. The stage’s functions are now described more accurately and completely.

    Field Mapping pipeline stage improvement

    Tip
    Field mapping rules are now applied in the order in which they are defined within each operation type. See the Field Mapping index pipeline stage configuration page for more details.
  • A new option is added to enable ephemeral users on the following realms: OpenID Connect, SSO Trusted HTTP, JWT, LDAP, and SAML. This improves performance for Fusion deployments with a large number of user accounts. To enable, set ephemeralUsers to true.

    Note
    Ephemeral users are not created in ZooKeeper, so the autoCreateUsers setting has no effect when ephemeralUsers is enabled. The information needed to create a user must be taken from the realm configuration and from the IdP.
  • V2 connector configurations are now persisted with default values.

  • Improved error messages for Fusion SQL.

  • Index and query stage schema is now returned alphabetically from /api/index-stages/schema.

  • String templates are now supported in REST stages and collection parameters for the rules and text tagger stages.
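
To see why the order of field mapping rules matters, consider the Python sketch below. It models a single operation type (renaming fields) with a hypothetical `(source, target)` rule format; it is not the stage's implementation, and the field names are made up.

```python
def apply_translations(doc, rules):
    """Apply (source, target) rename rules to a copy of `doc`, in the order given."""
    result = dict(doc)
    for source, target in rules:
        if source in result:
            result[target] = result.pop(source)
    return result


doc = {"title_s": "Release notes"}

# The second rule sees the output of the first, so the value ends up in headline_t.
forward = apply_translations(doc, [("title_s", "title_t"), ("title_t", "headline_t")])

# With the rules swapped, title_t does not exist yet when its rule runs.
reverse = apply_translations(doc, [("title_t", "headline_t"), ("title_s", "title_t")])

print(forward)
print(reverse)
```

The same two rules produce different documents depending on their order, which is why a predictable, definition-order evaluation is useful.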

Fusion UI

In addition to minor improvements, several notable improvements are made to the Fusion UI.

Fusion DSL integration

Fusion Domain Specific Language (DSL) provides expected search results as a JSON response in a way that reduces search query complexity for the user.

In Fusion 5.3.0, DSL is integrated with the UI:

  • Query Pipelines - Select which search modes to use with your query pipeline: DSL, Legacy (Solr), or All (both). Your selection is reflected in the Query Workbench.

    Search mode selector

  • Query Workbench - While entering queries in the Query Workbench, click the active search mode to toggle between DSL and Solr.

    Toggle QWB Search Mode

    Note
    This option is only available if your pipeline supports All search modes.

    Change your View As setting to JSON to see how the response results change.

    View as JSON

  • Query Pipeline Stages - Not all query pipeline stages support DSL. If you add an incompatible stage while in DSL mode, the stage will be flagged accordingly:

    Incompatible DSL stage

    However, a new stage is added to extend DSL support to incompatible stages. Click Add a Stage and select the DSL to Legacy Parameters stage. When loaded before the incompatible stage(s), this stage converts the DSL request to legacy parameters so it can still be processed by the incompatible stage.

    DSL to Legacy Parameters stage

See the DSL documentation for more details on what you can do.

Minimap

A minimap is added to the bottom of the Fusion UI, allowing you to quickly move between open panels:

  • Click one of the items to move your focus to the corresponding panel

  • Close panels by hovering over an item and clicking the X that appears

  • Close all panels by clicking the Close all button.

    Tip
    Panels with unsaved changes will not be closed.

Toggle Sampling in the Index Workbench

The Index Workbench now features a Sampling toggle at the bottom of the results.

  • When set to On, the results update with every change you make: Sampling on

  • When set to Off, the results no longer update, but a Refresh button appears to manually update the results: Sampling off

Pin result fields

You can now pin result fields in the Index Workbench by hovering over a field and clicking the Pin icon that appears.

Pinned result fields

Pinned results are moved to the top of the document fields list, allowing you to quickly find the fields most relevant to the tweaks you’re working on.

Fusion AI

  • You can now view, edit, publish, and delete unpublished rules created by other users in the Rules Editor.

  • AI jobs can now read from and write to Google Cloud Storage (GCS) directly, so customer engagement data (signals) no longer has to be stored in Solr for the jobs to work. This primarily avoids situations where Solr cannot keep up with write requests from the jobs. GCS handles high-speed, high-volume writes efficiently.

Fusion SQL

Enhancements in the Fusion SQL integration for notebooks and visualization platforms:

  • Jupyter - Learn about connectivity set-up and usage examples of Fusion SQL with Jupyter notebooks in Use Jupyter with Fusion SQL.

  • Apache Superset - Learn about connectivity set-up and usage of Fusion SQL with the Superset BI platform in Use Superset with Fusion SQL.

Predictive Merchandiser

  • Predictive Merchandiser templates now support staging and publishing actions similar to rules. This allows you to draft templates and zones without affecting the production experience.

  • Predictive Merchandiser zones add a new configuration option, UI Treatment. This field allows for arbitrary text in the response when rendering this zone. This text is used by your frontend, when configured to do so, to determine how the zone is displayed. By using this field, you can use the same zone more than once on the same page but display the results differently.

    UI Treatment

  • Predictive Merchandiser zones add a new configuration option, Omit Filters From Query. When set to true, this zone ignores all fq filter parameters. For example, if you have a category landing page that is filtering results by category, this configuration option allows you to display items outside of that category.

    Omit Filters from Query

  • When attempting to navigate away from an unsaved template, a confirmation modal is now displayed to help prevent the loss of progress.

  • Miscellaneous UI improvements.
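
The effect of the Omit Filters From Query option described above can be pictured as dropping every fq parameter before the zone's query runs. The Python sketch below shows that transformation on a plain query string; the `omit_filters` helper and the sample parameters are hypothetical, and how Fusion applies the option internally is not stated here.

```python
from urllib.parse import parse_qsl, urlencode


def omit_filters(query_string):
    """Return the query string with every fq parameter removed."""
    params = parse_qsl(query_string, keep_blank_values=True)
    kept = [(name, value) for name, value in params if name != "fq"]
    return urlencode(kept)


# Hypothetical category-page query with two filters.
zone_query = omit_filters("q=boots&fq=category:footwear&fq=in_stock:true&rows=4")
print(zone_query)
```

All other parameters pass through unchanged, so the zone can show items from outside the filtered category while the rest of the page stays filtered.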

Bug fixes

Fusion

  • You can now easily select text in the log view without inadvertently collapsing the row.

  • You can no longer save invalid field mapping configurations. Errors are now reported for invalid configurations.

  • Fixed a bug that prevented jobs from saving multiple times.

  • Fixed incorrect error messages in the Blob Store.

  • Reduced the number of requests sent to the Kubernetes API service to reduce the chance of exceeding the Kubernetes call limit.

  • Fixed an issue with the Query Workbench that sometimes caused a Java heap space error when a query returned a large number of documents (over 1 million).

Fusion AI

  • Fixed a bug in the Rules Editor that prevented the pagination from working as expected.

Connectors

  • Fixed a bug that sometimes prevented V2 connectors that use multiple phases from fetching documents in later phases.

Predictive Merchandiser

  • A bug is fixed that resulted in different filter behaviors between the API and the UI. Filters now apply to all zones.

  • Fixed a bug that occasionally caused all rules to show as pinned rules if multiple zones shared a pinned rule.

  • Fixed a bug that prevented rules with disallowed symbols from showing when the Show Applied Rules button was clicked in the Product Details view. Symbols are now encoded.

  • Fixed a bug that prevented product groups from opening when facets were selected.

  • Fixed an issue with Predictive Merchandiser templates that resulted in zones and templates remaining available after the associated app was deleted and recreated.
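
The fix above for rules with disallowed symbols says only that symbols are now encoded; it does not say which encoding is used. Assuming percent-encoding (an assumption made for this example), the Python sketch below shows how a rule name containing reserved characters can be made safe to pass in a URL and recovered unchanged. The rule name is made up.

```python
from urllib.parse import quote, unquote

# Hypothetical rule name containing characters that are reserved in URLs.
rule_name = "Summer sale: 50% off & more"

# safe="" makes quote() encode every reserved character, including "/".
encoded = quote(rule_name, safe="")
decoded = unquote(encoded)

print(encoded)
print(decoded)
```

Because decoding restores the original string exactly, the name can round-trip through a request without being truncated at characters such as & or %.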

Other changes

  • The sql-jdbc service, which serves the stateful JDBC connections, is now restricted to a single pod.

Notable documentation updates

Connectors download and installation

The connectors documentation is updated to address user feedback and clarify how connectors are downloaded and installed in each version of Fusion. The changes include:

Business Rules

The Business Rules documentation is totally revamped for all versions of the Fusion docs. The changes include:

API reference docs

The REST API reference docs are updated to be more accurate and make topics easier to discover. Changes include:

  • A reorganized index page that makes API topics easier to find.

  • The sidenav is reorganized to reflect the changes.

  • Descriptions are added and updated for each API.

  • The Managed Search API references are moved to the Managed Search documentation.

To get started reading the improved API documentation, see REST API Reference.

Deprecations

  • The JDBC/SQL V1 connector is deprecated in 5.3 and will be removed in 5.4. Use the JDBC V2 connector instead.

Removals

  • The System > Solr Clusters panel is removed from the Fusion UI.

Known issues

  • When adding gRPC authentication headers, the Machine Learning Index Stage fails to retrieve the ServiceAccount JWT value while checking SecurityContextHolder.getContext().getAuthentication() if ServiceAccount is set in thread context. This results in authentication errors when processing messages coming from Pulsar Subscriptions.