Fusion Server 4.2.0 Release Notes

Release date: 28 February 2019

Component versions:

Solr 7.5

ZooKeeper 3.4.13

Spark 2.3.1

Jetty 9.4.11.v20180605

Ignite 2.3.0

New features

Improvements

  • Broader access to the Object Explorer

    A new Explore button appears in configuration panels for Fusion objects that can be viewed in the Object Explorer. Click the button to see the object’s relationships to other Fusion objects.

  • Use external management tools to control Fusion configurations

    Many of the values in conf/fusion.properties can now be set using environment variables, enabling you to set them using systemd, Docker, Kubernetes, and so on. Default values are also provided. For example, in api.port = ${API_PORT:-8765}, the default value is 8765 unless API_PORT is defined.

    Note that ZOOKEEPER_PORT cannot be used, and the value of zookeeper.port in fusion.properties must be the same as the value of clientPort in conf/zookeeper/zoo.cfg.

  • Improved connectors functionality

    • Improved incremental crawl performance across all connectors with a MapDB upgrade.

    • Kerberos support in the Web connector and the Jive connector.

    • Web connector improvements to character encoding during JavaScript evaluation.

    • Web connector now supports website authentication credentials files in container path.

    • Web connector now has an option to use JavaScript evaluation for website authentication without using JavaScript evaluation for crawling websites.

    • Web and SharePoint connectors now have bulk start link URL list import.

    • Confluence connector now supports API token based authentication.

    • The SMB2/3 Connector has improved support for crawling distributed file systems.

    • The SMB2/3 Connector now saves the original file path and redirected file paths when crawling distributed file systems.

    • JDBC connector has improved settings for managing index commits during crawls.

    • The OneDrive connector can now crawl user-specific drives.

    • Some connectors now give you the option to index metadata about documents that were discarded because they were too large or too small. To do so, use the new f.index_items_discarded/Index discarded document metadata parameter. A new field, _lw_skipped_reason_s, indicates the reason that the document was skipped during indexing. The new parameter is available for these connectors:

      • Web: default = false

      • SharePoint: default = true

      • SharePoint Online: default = true

      • SMB2: default = false

      • Box: default = false

      • Google Drive: default = false

      • Dropbox: default = false

      • Local Filesystem: default = false

    • Several connectors have a new Enable Plugin Parsing/pluginParsing parameter. When it is enabled, the connector parses raw content before streaming it to the index pipeline. The following connectors support this parameter:

      • Local Filesystem

      • OneDrive

      • Sitecore

      • Windows Share (SMB 2/3)

  • For tighter security, CORS is now disallowed by default. You can enable it, if needed, by editing the proxy.corsAllowOrigin property in conf/fusion.properties.
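The ${API_PORT:-8765} syntax described above for conf/fusion.properties looks the same as the shell's default-value parameter expansion. The sketch below only demonstrates that shell expansion; it does not read fusion.properties. The function name resolve_api_port is hypothetical. It is also an assumption that Fusion treats an empty variable the same way the shell does, which is to fall back to the default.

```shell
#!/bin/sh
# Illustrative only: mimics how a value such as
#   api.port = ${API_PORT:-8765}
# resolves, using the shell's own ${VAR:-default} expansion.
# resolve_api_port is a hypothetical helper, not part of Fusion.

resolve_api_port() {
  # Prints API_PORT if it is set and non-empty, otherwise the default 8765.
  printf '%s\n' "${API_PORT:-8765}"
}

unset API_PORT
resolve_api_port    # prints 8765 (default, because API_PORT is not defined)

API_PORT=9000
resolve_api_port    # prints 9000 (value taken from the environment variable)
```

With systemd, Docker, or Kubernetes, the variable would be supplied through that tool's usual environment mechanism before Fusion starts. ZOOKEEPER_PORT remains the exception noted above.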

Other changes

  • The Synonyms UI is no longer available. See the new Synonym Detection feature, available with a Fusion AI license.

  • The synonyms collection has been replaced with the new query_rewrite collection.

Known issues

  • Connectors

    • Repeatedly stopping a V2 datasource job and clearing the datasource may result in an out-of-memory condition. To recover from this state, restart the connectors-rpc process:

      fusion/4.2.0/bin/connectors-rpc restart

    • In a cluster environment where multiple connectors-rpc nodes are running, a newly installed V2 connector (Local Filesystem, Sitecore, or OneDrive) may install and run on only one node instead of propagating to all nodes. If this happens, restart all nodes and then re-install the connector.

    • The Web connector’s default value for crawlDBType/Crawl database type is "in-memory", which can cause an out-of-memory condition when crawling large sites. To avoid this, change the value to "on-disk".

    • If new items are not picked up when re-crawling a Box.com folder, delete the records in the system_box_distributed_crawl collection, like this:

      curl "http://localhost:8983/solr/system_box_distributed_crawl/update?commit=true" -H "Content-Type: text/xml" --data-binary '<delete><query>*:*</query></delete>'

      Then run the datasource job again.

    • The Local Filesystem connector may slow down while crawling an empty folder.

    • With the SMB2/3 connector, if multiple start links point to some of the same data, then the data is indexed multiple times. Remove redundant start links and use only the "parent" link.

    • The FS (V2) connector does not save item metadata in the CrawlDB, so recrawls do not work as expected: some items may be missing or may not be evaluated on the next crawl.

    • When re-installing a connector using the same plugin ID but a different file name, a deadlock condition may occur, resulting in a timeout error.
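For the SMB2/3 issue above, where overlapping start links cause data to be indexed more than once, a small script can help spot redundant links before you edit the datasource. The following is a minimal sketch, not a Fusion tool. The function name prune_nested_links and the smb://host/... link forms are hypothetical, and matching is byte-for-byte, so links that differ only in letter case are not detected as overlapping.

```shell
#!/bin/sh
# Hypothetical helper (not part of Fusion): reads start links, one per line,
# on stdin and prints only those that are not nested under another listed link.
prune_nested_links() {
  # 1. Drop empty lines and make every link end in exactly one "/", so that
  #    "share" is not mistaken for a parent of "share-2".
  # 2. Sort bytewise; each descendant then follows its parent contiguously.
  # 3. Print a link only if it does not start with the last printed link.
  sed -e '/^$/d' -e 's:/*$:/:' | LC_ALL=C sort -u | awk '
    parent != "" && index($0, parent) == 1 { next }  # nested under a kept link
    { parent = $0; print }
  '
}

# Example: only the "parent" links survive.
printf '%s\n' \
  'smb://host/share/docs' \
  'smb://host/share' \
  'smb://host/share-2' | prune_nested_links
# prints:
#   smb://host/share-2/
#   smb://host/share/
```

The output is normalized with a trailing slash and is meant for review. Compare it against the datasource's start links and remove the redundant entries by hand.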

  • Fusion UI

    • In the Query Workbench, some document fields may not appear in the dropdown list of fields for faceting. To work around this, enter the name of the field in the text box and press Return. If the field exists in the dataset, it will be added as a facet even though it does not appear in the list.

    • When using Compare mode in the Query Workbench, configuring the list of display fields may change the display in both panels instead of only the working panel.

    • After logging out using Chrome version 73.0.3683.75, the login page may not automatically appear. To work around this, do a hard refresh (by holding down the CTRL key while clicking the Reload button).

    • In the Query Rewriting UI, creating a query rewrite with the same name as an existing query rewrite deletes the existing one. Be sure to create new query rewrites using unique names.

    • Jobs may display incorrect information about their current status or the time at which they last ran. To work around this, use the Jobs API to verify a job’s status and history.

    • After you upload a .war file to the App Studio interface, the View Published UI button disappears from the App Studio configuration panel. To restore this button, click Edit, then click Return to Fusion.

    • New index pipelines may not appear in the Index Workbench until you do a hard refresh (by holding down the CTRL key while clicking the Reload button).

    • In the Query Rewriting UI, after selecting multiple business rules where some rules have tags, adding more tags to the selected rules deletes their existing tags. To work around this, add tags to individual rules instead of adding them in bulk.