Fusion Server 4.2.0 Release Notes
Release date: 28 February 2019
Component versions:
Solr 7.5
ZooKeeper 3.4.13
Spark 2.3.1
Jetty 9.4.11.v20180605
Ignite 2.3.0
More information about support dates can be found at Lucidworks Fusion Product Lifecycle.
See also the App Studio 4.2.0 release notes.
New features
-
Dynamic, real-time system visualizations with the new DevOps Center
The DevOps Center generates real-time dashboards for visualizing metrics throughout your Fusion system, plus a log viewer where you can explore events and focus on specific event types. You can also export metrics and events for any timeframe, in CSV format, for external analysis.
-
Faster query pipeline performance with asynchronous stage processing
Query pipelines can now be forked for parallel processing so that faster stages can proceed while requests to external resources wait for their responses.
In this release, you can enable asynchronous execution in the following query stages:
A new Merge Async Results stage joins the forked pipeline before the final Solr stage.
For complete instructions, see Query Pipelines.
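The fork-and-merge idea can be illustrated with a short Python sketch. This is only a conceptual illustration and not Fusion's implementation; the stage functions and their names are hypothetical.

```python
import asyncio


async def slow_external_lookup(query: str) -> dict:
    # Hypothetical stand-in for a stage that waits on an external resource.
    await asyncio.sleep(0.05)
    return {"boost": f"category:{query}"}


async def fast_local_stage(request: dict) -> dict:
    # Hypothetical stand-in for a faster stage that needs no external call.
    await asyncio.sleep(0.01)
    return {**request, "rows": 10}


async def run_pipeline(query: str) -> dict:
    request = {"q": query}
    # Fork: schedule the slow stage without waiting for its response.
    pending = asyncio.create_task(slow_external_lookup(query))
    # The faster stage proceeds without waiting for the external response.
    request = await fast_local_stage(request)
    # Merge: join the forked result before the final (search) step.
    request.update(await pending)
    return request


result = asyncio.run(run_pipeline("shoes"))
print(result)
```

Because the two stages overlap, the total time is close to that of the slower stage instead of the sum of both.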
-
Sitecore support
The new Sitecore connector provides full crawl and incremental crawl support for versions 8.x and 9.x of the popular Sitecore CMS, indexing both document content and metadata.
-
A new Dropbox connector supports the latest Dropbox API.
-
This is a simple parser for Solr’s various update formats (XML, CSV, JSON, and javaBin).
-
New query pipeline stages
These new query pipeline stages support new Fusion AI 4.2.0 features as well as the asynchronous pipeline processing described above.
-
Query rewriting stages:
-
Response rewriting stages:
-
Merge Async Results stage for asynchronous processing (see above).
The default query pipeline now includes the new Text Tagger, Apply Rules, and Modify Response with Rules stages.
-
-
New collections
-
query_rewrite_staging
Rules and certain Spark job results are written to this collection temporarily. See Query Rewriting for details.
-
query_rewrite
Query pipelines read rules and job results from this collection in order to perform query rewriting. Docs are migrated to this collection from the query_rewrite_staging collection.
-
job_reports
Job histories are now written to this collection.
-
_user_prefs
This collection stores App Studio social data, such as user tags, bookmarks, and so on.
-
system_monitor
The new system metrics used for the DevOps Center are written to this collection. These new metrics replace the metrics previously written to the system_metrics collection.
-
-
New REST APIs and endpoints
-
New Custom Rules API
-
The Webapps API is deprecated in favor of the new Webapps Appkit API.
-
/index-pipelines/{id}/collections/{collection}/indexMultiple submits a set of documents to an index pipeline.
-
/spark/reports/{job} gets the job results from a specific job.
-
/webapps/{id}/war/manifest gets the .war file manifest for the specified Web app.
-
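As a rough sketch of how the path templates listed above might be filled in from a client script, the following Python helper substitutes and URL-encodes the path parameters. The base URL, pipeline ID, and collection name are assumptions for illustration only; use the values for your own deployment.

```python
from urllib.parse import quote

# Path templates taken from the list of new endpoints above.
INDEX_MULTIPLE = "/index-pipelines/{id}/collections/{collection}/indexMultiple"
SPARK_REPORT = "/spark/reports/{job}"
WAR_MANIFEST = "/webapps/{id}/war/manifest"

# Assumption: a placeholder base URL; the real one depends on your deployment.
API_BASE = "http://localhost:8764/api"


def endpoint_url(template: str, base: str = API_BASE, **params: str) -> str:
    """Fill a path template, URL-encoding each parameter value."""
    encoded = {name: quote(value, safe="") for name, value in params.items()}
    return base.rstrip("/") + template.format(**encoded)


# Hypothetical pipeline and collection names.
print(endpoint_url(INDEX_MULTIPLE, id="my-pipeline", collection="my collection"))
```

Encoding each parameter separately keeps characters such as spaces or slashes in an ID from changing the path structure.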
Improvements
-
Broader access to the Object Explorer
A new Explore button appears in configuration panels for Fusion objects that can be viewed in the Object Explorer. Click the button to see the object’s relationships to other Fusion objects.
-
Use external management tools to control Fusion configurations
Many of the values in conf/fusion.properties can now be set using environment variables, enabling you to set them using systemd, Docker, Kubernetes, and so on. Default values are also provided. For example, in api.port = ${API_PORT:-8765}, the default value is 8765 unless API_PORT is defined.
Note that ZOOKEEPER_PORT cannot be used, and the value of zookeeper.port in fusion.properties must be the same as the value of clientPort in conf/zookeeper/zoo.cfg.
-
Improved connectors functionality
-
Improved incremental crawl performance across all connectors with MapDB upgrade.
-
Kerberos support in the Jive connector.
-
Web connector now supports website authentication credentials files in container path.
-
JavaScript evaluation for crawling websites.
-
SharePoint V1 connector now has bulk start link URL list import.
-
Confluence connector now supports API token based authentication.
-
The SMB2/3 Connector has improved support for crawling distributed file systems.
-
The SMB2/3 Connector now saves the original file path and redirected file paths when crawling distributed file systems.
-
JDBC V1 connector has improved settings for managing index commits during crawls.
-
The OneDrive connector can now crawl user-specific drives.
-
Some connectors now give you the option to index metadata about documents that were discarded because they were too large or too small, using the new f.index_items_discarded/Index discarded document metadata parameter. A new field, _lw_skipped_reason_s, indicates the reason that the document was skipped during indexing. The new key is available for these connectors:
-
Web: default = false
-
Sharepoint: default = true
-
Sharepoint Online: default = true
-
SMB2: default = false
-
Box: default = false
-
Google Drive: default = false
-
Dropbox: default = false
-
Local Filesystem: default = false
-
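One way to review discarded documents after a crawl is to query Solr for documents where the _lw_skipped_reason_s field is set. The sketch below only builds the query URL. The Solr host and port, the collection name, and the presence of an id field are assumptions; adjust them for your deployment.

```python
from urllib.parse import urlencode

# Assumptions: Solr address of a default local install and a
# hypothetical collection name.
SOLR_BASE = "http://localhost:8983/solr"
COLLECTION = "my_collection"


def skipped_docs_url(collection: str = COLLECTION, rows: int = 10) -> str:
    """Build a Solr select URL matching documents that have a skip reason."""
    params = {
        "q": "_lw_skipped_reason_s:*",    # any document with the field set
        "fl": "id,_lw_skipped_reason_s",  # return only the ID and the reason
        "rows": rows,
    }
    return f"{SOLR_BASE}/{collection}/select?{urlencode(params)}"


print(skipped_docs_url())
```

The resulting URL can be opened in a browser or passed to any HTTP client.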
-
Several connectors have a new Enable Plugin Parsing/pluginParsing parameter. When it is enabled, the connector parses raw content before streaming it to the index pipeline. The following connectors support this parameter:
-
Local Filesystem
-
OneDrive
-
Sitecore
-
Windows Share (SMB 2/3)
-
-
-
The dashboards framework has been upgraded to Banana 1.6.23. See the Banana release notes.
-
For tighter security, CORS is now disallowed by default. You can enable it, if needed, by editing the proxy.corsAllowOrigin property in conf/fusion.properties.
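The environment-variable defaults in conf/fusion.properties (for example, api.port = ${API_PORT:-8765}) and the requirement that zookeeper.port match clientPort can be sketched in Python as follows. This is a minimal illustration and not Fusion's own parser: treating an empty variable like an unset one mirrors shell behavior and is an assumption here, and the port value 9983 is only a sample.

```python
import re

# Matches placeholders of the form ${NAME:-default}.
PLACEHOLDER = re.compile(r"\$\{(\w+):-([^}]*)\}")


def resolve(value: str, env: dict) -> str:
    """Replace each ${NAME:-default} with env[NAME] if set and non-empty, else the default."""
    return PLACEHOLDER.sub(lambda m: env.get(m.group(1)) or m.group(2), value)


def parse_properties(text: str) -> dict:
    """Parse simple 'key = value' lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            props[key.strip()] = value.strip()
    return props


# Sample file contents; in practice, read the two files from disk.
fusion = parse_properties("api.port = ${API_PORT:-8765}\nzookeeper.port = 9983")
zoo = parse_properties("clientPort=9983")

print(resolve(fusion["api.port"], {}))                    # 8765
print(resolve(fusion["api.port"], {"API_PORT": "9000"}))  # 9000
print(fusion["zookeeper.port"] == zoo["clientPort"])      # True
```

To resolve values against the real environment, pass os.environ as the env argument.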
Other changes
-
The Synonyms UI is no longer available. See the new Synonym Detection feature, available with a Fusion AI license.
-
The synonyms collection has been replaced with the new query_rewrite collection.
-
The Recommendations API is deprecated and will be removed in a future release.
Known issues
-
When under load, the Fusion proxy service can occasionally become stuck, causing user authentication to fail. This is the result of the proxy InputStream failing to close properly.
An upgrade to Fusion 4.2.4 is required to fix this issue. See Upgrade Fusion.
-
Connectors
-
Repeatedly stopping a V2 datasource job and clearing the datasource may result in an out-of-memory condition. To recover from this state, restart the connectors-rpc process:
fusion/4.2.0/bin/connectors-rpc restart
-
In a cluster environment where multiple connectors-rpc nodes are running, a newly installed V2 connector (OneDrive connectors) may install and run on only one node instead of propagating to all nodes. If this happens, restart all nodes and then re-install the connector.
-
Although the Web V1 connector’s default value for crawlDBType/Crawl database type is "in-memory", this can cause an out-of-memory condition when crawling large sites. Change the value to "on-disk".
-
If new items are not picked up when recrawling a Box.com folder, delete the records in the system_box_distributed_crawl collection, like this:
curl "http://localhost:8983/solr/system_box_distributed_crawl/update?commit=true" -H "Content-Type: text/xml" --data-binary '<delete><query>*:*</query></delete>'
Then run the datasource job again.
-
The Local Filesystem connector may slow down while crawling an empty folder.
-
With the SMB2/3 connector, if multiple start links point to some of the same data, then the data is indexed multiple times. Remove redundant start links and use only the "parent" link.
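For a long list of start links, the redundant entries can be found programmatically. The following Python sketch keeps only the "parent" links and drops any link located under another one. It compares links as plain strings, which assumes consistent casing and formatting; the smb:// URLs are hypothetical examples.

```python
def remove_nested_start_links(links: list[str]) -> list[str]:
    """Keep only links that are not located under another link in the list."""
    # Normalize so that "a/b" and "a/b/" compare equal; shortest links first.
    normalized = sorted({link.rstrip("/") for link in links},
                        key=lambda s: (len(s), s))
    kept: list[str] = []
    for link in normalized:
        # A link is redundant if an already-kept link is one of its parent folders.
        if not any(link.startswith(parent + "/") for parent in kept):
            kept.append(link)
    return kept


start_links = [
    "smb://host/share/docs/",
    "smb://host/share/docs/2019",     # nested under .../docs, so redundant
    "smb://host/share/other",
    "smb://host/share/docs-archive",  # a sibling, not a child of .../docs
]
print(remove_nested_start_links(start_links))
```

Appending "/" to the parent before the prefix check prevents a sibling such as docs-archive from being mistaken for a child of docs.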
-
The FS (V2) connector does not save item metadata in the crawldb. Recrawls do not work as expected (some items will be missing or will not be evaluated on the next crawl).
-
When re-installing a connector using the same plugin ID but a different file name, a deadlock condition may occur, resulting in a timeout error.
-
-
Fusion UI
-
In the Query Workbench, some document fields may not appear in the dropdown list of fields for faceting. To work around this, enter the name of the field in the text box and press Return. If the field exists in the dataset, it will be added as a facet even though it does not appear in the list.
-
When using Compare mode in the Query Workbench, configuring the list of display fields may change the display in both panels instead of only the working panel.
-
After logging out using Chrome version 73.0.3683.75, the login page may not automatically appear. To work around this, do a hard refresh (by holding down the CTRL key while clicking the Reload button).
-
In the Query Rewriting UI, creating a query rewrite with the same name as an existing query rewrite deletes the existing one. Be sure to create new query rewrites using unique names.
-
Jobs may display incorrect information about their current status or the time at which they last ran. To work around this, use the Jobs API to verify a job’s status and history.
-
After you upload a .war file to the App Studio interface, the View Published UI button disappears from the App Studio configuration panel. To restore this button, click Edit, then click Return to Fusion.
-
New index pipelines may not appear in the Index Workbench until you do a hard refresh (by holding down the CTRL key while clicking the Reload button).
-
In the Query Rewriting UI, after selecting multiple business rules where some rules have tags, adding more tags to the selected rules deletes their existing tags. To work around this, add tags to individual rules instead of adding them in bulk.
-