Fusion Server 5.0.0 Release Notes

Release date: 11 September 2019

Component versions:

Solr 8.2

ZooKeeper 3.4.13

Spark 2.3.2

Lucidworks Fusion 5.0 lets customers easily deploy AI-powered data discovery and search applications in a modern, containerized architecture with built-in support for rapidly creating interactive dashboards and modern enterprise applications. Customers can leverage existing models and workflows, or create and deploy new ones quickly using popular tools like Python ML, TensorFlow, scikit-learn, and spaCy.

Kubernetes

Fusion 5.0 introduces Kubernetes (k8s), an open-source system for automating deployment, scaling, and management of containerized applications, as the recommended Fusion deployment model.

Enforcing best practices

With Kubernetes, Fusion users have a common language to declare how Fusion should be installed, configured, and maintained in production. Instead of focusing on internal implementation details, operations engineers will manage Fusion by monitoring how it uses native Kubernetes resources.

Reducing cost of ownership

Fusion with Kubernetes allows operations engineers to run Fusion, reducing the need for specialized training and internal experts.

Simplifying maintenance and upgrades

Kubernetes rolling updates make zero-downtime upgrades simple, minimizing the risks associated with changing a live cluster.

Spring Boot

Traditionally, the cross-cutting concerns of Fusion’s various microservices have been handled with a chassis built and maintained by Lucidworks. Now, Fusion 5.0 employs Spring Boot as its microservice chassis.

Whether Fusion 5.0 is being deployed in a Kubernetes, non-Kubernetes, or on-premises environment, Spring Boot allows Fusion to be deployed using the same underlying framework.

New features

  • Fusion 5.0 expands the capabilities of the Fusion SQL Service query syntax significantly. Fusion SQL now offers:

    • Phrase queries: x = 'Lucidworks Fusion'

    • Sloppy phrase queries: x = '("Fusion Server"~1)'

    • Non-phrase queries: x = '(Lucidworks Fusion)'

    • LIKE operator queries: x LIKE 'Luci%'

  • Fusion 5.0 introduces multiple usability improvements in the Fusion UI. Notable improvements include:

    • Long lists now load incrementally with a "Load More" button.

    • App microservices now have service indicators.

    • Stacktrace logs can now be downloaded directly from Fusion UI.

  • Fusion SQL query operations no longer sort results by the documents' Solr score by default.

    • By default, query operations now return random samples of the matching results. Random sampling can also be utilized with rand(), for example: select * from tableA order by rand() desc limit 5000.

      This change makes it easier to use the jdbc Solr streaming expression to query Fusion SQL. Results from these queries can then be analyzed with Solr math expressions.

    • A User Defined Function (UDF), score(), has been introduced to allow users to sort by a document’s Solr score. This was the previous default behavior. The UDF must be aliased to work correctly, for example: SELECT id, score() as doc_score.

  • The SharePoint Online connector now supports app-only authentication, allowing you to associate searches with an app client id and reduce rate limiting.
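The Fusion SQL predicate forms listed under New features differ only in how the string literal is written. As a minimal sketch, the hypothetical helpers below (they are not part of Fusion) build each form as a string. The field name x and the values are taken from the examples above; the helpers do not escape quotes in their input, so treat them as an illustration rather than a query builder.

```javascript
// Hypothetical helpers (not part of Fusion) that build the predicate
// forms shown in the New features list. Input is not escaped.

// Phrase query: field = 'some phrase'
function phrase(field, text) {
  return field + " = '" + text + "'";
}

// Sloppy phrase query: quoted phrase plus a slop value, inside parentheses.
function sloppyPhrase(field, text, slop) {
  return field + " = '(\"" + text + "\"~" + slop + ")'";
}

// Non-phrase query: terms inside parentheses, without inner quotes.
function nonPhrase(field, terms) {
  return field + " = '(" + terms.join(" ") + ")'";
}

// LIKE operator query: a prefix followed by the % wildcard.
function likePrefix(field, prefix) {
  return field + " LIKE '" + prefix + "%'";
}

console.log(phrase("x", "Lucidworks Fusion"));
console.log(sloppyPhrase("x", "Fusion Server", 1));
console.log(nonPhrase("x", ["Lucidworks", "Fusion"]));
console.log(likePrefix("x", "Luci"));
```

Each call reproduces one of the four predicates from the list, which makes it easy to see that only the quoting and the operator change between forms.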

Improvements

  • API services can be discovered and connected with external services. For example:

    curl "http://localhost:8765/api/v1/objects/proxyApps"

  • In previous versions of Fusion, image metadata could not be parsed unless the images were parsed by enabling the Include Images option. Now, image metadata can be parsed and indexed separately from the images.

  • A new API gateway, /webapps, has been introduced for deploying WAR files. For example, if a webapp myapp.war is deployed, it is accessible at /webapps/myapp.war.

  • In the Google Drive connector, the SERVICE ACCOUNT P12 PRIVATE KEY PASSWORD field has been changed from a text field to a password field to improve security.

  • The HTML parser stage now supports jsoup selector methods and can be configured to parse HTML or other data.

  • Ignite-based security trimming can now be enabled or disabled on the Box connector.

  • Fusion 5.0 adds more options for setting the status code from the query pipeline:

    • The JS response object supports a new property, statusCode, which can be set to the desired HTTP status code (default 200):

      response.statusCode = 429;
    • Use JAX-RS exceptions to set the corresponding HTTP status code from custom JavaScript stages for query pipelines:

      var WebApplicationException = Java.type("javax.ws.rs.WebApplicationException");
      throw new WebApplicationException("Back off!", 429);
    • The REST Query index stage can now handle error responses and propagate status codes to a client:

      curl -u admin:password123 -X POST -H 'Content-type: application/json' 'http://localhost:8764/api/query-pipelines' -d '{
          "id": "rpc",
          "stages": [
              {
                  "type": "query-rpc",
                  "resultsLocation": "As Response",
                  "errorHandling": "map",
                  "params": {
                      "uri": "http://localhost:8765/404",
                      "method": "get"
                  }
              },
              {
                  "type": "solr-query"
              }
          ]
      }'
      
      curl -v -u admin:password123 'http://localhost:8764/api/query-pipelines/rpc/collections/system_logs/select?q=*:*&rows=0'
      
      < HTTP/1.1 404 Not Found
      ...
      <title>Error 404 Not Found</title>
      ...
  • Logging of killed nodes is improved by making the log message node-specific.

  • The Web Connector now offers a new property, circularRedirectsAllowed, that toggles whether to block or allow circular redirects. This can be enabled in the UI with the Allow Circular Redirects option.

  • Stopping connector jobs is now handled in a way that uses less memory.

  • Updated AbstractCrudResource.updateEntity so that a PUT request creates an object if it doesn’t already exist.

  • SMB connector query operations now allow for owner data to be retrieved as a metadata field.

  • Waiting times at the end of connector jobs are reduced.

  • Waiting times for creating new apps are reduced with changes to default jobs and schedules.

  • Forms now support advanced fields and property groups, enabled by the toggle element Advanced.

  • Fusion now isolates CookieStore values for each ChromeWorker to reduce the chance of overwriting cookies.

  • Fetch performance when running Fusion on multiple nodes is improved.
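The statusCode property described in the status code items above can be set conditionally. The following is a minimal sketch, not Fusion's own code: only response.statusCode comes from these notes, while the function name rejectEmptyQuery, the request.params shape, and the choice of 400 for an empty query are assumptions made for illustration. Plain objects stand in for the pipeline's request and response so the sketch runs outside Fusion.

```javascript
// Sketch of stage logic that sets the response status code.
// Only `response.statusCode` comes from the release notes; the function
// name, the `request.params` shape, and the 400 rule are assumptions.
function rejectEmptyQuery(request, response) {
  var q = request.params.q;
  if (q === undefined || String(q).trim() === "") {
    // Signal a malformed request to the client.
    response.statusCode = 400;
  }
  return response;
}

// Plain objects standing in for the pipeline's request and response.
var okResponse = rejectEmptyQuery({ params: { q: "*:*" } }, { statusCode: 200 });
var badResponse = rejectEmptyQuery({ params: { q: "  " } }, { statusCode: 200 });
console.log(okResponse.statusCode, badResponse.statusCode);
```

Leaving statusCode untouched on the success path keeps the default of 200, so the stage only has to handle the error case.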

Other changes

  • The Heartbeat Data service is removed.

  • Admin webapps are removed.

Bug Fixes

  • Query stage rule conditions of field value type now work regardless of format, instead of requiring a perfect match.

  • The JSON parser now emits all documents when the id is substituted with another field.

  • The SharePoint Online connector properly deletes all documents from deleted sites when the data is recrawled.

  • Fixed an issue that caused SQL Service queries with 0 results to produce an error.

  • CrawlDB now stores data on-disk by default, as storing data in-memory was prone to crashes.

  • Fixed an issue that caused an error to occur when executing a spell correction job when signals were missing or incompletely digested.

  • The collectMetricsIntervalSecs parameter found in /conf/fusion.properties now works as expected.

  • Predicate expressions on multi-valued fields in the SQL Service now work as expected.

  • Fixed an issue with the SQL Service that produced an error when a SELECT statement did not specify a limit for results. For example: select id from myapp.

  • Changed the way the SDK plugin installation is handled in multi-node environments to reduce the chance of errors.

  • Changed how plugin installation and updates are handled in multi-node environments to reduce the chance of errors.

  • Fixed an issue that allowed users without the necessary access rights to upload blobs using importPolicy=merge and importPolicy=abort.

  • Fixed an issue that caused the HTML parser to fail to parse an HTML document unless its contentType value was strictly text/html. For example, contentType = "text/html; charset=utf-8" would fail to parse.

  • Fixed an incorrectly listed metadata field in the SharePoint connector, Description_s, so that it now uses the correct field, Description.

  • Fixed an issue in Windows 10 that caused the logs filter in the DevOps Center to be missing some fields and options.

  • Fixed an issue that resulted in the SDK connectors service crawling all documents in a collection despite the maximum output count being set to 5.

  • Fixed an issue that caused the connectors-rpc service to fail if a plugin was deleted when the service was stopped.

  • Fixed the Job History tab so all previous jobs are displayed.

  • Fixed an issue with Google Drive connector that prevented authorized users from overriding security trimming settings as desired.

  • Fixed an issue that caused an error to occur when a password-secured datasource was saved.

  • Fixed an issue that occasionally caused documents to be skipped with an immense term error when the signature_s field held a lengthy value while indexing a SharePoint load test environment.

  • Fixed an issue that caused a NullPointerException when performing SharePoint security trimming for a user with no assigned groups.

  • Fixed an issue that caused connectors to treat file extensions as case-sensitive.

  • Improved scaling for Phrase Detection jobs.

  • Fixed an issue that caused a NullPointerException when the /api/connectors/security_filter endpoint was used without specifying a username.

  • Fixed a compatibility issue between the Confluence connector and certain versions of Confluence.

  • Fixed an issue with the SharePoint connector that sometimes caused URLs to become invalid when data was recrawled.

  • Fixed an issue with the Confluence connector which resulted in anonymous access permissions not being updated as expected.

  • Fixed an issue with the SharePoint connector that resulted in documents remaining in Fusion despite being deleted from SharePoint.

  • Fixed an issue with the SharePoint connector that caused crawls to fail with a NullPointerException error if the crawl was attempted with no parser set.

  • Fixed a bug that prevented regex searches from functioning unless root was specified as a value.

  • Fixed a bug with the Apply Rules stage which forced Lucene to serve as the parser if boost, bury, block, or filter rules were used.

  • Fixed a bug that prevented minimum match list rules from working as expected.
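Two of the fixes above, the HTML parser's contentType handling and the case sensitivity of file extensions in connectors, come down to normalizing a value before comparing it. The sketch below illustrates that general idea only; it is not Fusion's implementation, and the function names mediaType, isHtml, and hasExtension are hypothetical.

```javascript
// Illustration only; not Fusion's code. Function names are hypothetical.

// Return the media type of a Content-Type value without its parameters,
// for example "text/html; charset=utf-8" becomes "text/html".
function mediaType(contentType) {
  return contentType.split(";")[0].trim().toLowerCase();
}

// True when the value denotes HTML, with or without parameters.
function isHtml(contentType) {
  return mediaType(contentType) === "text/html";
}

// Compare a file name's extension with a list of extensions, ignoring case.
function hasExtension(fileName, extensions) {
  var dot = fileName.lastIndexOf(".");
  if (dot < 0) return false; // no extension at all
  var ext = fileName.slice(dot + 1).toLowerCase();
  return extensions.some(function (e) {
    return e.toLowerCase() === ext;
  });
}

console.log(isHtml("text/html; charset=utf-8"));
console.log(hasExtension("Report.PDF", ["pdf"]));
```

A strict string comparison against "text/html" would reject the charset variant, which is the failure mode the contentType fix describes.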

Known issues

  • The time-decay function may not work as expected, depending on rollup settings.

  • Fusion may not isolate collections between applications correctly within the UI.

  • The WebFetcher service might fail to emit multiple documents from the parser.

  • The Windows Share SMB 2/3 Connector’s Access Control Lists may not be indexed due to a size limit exceeded error.