3.1.1 Release Notes

Release date: 17 August 2017

Component versions:

  • Solr 6.5.1

  • ZooKeeper 3.4.6

  • Spark 2.1.1

New features

No new features are introduced in this maintenance release.

Improvements

  • Security

    All Fusion services can now be SSL-enabled. In this release, the UI service has a new key called ui.ssl in the fusion.properties file. Set this key to "true" to enable SSL for the UI service.
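As an illustration of the setting described above, the following is a minimal sketch that sets ui.ssl to "true" in the text of a properties file. The helper name set_property and the sample content are hypothetical, and the sketch assumes simple "key = value" lines; it does not handle every separator or escape that the Java properties format permits.

```python
def set_property(text: str, key: str, value: str) -> str:
    """Return properties text with `key` set to `value`, appending it if absent."""
    out, found = [], False
    for line in text.splitlines():
        stripped = line.strip()
        # Comments and blank lines are passed through unchanged.
        if stripped and not stripped.startswith(("#", "!")):
            name = stripped.split("=", 1)[0].strip()
            if name == key:
                line, found = f"{key} = {value}", True
        out.append(line)
    if not found:
        out.append(f"{key} = {value}")
    return "\n".join(out) + "\n"


# Hypothetical excerpt; "other.key" is a placeholder, not a real Fusion key.
sample = "# hypothetical excerpt\nother.key = value\nui.ssl = false\n"
updated = set_property(sample, "ui.ssl", "true")
print(updated)
```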

  • Query pipeline stages

    • Default boost scale ranges were changed for these query pipeline stages, which apply when recommendations are enabled for a collection:

      Pipeline                                    Stage                      Default Boost Scaling
      collection-items_for_item_recommendations   Recommend Items for Item   Minimum = 5, Maximum = 15
      collection-items_for_item_recommendations   Boost with Signals         Minimum = 1.1, Maximum = 5
      collection-items_for_user_recommendations   Recommend Items for User   Minimum = 5, Maximum = 15
      collection-items_for_user_recommendations   Boost with Signals         Minimum = 1.1, Maximum = 5

      These scale ranges give better recommendation behavior out of the box.

    • Solr query parameters for the Boost with Signals query pipeline stage are now configurable in the Fusion UI on a per-pipeline basis.

    • Log collection for a Boost with Signals query-pipeline stage no longer grows with an unbounded number of fields.

  • Google Drive connector

    • The Google Drive connector now exports spreadsheets as Excel files (.xlsx format, MIME type application/vnd.openxmlformats-officedocument.spreadsheetml.sheet). Previously, spreadsheets were exported as PDF files.

    • The Google Drive connector now has configuration parameters that let you use the complete set of search clauses for the Google Drive API files.list method. The fields, operators, and value types are listed in the Google Drive API documentation.

    • If Fusion crashes while a datasource job is crawling a Google Drive datasource, resuming the job now resumes the original crawl, whether that was a crawl of all items or an incremental crawl. Previously, resuming a crawl after a crash always started an incremental crawl.

    • An incremental crawl now continues when the user account for a document owner no longer exists. Fusion logs a warning to the connector log file.

    • A full crawl now continues while creating user signatures; it is no longer stopped by 401 Unauthorized errors.

  • Connectors and upgrading

    The Connectors API automatically maps old types to new types for URL parameters. This prevents URL references from breaking because of connector plug-in name changes. For example, the type /connectors/plugins/types/lucid.anda/web is automatically mapped to /connectors/plugins/types/lucid.web/web.
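The mapping described above can be pictured with a minimal sketch. The lookup table and the function name map_legacy_type are hypothetical; only the lucid.anda to lucid.web pair comes from these notes, and this is not the Connectors API's actual implementation.

```python
# Hypothetical lookup table; only this one pair appears in the release notes.
LEGACY_TYPES = {"lucid.anda": "lucid.web"}
PREFIX = "/connectors/plugins/types/"


def map_legacy_type(path: str) -> str:
    """Rewrite an old plug-in type in a Connectors API path to its new name."""
    if not path.startswith(PREFIX):
        return path
    # The first segment after the prefix is the plug-in type.
    head, sep, tail = path[len(PREFIX):].partition("/")
    return PREFIX + LEGACY_TYPES.get(head, head) + sep + tail


print(map_legacy_type("/connectors/plugins/types/lucid.anda/web"))
```

Paths that already use a current type, or that fall outside the prefix, pass through unchanged.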

  • Apache Tika parser

    The configuration parameters maxDepth (Maximum nesting depth) and maxPackageEntryDepth (Maximum package entry depth) were added to the Apache Tika parser. Default values in the Fusion UI are 200 and 20 respectively.

  • Automatic pushdown of time-range data to Solr

    Timestamp ranges in queries to Fusion SQL are now automatically sent to Solr via built-in predicate pushdown.

  • Spark job subtypes

    Additional Spark job subtypes were added: Co-occurrence Similarity, Item Similarity Recommender, Levenshtein, Collection Analysis, Statistically Interesting Phrases (SIP), Doc Clustering, Outlier Detection, and Cluster Labeling.

Other changes

  • Fusion Agent

    • A heap dump on out-of-memory (HeapdumponOOM) now succeeds when supervisor = enabled.

    • The Fusion Agent now logs INFO-level messages to stdout and WARN-level messages to stderr. Additionally, when the Fusion Agent discovers that a monitored process has exited, it logs an ERROR-level message; for example:

      2017-07-17T18:31:55,347 - ERROR [api service-monitor:LocalRunningService@481] - api process with PID Optional[34552] failed

  • Security Trimming query pipeline stage

    Creation of the security filter has been parallelized across connectors.

  • Google Drive connector

    The Google Drive connector now fetches new users when performing an incremental crawl.

  • JavaScript connector

    The JavaScript connector now uses cookies for authentication.

  • Jira connector

    At indexing time, the Jira connector now uses the issue reporter of the parent issue when creating the ACL field for attachments, comments, and worklogs.

  • Web connector

    Including and excluding files by extension (includeExtensions and excludeExtensions) now work.

  • Jive connector

    • The Jive connector now crawls people and profile pages.

    • The connector now checks for the next batch of items even when the previous batch contained fewer than the configured number of batch items. This works around a Jive issue which sometimes returns fewer items than requested, even when there are more in the total data set.

    • The Jive connector now retries requests for which the response contained a 5xx HTTP status code.
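The batching and retry behavior described in the last two items can be sketched as follows. This is an illustration under stated assumptions, not the connector's code: the fetch callable, its (status, items) return shape, the retry count, and advancing the offset by the number of items received are all hypothetical.

```python
import time
from typing import Callable, List, Tuple

# Hypothetical fetcher: takes (start_index, batch_size), returns (http_status, items).
Fetch = Callable[[int, int], Tuple[int, List[str]]]


def fetch_batch(fetch: Fetch, start: int, size: int,
                retries: int = 3, delay: float = 0.0) -> List[str]:
    """Request one batch, retrying when the server answers with a 5xx status."""
    for attempt in range(retries + 1):
        status, items = fetch(start, size)
        if not 500 <= status <= 599:
            return items
        if attempt < retries:
            time.sleep(delay)
    raise RuntimeError(f"giving up after {retries} retries (last status {status})")


def crawl_all(fetch: Fetch, size: int) -> List[str]:
    """Keep paging until a batch is empty; a short batch alone does not end the crawl."""
    collected: List[str] = []
    start = 0
    while True:
        items = fetch_batch(fetch, start, size)
        if not items:
            return collected
        collected.extend(items)
        start += len(items)
```

Stopping only on an empty batch costs one extra request at the end of the crawl, but it avoids missing items when the server returns fewer than requested.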

  • MongoDB connector

    The MongoDB connector no longer stops pipeline processing prematurely when the op_log property is used; it keeps listening for MongoDB changes.

  • Windows Share connector

    The Windows Share connector now allows guest users; leave the password field empty for a guest user.

  • Analytics catalog

    Support has been added for using field aliases with the Catalog API. For example:

    {
      "name": "shub_signals",
      "assetType": "table",
      "projectId": "fusion",
      "description": "SearchHub signals",
      "tags": ["shub"],
      "format": "solr",
      "cacheOnLoad": false,
      "options": ["collection -> shub_signals", "solr.params -> sort=id asc", "fields -> timestamp:timestamp_tdt,signal_type:type_s,ua_family:params.useragent_family_s,ua_os:params.useragent_os_family_s,tz:params.tz_s,num_found:params.totalResults_s,lang:params.lang_s,ua_type:params.useragent_type_name_s,query_terms:params.terms_s,query_id:params.query_unique_id,ua_vers:params.useragent_v,doc0:params.doc_0,doc1:params.doc_1,doc2:params.doc_2,pubdate_range:params.facet_ranges_publishedOnDate_before_d,user_id:params.uid_s,referrer:params.refr_s,ua_category:params.useragent_category_s,session_id:params.sid_s,ip:ip_sha_s,num_visits:params.vid_s,page_name:params.page_s,fingerprint:params.fp_s"]
    }
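To make the structure of the "fields" option easier to read, here is a minimal sketch that splits the option strings and the alias list from the example above. It assumes each option has the form "key -> value" and each field entry has the form alias:field, as the example suggests; the function names are hypothetical, and the "fields" value is shortened for readability.

```python
from typing import Dict, List


def parse_options(options: List[str]) -> Dict[str, str]:
    """Split each "key -> value" option string into a dictionary entry."""
    parsed = {}
    for option in options:
        key, _, value = option.partition(" -> ")
        parsed[key.strip()] = value.strip()
    return parsed


def field_aliases(fields: str) -> Dict[str, str]:
    """Map each alias to its Solr field, given "alias:field,alias:field"."""
    aliases = {}
    for pair in fields.split(","):
        alias, _, field = pair.partition(":")
        aliases[alias.strip()] = field.strip()
    return aliases


# Shortened copy of the options from the example above.
options = [
    "collection -> shub_signals",
    "solr.params -> sort=id asc",
    "fields -> timestamp:timestamp_tdt,signal_type:type_s,ua_family:params.useragent_family_s",
]
aliases = field_aliases(parse_options(options)["fields"])
for alias, field in aliases.items():
    print(f"{alias} -> {field}")
```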

  • Bug fixes

    • You can now disable security trimming after it was previously enabled.

    • Form validation is used for the Spark jobs schema.

    • The collection picker works for Spark jobs.

Known Issues

  • In this release, passwords are visible as plaintext while the user is entering them in object configuration form fields in the Fusion UI. Once they are saved, passwords are not exposed. This issue is fixed in Fusion 3.1.2.

  • Time-based partitioning does not work with the searchlogs collection.