3.0.1 Release Notes

Release date: 4 April 2017

Component versions:

  • Solr 6.4.2

  • ZooKeeper 3.4.6

  • Spark 1.6.3

Important
Fusion 3.0.1 ships with Solr 6.4.2, which fixes a ReplicationHandler path traversal vulnerability. Unless the server is protected by firewall rules or authentication, this vulnerability can expose any file that is readable by the Solr server process. If you are using an earlier version of Solr, upgrade to 6.4.2. If you must stay on an earlier version, restrict access with firewall rules, or disable ReplicationHandler if it is not in use.

New features

  • Support for SharePoint 2016

    Fusion 3.0.1 includes many improvements to the SharePoint connector, including support for SharePoint 2016 and a new anda.forceRefresh.clearSignatures system property that controls whether to clear each item’s signature when forceRefresh is "true" (the default). To use this property, add it to CONNECTORS_JAVA_OPTIONS in config.sh like this:

    CONNECTORS_JAVA_OPTIONS=(-Xmx2g -XX:MaxPermSize=256m -Dapple.awt.UIElement=true -Danda.forceRefresh.clearSignatures)

  • New Box.com connector features

    • Support for proxying.

    • Support for security trimming.

    • Box.com access control lists are now indexed in the acl_users_ss and acl_groups_ss fields.

    Some new configuration keys are introduced in this release:

    • parserId

    • enable_security_trimming

    • f.isSecurityGroupTrimming

    • f.fs.proxyHost

    • f.fs.proxyPort

    • f.fs.proxyType

    • f.excludedMimeTypes

    • f.includedMimeTypes

    • f.fs.max_request_attempts

    • f.fs.user_filter_term

    Additionally, the connector now generates a clickable Box.com URL for all files.

  • Google Drive connector: support for security trimming, enterprise crawl, and Google native formats

    Security trimming for this connector is now fully functional. Additionally, Google’s native file formats are now indexed correctly.

    These new configuration keys are introduced in this release:

    • enable_security_trimming

    • f.fs.applyGroupSecurityFiltering

    • f.fs.cache_expiration_time

    • f.fs.cache_max_size

    • f.fs.defaultDomain

    • f.fs.security_filter_cache

    • f.fs.userExcludeList

    • f.fs.userSearchQuery

    • f.excludedMimeTypes

    • f.includedMimeTypes

    • f.fs.mime_type_excludes

    • f.fs.mime_type_includes

    • f.fs.serviceAccountEmail

    • f.fs.serviceAccountId

    • f.fs.serviceAccountPrivateKeyFile

    • f.fs.serviceAccountPrivateKeyFilePassword

  • Solr 6.4.2

    Fusion 3.0.1 ships with Solr 6.4.2, which fixes the potential vulnerability described above. See the Solr 6.4.2 release notes for more details.

Improvements

  • MongoDB connector

    • To support incremental crawls, the perform_initial_sync configuration property has been removed and the connector no longer stores timestamps in a file on disk.

    • Failover and recovery are improved in this release.

  • Advanced Boosting query stage

    This stage now correctly recognizes scaleMin and scaleMax values, and allows removal of those values once they are added.

  • A simplified upgrade procedure lets you upgrade easily from Fusion 3.0.0.

  • The Spark-Solr integration tool has been upgraded to version 2.3.4.

  • Aggregations run only if new signals exist.

    The default aggregations schedule runs aggregation jobs every two minutes. Fusion 3.0.1 first checks whether any new signals have arrived since the previous job; if none have, it skips the job. To disable this check, set fusion.spark.skipJobIfSignalsEmpty to "false" using the Configurations API:

    curl -u admin:password123 -X POST -H "Content-type:application/json" -d "false" http://localhost:8764/api/apollo/configurations/fusion.spark.skipJobIfSignalsEmpty

  • SAML security realm configuration now includes two new fields:

    • App Issuer customizes the URI of the SAML app SP issuer.

    • Post Login Redirect URL customizes the redirect URL, for example if Fusion is behind a firewall.

  • Jive connector improvements

    The Jive connector now correctly indexes access control lists and archive files. Additionally, two new configuration keys are introduced:

    • max_retries: The maximum number of retries for a failed request.

    • retry_delay: The number of milliseconds to wait before retrying a failed request.

  • The Query Workbench now correctly displays highlighting and JSON-formatted previews.

  • A new start wait timeout configuration key, api.startSecs, has been added to accommodate environments such as EC2, where start times may be longer than the default of 3 minutes.

  • The Drupal connector now supports a parserId configuration key.
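
The Configurations API call shown under "Aggregations run only if new signals exist" can be wrapped in a small helper so that the check is easy to turn off and back on. This is only a sketch: set_skip_if_signals_empty, FUSION_URL, FUSION_AUTH, and DRY_RUN are hypothetical names introduced here, the host and credentials are the sample values from the example above, and posting "true" to restore the check is an assumption based on the behavior described in that item. DRY_RUN defaults to 1 so the script only prints the request; set DRY_RUN=0 to send it.

```shell
#!/bin/sh
# Hypothetical wrapper around the Configurations API call shown in these notes.
# FUSION_URL and FUSION_AUTH default to the sample values used above.
FUSION_URL=${FUSION_URL:-http://localhost:8764}
FUSION_AUTH=${FUSION_AUTH:-admin:password123}
DRY_RUN=${DRY_RUN:-1}   # 1 = print the request only, 0 = send it with curl

set_skip_if_signals_empty() {
  # Accept only the two literal values used by the setting.
  case $1 in
    true|false) ;;
    *) echo "usage: set_skip_if_signals_empty true|false" >&2; return 2 ;;
  esac
  url="$FUSION_URL/api/apollo/configurations/fusion.spark.skipJobIfSignalsEmpty"
  if [ "$DRY_RUN" = 1 ]; then
    echo "POST $url body=$1"
  else
    # Same request as the example in the notes, with the value parameterized.
    curl -u "$FUSION_AUTH" -X POST -H "Content-type:application/json" -d "$1" "$url"
  fi
}

# Assumed to re-enable the new-signals check after it was set to "false".
set_skip_if_signals_empty true
```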

Other changes

  • The User Recommendation Boosting query pipeline stage has been removed.

  • Fusion now correctly reads the solr.port key in fusion.properties.

  • Fusion now captures heap dumps on out-of-memory (OOM) events by default, and stores them at var/log/<component>. Each heap dump file is overwritten on the next OOM occurrence.

  • In the event of any OOM exception, Fusion now runs its oom.sh script to terminate the affected process. This behavior can be disabled by adding default.killOnOOM = false to fusion.properties.

  • Fusion’s Spark driver now recovers gracefully after errors during a script job submitted via the Spark Jobs API.

  • The Fusion agent now manages memory correctly when running repeated status operations.

  • The trusted-http security realm now correctly handles subsequent requests without the need to remove the session cookie.

  • The work/ directories for Spark are now cleaned up on JVM shutdown.
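
For the default.killOnOOM change described above, the property can be added by hand in any editor. If you script the change, one possible approach is an idempotent helper that replaces an existing assignment rather than appending a duplicate. This is a minimal sketch: set_property is a hypothetical helper and not a Fusion tool, and the demonstration runs against a temporary file with made-up contents; substitute the path to your own fusion.properties.

```shell
#!/bin/sh
# Hypothetical helper: set key = value in a properties file, replacing any
# existing assignment for the same key so repeated runs do not add duplicates.
set_property() {
  file=$1
  key=$2
  value=$3
  tmp=$(mktemp) || return 1
  # Escape dots so the key is matched literally in the regular expression.
  pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
  # Copy every line except an existing assignment for this key.
  grep -v -E "^[[:space:]]*${pattern}[[:space:]]*=" "$file" > "$tmp"
  printf '%s = %s\n' "$key" "$value" >> "$tmp"
  # Write back through the original file so its permissions are kept.
  cat "$tmp" > "$file" && rm -f "$tmp"
}

# Demonstration on a temporary file with sample contents (not a real config).
props=$(mktemp)
printf 'solr.port = 8983\ndefault.killOnOOM = true\n' > "$props"
set_property "$props" default.killOnOOM false
set_property "$props" default.killOnOOM false   # second run changes nothing
cat "$props"
```

A component most likely needs a restart before it reads the changed file; that is an assumption, since these notes do not say when fusion.properties is reloaded.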