Upgrade Fusion

Once a Fusion-based search application is running, you might eventually need to upgrade to a later version of Fusion. The goal is to transfer all of your data, together with the configurations and customizations that your applications depend on.

In this topic, we:

Tip
See the release history to find out what’s new, including which versions of Solr, Spark, and ZooKeeper are bundled with each Fusion release.

Per-version instruction sets

Upgrading an existing installation to a later version of Fusion requires transferring all configurations and data from the existing installation to the new version.

Perform the steps in the appropriate section:

Upgrade overview: migrating data, configurations, and customizations

The Upgrade Process

The upgrade process leaves the current Fusion deployment in place while a new Fusion deployment is installed and configured. All of the upgrade operations copy information from the current Fusion over to the new Fusion. This provides a rollback option should the upgrade procedure encounter problems.

The current Fusion configurations must remain as-is during the upgrade process. So that indexing job history is captured completely, make sure that no indexing jobs are running. If the new Fusion installation is on the same server as the current one, you must either run only one version at a time or change the Fusion component server ports so that every component of both versions uses a unique port.
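When both versions share a server, it can help to compare the port assignments of the two deployments before starting them together. The following is a minimal sketch, not part of Fusion: the function name, the component names, and the port numbers are all hypothetical placeholders, and the real values must be read from each deployment's own configuration.

```python
def port_collisions(current_ports, new_ports):
    """Return {port: (current_component, new_component)} for ports claimed by both deployments."""
    by_port = {port: name for name, port in current_ports.items()}
    return {
        port: (by_port[port], name)
        for name, port in new_ports.items()
        if port in by_port
    }


# Hypothetical assignments for illustration only.
current = {"ui": 8764, "api": 8765, "solr": 8983, "zookeeper": 9983}
new = {"ui": 8864, "api": 8865, "solr": 8983, "zookeeper": 9984}

for port, (old_name, new_name) in port_collisions(current, new).items():
    print(f"port {port}: current '{old_name}' conflicts with new '{new_name}'")
```

An empty result means that, for the ports listed, the two deployments could run side by side.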

ZooKeeper

Migration consists of the following steps:

  • Copy the ZooKeeper data nodes that contain Fusion configuration information from the FUSION-CURRENT ZooKeeper instance to the FUSION-NEW ZooKeeper instance.

  • Rewrite the Fusion datasource and pipeline configurations, working against the FUSION-NEW ZooKeeper instance.

Important
Because some Fusion configurations have changed, ZooKeeper data must be rewritten accordingly using scripts available in the public GitHub repository: https://github.com/LucidWorks/fusion-upgrade-scripts.
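To illustrate what the copy step involves, here is a rough sketch of a recursive znode copy. It is not the upgrade script from the repository above, and it does not perform the rewrite step. It assumes two already-connected client objects that provide get, exists, set, create, and get_children methods; the kazoo library's KazooClient exposes methods with these names, but any object with the same methods would work. The function name copy_znodes is hypothetical.

```python
def copy_znodes(src, dst, path):
    """Recursively copy the znode at `path`, and everything below it, from src to dst.

    Returns the number of znodes written. Every node is written as a plain
    persistent node; ACLs and ephemeral flags are not carried over.
    """
    data, _stat = src.get(path)  # clients return (data, stat)
    if dst.exists(path):
        dst.set(path, data)
    else:
        dst.create(path, data, makepath=True)

    count = 1
    for child in src.get_children(path):
        child_path = path.rstrip("/") + "/" + child
        count += copy_znodes(src, dst, child_path)
    return count
```

Because the function only reads from the source client, the FUSION-CURRENT ZooKeeper instance is left untouched, which preserves the rollback option described earlier.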

Solr

Fusion-based search applications store your data in Solr. If your data is stored in an external Solr cluster, and if you aren’t upgrading your Solr cluster, then you don’t need to migrate your Solr data at all; you just need to configure the new Fusion deployment to use your external Solr cluster.

Fusion uses Solr as a data store for server logs, search logs, and binary components such as jar files and compiled models used by Fusion pipelines. The Fusion distribution includes a complete Solr server and is configured to use this embedded Solr instance by default. If the current Fusion deployment uses the embedded Solr for its system collections, then you must copy all of these collections over to the embedded Solr instance included with the new Fusion distribution.
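Before copying collections, it can be useful to take an inventory of what exists in each Solr instance. The sketch below uses the LIST action of Solr's Collections API to do that. The helper names and the base URLs in the usage note are assumptions for illustration, and the sketch only reports which collections are missing from the new instance; it does not copy any data.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def list_url(solr_base):
    """Build the Collections API LIST URL for a Solr base URL such as http://host:8983/solr."""
    query = urlencode({"action": "LIST", "wt": "json"})
    return solr_base.rstrip("/") + "/admin/collections?" + query


def parse_collections(body):
    """Extract the sorted collection names from a LIST response body (JSON text)."""
    return sorted(json.loads(body).get("collections", []))


def fetch_collections(solr_base, timeout=10):
    """Query a running Solr instance and return its collection names."""
    with urlopen(list_url(solr_base), timeout=timeout) as resp:
        return parse_collections(resp.read().decode("utf-8"))


def missing_collections(current, new):
    """Collections present in the current Solr but absent from the new one."""
    return sorted(set(current) - set(new))
```

For example, missing_collections(fetch_collections(current_solr_url), fetch_collections(new_solr_url)) would list the collections that still need to be copied, where both URLs are the base URLs of your own Solr instances.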

Connector services data: crawldb, database drivers

The directory fusion/3.0.x/data/connectors/lucid.jdbc contains third party JDBC driver files that have been registered with Fusion in order to run the JDBC connector.
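As an illustration of carrying the registered driver files over, the following sketch copies the lucid.jdbc directory from one Fusion home directory to another. It assumes that a plain file copy is sufficient for your versions and that both deployments use the same relative path, neither of which is confirmed here; the function name and the home-directory arguments are hypothetical. It requires Python 3.8 or later.

```python
import shutil
from pathlib import Path


def copy_jdbc_drivers(current_home, new_home, rel="data/connectors/lucid.jdbc"):
    """Copy JDBC driver files from the current deployment into the new one.

    Returns the relative paths of the files found in the source directory,
    or an empty list if the source directory does not exist.
    """
    src = Path(current_home) / rel
    dst = Path(new_home) / rel
    if not src.is_dir():
        return []
    # dirs_exist_ok lets the copy merge into a directory the new install already created.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return sorted(str(p.relative_to(src)) for p in src.rglob("*") if p.is_file())
```

The source directory is only read, so the current deployment remains intact if the upgrade has to be rolled back.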

The directory fusion/3.0.x/data/connectors/crawldb, which is the default location in current versions of Fusion, is managed by Fusion’s connector service. Fusion datasources that walk websites, filesystems, or similar repositories use the crawldb to store information about the files visited during a crawl; this enables incremental updates and avoids re-indexing data.

Important
The crawldb data format has changed. For upgrades from Fusion 1.2.x to the latest Fusion 2.1, the crawldb data must therefore be processed with the reformatting program that is available in the public upgrade scripts repository: https://github.com/LucidWorks/fusion-upgrade-scripts.

Pipeline services data: models used by pipeline stages


Customized settings for Fusion run commands and configuration scripts

The scripts used to start, stop, and restart Fusion and its components are found in the top-level "bin" directory of the Fusion distribution. As of Fusion 2.0, the configuration scripts previously found in the "bin" directory were put in their own top-level directory called "conf".

Important
If you have customized the settings in these files and want to carry them over to the new Fusion deployment, you must edit the command and configuration scripts in the new deployment by hand. Do not copy over the old configuration files, because they may contain commands or settings that are no longer valid.
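To find out which settings were customized, one approach is to diff each customized file against a pristine copy of the same file from the same Fusion version, then re-apply the differences by hand. The sketch below does this with Python's standard difflib module. It assumes that you still have an unmodified copy of the current distribution to compare against; the function name and the sample setting names are hypothetical.

```python
import difflib


def customized_lines(stock_text, customized_text, name="config"):
    """Return a unified diff (as a list of lines) from the stock file to the customized file."""
    return list(
        difflib.unified_diff(
            stock_text.splitlines(),
            customized_text.splitlines(),
            fromfile=f"stock/{name}",
            tofile=f"customized/{name}",
            lineterm="",
        )
    )


# Hypothetical file contents for illustration only.
stock = "EXAMPLE_PORT=1000\nEXAMPLE_HEAP=1g"
custom = "EXAMPLE_PORT=1000\nEXAMPLE_HEAP=4g"

for line in customized_lines(stock, custom):
    print(line)
```

Lines that start with "+" in the diff are the customizations to review against the new deployment's scripts; an empty diff means the file was never changed.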

Custom Fusion connector plugins and pipeline stages

Custom Java components written for one version of Fusion must be recompiled and installed anew for the latest version of Fusion, because Fusion’s Java API may have changed.

Version Incompatibilities

There are no version incompatibilities between Fusion 2.4 and later versions.