Deployment Types - Lucidworks documentation

The Fusion platform is designed to support enterprise search applications at any scale. You can deploy Fusion across multiple nodes to store large amounts of data, to achieve high processing throughput, or both. Fusion consists of a number of Java processes that run in JVMs, including the api, connectors-classic, connectors-rpc, and admin-ui processes, and possibly others such as spark-master and spark-worker. When you start Fusion, the processes that start are listed. You might also see zookeeper and solr processes, depending on the cluster arrangement. For more information about Fusion components, see Fusion Components. A complete list of Fusion services:

Start or Stop Fusion

This topic explains how to start and stop Fusion Server and its services using the scripts in the bin directory below the Fusion home directory:

/opt/fusion/latest.x/bin (Unix)
C:\lucidworks\fusion\latest.x\bin (Windows)

Command summary

You can control all Fusion services at once under the management of the Fusion agent, or you can control services individually.

How to control all services using the Fusion agent:

Unix. /opt/fusion/latest.x/bin/fusion <command>
Windows. C:\lucidworks\fusion\latest.x\bin\fusion.cmd <command>

How to control individual services:

Unix. /opt/fusion/latest.x/bin/<servicename> <command> For example: /opt/fusion/latest.x/bin/proxy restart
Windows. C:\lucidworks\fusion\latest.x\bin\<servicename>.cmd <command> For example: C:\lucidworks\fusion\latest.x\bin\proxy.cmd restart

When starting services individually, start ZooKeeper first.

Issue the commands below to the fusion (Unix) or fusion.cmd (Windows) script to apply them to all services in the correct sequence, or issue them to the script for an individual service.


`start`	Start one or all Fusion services.
`status`	Display the status of one or all Fusion services.
`restart`	Restart one or all Fusion services.
`stop`	Stop one or all Fusion services.
`run`	Start one or all Fusion services in the foreground.
`run-in-shell` (Unix only) (Fusion 4.2+ only.)	Start an individual service using Bash’s `exec` function, which allows the service to assume the shell process’s PID. See Run Fusion in shell mode below.
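As a minimal sketch of how the script paths and commands above fit together, the following Python helper builds the argument list for either the Fusion agent script or an individual service script. The function name fusion_command and the default home directories are illustrative assumptions based on the example paths in this topic; they are not part of Fusion.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Commands listed in the command summary above.
COMMANDS = {"start", "status", "restart", "stop", "run", "run-in-shell"}


def fusion_command(command, service=None, windows=False, home=None):
    """Return the argument list for a Fusion control command.

    service=None targets the Fusion agent script (all services).
    The default home directories are the example paths from this topic.
    """
    if command not in COMMANDS:
        raise ValueError(f"unknown command: {command}")
    if command == "run-in-shell" and windows:
        raise ValueError("run-in-shell is Unix only")
    script = service or "fusion"
    if windows:
        bin_dir = PureWindowsPath(home or r"C:\lucidworks\fusion\latest.x") / "bin"
        return [str(bin_dir / f"{script}.cmd"), command]
    bin_dir = PurePosixPath(home or "/opt/fusion/latest.x") / "bin"
    return [str(bin_dir / script), command]


print(fusion_command("restart", service="proxy"))
print(fusion_command("restart", service="proxy", windows=True))
```

The returned list can be passed to a process launcher such as subprocess.run once the paths match your installation.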

Define groups of services

The fusion.cors (fusion.properties in Fusion 4.x) file includes the property definition group.default. This property defines the Fusion services to start and stop by default (if no group property is named in the start or stop command).

The default list of services out of the box is also the minimum set of services, with the exception of the log-shipper service, which you can remove if you do not use it.

Here is the group.default definition in fusion.cors (fusion.properties in Fusion 4.x):

In Fusion 4.1+.

group.default = zookeeper, solr, api, connectors-classic, connectors-rpc, proxy, webapps, admin-ui, log-shipper

With the exception of the log-shipper service, these are all required services. Even if you use only RPC connectors, the connectors-classic service is required. The log-shipper service is required to use the Log Shipper.

In Fusion 4.0.x.

group.default = zookeeper, solr, api, connectors-rpc, connectors-classic, admin-ui, proxy, webapps

Even if only using RPC connectors, the connectors-classic service is required.

How to modify the default list of services

In Fusion 4.1+. Edit the group.default property, for example, to include Spark and SQL related services:

group.default = zookeeper, solr, api, connectors-classic, connectors-rpc, proxy, webapps, admin-ui, log-shipper, spark-master, spark-worker, sql

In Fusion 4.0.x. Edit the group.default property, for example, to include Spark related services:

group.default = zookeeper, solr, api, connectors-classic, connectors-rpc, proxy, webapps, admin-ui, spark-master, spark-worker

How to define other lists of services (Unix)

You can define other lists of services by defining other group properties.

In Fusion 4.1+. For example, define this group property to start and stop services for Spark and SQL together:

group.spark-only = spark-master, spark-worker, sql

In Fusion 4.0.x. For example, define this group property to start and stop services for Spark together:

group.spark-only = spark-master, spark-worker

Define this group property to start and stop services for classic and RPC connectors together:

group.connectors = connectors-classic, connectors-rpc
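To make the group mechanism concrete, here is a small sketch that reads group definitions like the ones above and resolves a group name to its list of services. It assumes simple "key = value" lines with comma-separated values; the real configuration file may contain other syntax that this sketch ignores, and the function names are hypothetical.

```python
def parse_groups(properties_text):
    """Collect 'group.<name> = a, b, c' definitions into a dict of lists."""
    groups = {}
    for line in properties_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        key = key.strip()
        if key.startswith("group."):
            name = key[len("group."):]
            groups[name] = [s.strip() for s in value.split(",") if s.strip()]
    return groups


def services_for(groups, group_name="default"):
    """Return the services in a group; 'default' is used when none is named."""
    return groups[group_name]


sample = """
group.default = zookeeper, solr, api, connectors-classic, connectors-rpc, proxy, webapps, admin-ui, log-shipper
group.spark-only = spark-master, spark-worker, sql
group.connectors = connectors-classic, connectors-rpc
"""
groups = parse_groups(sample)
print(services_for(groups))
print(services_for(groups, "connectors"))
```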

Unix

Start and stop Fusion on Unix.

Start Fusion

All Fusion start scripts must be executed by a user who has permissions to read and write to the directories where Fusion is installed. These scripts do not need to be run as root (or with sudo), nor should they be. Use a suitable user, or create a new one, and then ensure that it owns the directory where Fusion resides (for example, /opt/lucidworks).

Give the commands that follow from the directory fusion/latest.x/bin.

Start required services

Start the required services that are defined in the group.default property.

How to start all required services

./fusion start

This is equivalent to ./fusion start default. You can omit the group name default.

Start a group of services

You can start a group of services together. Reference the property in fusion.cors (fusion.properties in Fusion 4.x) that defines the group.

Examples of when this is useful are:

Spark and SQL. The spark-master, spark-worker, and sql services are interdependent and should be started and stopped together.
```
./fusion start spark-master spark-worker sql
```
Classic and RPC connectors. RPC connectors require both the connectors-classic and connectors-rpc services to be running.
```
./fusion start connectors-classic connectors-rpc
```

Start services individually

How to start services individually

Fusion UI service: ./admin-ui start
API services: ./api start
Classic Connectors services: ./connectors-classic start
RPC Connectors services: ./connectors-rpc start
Log shipper service (Fusion 4.1+ only): ./log-shipper start
Proxy: ./proxy start
Solr: ./solr start
Spark Master: ./spark-master start
Spark Worker: ./spark-worker start
SQL service: ./sql start
Web Apps: ./webapps start
ZooKeeper: ./zookeeper start
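The one ordering rule this topic states for individual starts is that ZooKeeper comes first. A minimal sketch of that rule follows; the helper names are hypothetical, and the order of the remaining services is simply the order you pass in, which is an assumption rather than a documented requirement.

```python
def start_order(services):
    """Return the services with zookeeper moved to the front, others unchanged."""
    ordered = [s for s in services if s == "zookeeper"]
    ordered += [s for s in services if s != "zookeeper"]
    return ordered


def start_commands(services):
    """Build the per-service start commands, to be run from the bin directory."""
    return [f"./{name} start" for name in start_order(services)]


for line in start_commands(["solr", "api", "zookeeper", "proxy"]):
    print(line)
```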

For information about default ports, see Default Ports.

Run Fusion in the foreground

To run Fusion or any of its services in the foreground, use the run command-line argument in place of start.

Run Fusion in shell mode

This section applies to Fusion 4.2+ only.

To start any of Fusion’s services using Bash’s exec function, which allows the service to assume the shell process’s PID, use the run-in-shell command-line argument in place of start or run. The run-in-shell argument can only be used to start one service at a time.

Examples

./fusion run-in-shell zookeeper

./zookeeper run-in-shell

Shell mode is particularly useful in containerized environments, which generally assume that only one process runs per container and that this process is PID 1, that is, the initial process invoked within the container, not a separately spawned process.
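The effect of exec, replacing the current process while keeping its PID, can be observed with a short experiment. The sketch below is not how Fusion implements run-in-shell; it only demonstrates the same operating-system behavior on Unix using Python's os.execv, with a throwaway child program defined inline.

```python
import subprocess
import sys

# Child program: print its PID, then replace itself with a new Python
# process via exec. The replacement prints its own PID as well.
CHILD = r"""
import os, sys
print(os.getpid(), flush=True)
os.execv(sys.executable, [sys.executable, "-c", "import os; print(os.getpid())"])
"""


def pids_before_and_after_exec():
    """Run the child and return the two PIDs it reports."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        capture_output=True, text=True, check=True,
    )
    before, after = result.stdout.split()
    return int(before), int(after)


before, after = pids_before_and_after_exec()
print(before == after)
```

On Unix the two PIDs match, which is why a service started this way can act as the container's initial process.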

Stop Fusion

How to stop Fusion services

To stop Fusion or any of its services, use the commands above with the stop command-line argument in place of start, for example:

./solr stop

Using systemd to manage processes

On Red Hat Enterprise Linux, CentOS 7 and newer, and Ubuntu 15.04 and newer, we support using the operating system-provided systemd for process management.

Launching Fusion at system start

You can configure systemd to launch Fusion when your system starts.

How to launch Fusion at system start:

Change your working directory to Fusion’s systemd directory, for example:
```
cd /opt/fusion/latest.x/init/systemd
```
Edit fusion.service to provide correct values for the FUSION_HOME and JAVA_HOME environment variables.

Stop Fusion if it is already running:

/opt/fusion/latest.x/bin/fusion stop

Create the systemd management file, which launches Fusion under systemd management:
```
sudo bash install.sh
```

Starting and stopping Fusion

You can use the systemctl command to start and stop Fusion:

sudo systemctl start fusion
sudo systemctl stop fusion

Log files for Fusion services are found in directories under fusion/latest.x/var/log.

Using Ubuntu Upstart to manage processes

Under Ubuntu 12.04 LTS through Ubuntu 14.10, we support using Upstart for process management. This requires Fusion to be installed in the /opt/lucidworks/ directory.

To configure Upstart, run the following commands:

$ cd /opt/lucidworks/fusion/latest/init/upstart
$ sudo bash install.sh

If the script complains that JAVA_HOME is not set, replace sudo with sudo -E. Then you can use the service command to control the server:

$ sudo service fusion-solr start
$ sudo service fusion-api start
$ sudo service fusion-connectors start
$ sudo service fusion-ui start

Use stop and status in the same way.

Log files for Fusion services are found in directories under fusion/latest.x/var/log.

Windows

Start and stop Fusion on Windows.

Start Fusion

All Fusion start scripts must be executed by a user who has permissions to read and write to the directories where Fusion is installed. Ensure that the user owns the directory where Fusion resides (for example, C:\lucidworks).

Give the commands that follow from the directory fusion\latest.x\bin.

Start required services

How to start all required Fusion services as Java processes

fusion.cmd start

How to start all required Fusion services as Windows services

start-services.cmd

Start services individually

How to start specific services as Java processes

UI service: admin-ui.cmd start
API services: api.cmd start
Classic Connectors services: connectors-classic.cmd start
RPC Connectors services: connectors-rpc.cmd start
Log shipper service (Fusion 4.1+ only): log-shipper.cmd start
Proxy: proxy.cmd start
Solr: solr.cmd start
Spark Master: spark-master.cmd start
Spark Worker: spark-worker.cmd start
SQL service: sql.cmd start
Web Apps: webapps.cmd start
ZooKeeper: zookeeper.cmd start

For information about default ports, see Default Ports.

Run Fusion in the foreground

To run Fusion or any of its services in the foreground, use the run command-line argument in place of start, for example:

connectors.cmd run

Stop Fusion

How to stop all Fusion services

fusion.cmd stop (Stop all Fusion services, if they are running as Java processes)
stop-services.cmd (Stop all Fusion services, if they are running as Windows services)

To stop a specific Fusion service that is running as a Java process, use the command above with the stop command-line argument in place of start, for example:

connectors.cmd stop

Run Fusion with a service account

This example assumes the following:

Field	Value
Account	FUSION_SVC
Domain	qe
Installation directory	`C:\fusion\<version>`
Server	EC2AMAZ-79FD9JL

As an administrator, create the service account, and install it to the server you want to use for Fusion:

C:\Users\Administrator>New-ADServiceAccount -Name "FUSION_SVC" -RestrictToSingleComputer
C:\Users\Administrator>Add-ADComputerServiceAccount -Identity EC2AMAZ-79FD9JL -ServiceAccount "FUSION_SVC"
C:\Users\Administrator>Install-ADServiceAccount -Identity "FUSION_SVC"
C:\Users\Administrator>Test-ADServiceAccount "FUSION_SVC"
C:\Users\Administrator>Get-ADServiceAccount "FUSION_SVC"

Run install-services.cmd as a local administrator:

C:\Users\Administrator> C:\fusion\4.2.2\bin\install-services.cmd
ECHO is off.
              Thank you for choosing
 ====================================================
"  _            _     _                    _         "
" | |          (_)   | |                  | |        "
" | |_   _  ___ _  __| |_      _____  _ __| | _____  "
" | | | | |/ __| |/ _` \ \ /\ / / _ \| '__| |/ / __| "
" | | |_| | (__| | (_| |\ V  V / (_) | |  |   <\__ \ "
" |_|\__._|\___|_|\__._| \_/\_/ \___/|_|  |_|\_\___/ "
 ====================================================
You will now be prompted for the username and password of the Windows account that will launch Fusion.
IMPORTANT NOTE 1: When prompted 'Set Account rights to allow log on as a service', enter 'Y'
IMPORTANT NOTE 2: You must enter the username in domain\username format.
.... Starting winsw (https://github.com/kohsuke/winsw) service wrapper utility ...
2019-06-12 17:37:21,737 INFO  - Starting ServiceWrapper in the CLI mode
Username: EC2AMAZ-79FD9JL\Administrator
Password: ************
Set Account rights to allow log on as a service (y/n)?: n
2019-06-12 17:38:02,970 INFO  - Completed. Exit code is 0

Grant the service account full access to Fusion:

C:\Users\Administrator>$path = "C:\fusion"
$Acl = Get-Acl "$path"
$permission = "qe\FUSION_SVC$", "FullControl", "ContainerInherit,ObjectInherit", "None", "Allow"
$accessRule = New-Object System.Security.AccessControl.FileSystemAccessRule $permission
$Acl.SetAccessRule($accessRule)
Set-Acl "$path" $Acl
Get-Acl "$path" | fl

Modify the Lucidworks Fusion service to use the service account.
1. Open Administrative Tools > Services on your Windows server.
2. Select the Lucidworks Fusion service.
3. Open the Properties > Log On dialog.
4. Change the service user account to the FUSION_SVC user account.
  Only the account name is required. The password is managed by Windows.

Access Fusion after startup

After Fusion services have started, you can open the Fusion UI in a web browser at http://localhost:8764/ (replace localhost with your server name or IP address if necessary).

The first time you access Fusion, you must set the password for the user admin and agree to the Fusion Licensing Agreement (which contains terms of service). This is followed by an optional registration step. After this, Fusion displays the Fusion launcher (the page from which you can open apps).
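If you script the startup, you may want to wait until the UI address answers before continuing. The following is a minimal polling sketch using only the Python standard library. The function names, retry count, and delay are illustrative assumptions, and the check only confirms that something responds over HTTP at the address, not that every Fusion service is healthy.

```python
import time
import urllib.error
import urllib.request


def http_responds(url, timeout=5.0):
    """Return True if the URL answers with any HTTP response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True   # the server answered, even if with an error status
    except (urllib.error.URLError, OSError):
        return False  # nothing is listening yet, or the connection failed


def wait_until(probe, attempts=30, delay=2.0, sleep=time.sleep):
    """Call probe() up to `attempts` times, sleeping between failed tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

For example, wait_until(lambda: http_responds("http://localhost:8764/")) returns True as soon as the address responds, or False after the last attempt.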

Deployment goals

Demo, trial, and development deployments. The simplest possible architecture is the one you get out of the box, by unpacking the tar/zip file and running fusion/latest.x/bin/fusion start, so that all components (including the bundled Solr and ZooKeeper instances) run on a single host in their default configurations.
You can quickly install and run Fusion on a computer (even on your laptop) to explore its features and work with sample data. See Quickstart for instructions. This diagram illustrates a single-node Fusion deployment:

Quickstart

The Quickstart is a wizard that lets you explore some of Fusion’s core capabilities:

Creating or selecting an app
Selecting a collection (where data is stored)
Indexing data
Searching

LucidAcademy: Lucidworks offers free training to help you get started. The course Using the Quickstart Wizard focuses on using the wizard to practice the key functions of Fusion.

Visit the LucidAcademy to see the full training catalog.

Step through the Quickstart wizard

If it is not already open, open the Quickstart wizard. In the Fusion launcher, click New here? Get started….

Click Continue. On the Select an App screen, you can select or create an app for your quickstart data. If you click Create new app, Fusion prompts you to enter an app name (and optionally a description). Then click Create App to return to the Quickstart. Your new app is selected by default.
After selecting an app, click Continue. On the Select a Collection screen, you can select or create a collection for your quickstart data. A collection with the same name as your app is created automatically:
Click Continue.
On the Index Data screen, you can either select one of the built-in sample datasets or click Use my data to upload your own:
Click Continue.
On the Query Data screen, you can see all search results and enter your own search queries to test the indexed dataset. You can also select the display fields or view the parsed documents:
Click Continue. From here, you can do the following:
- Open the Index Workbench to change the index pipeline.
  - Fusion 4.x Index Workbench
  - Fusion 5.x Index Workbench
- Open the Query Workbench to change the query pipeline.
  - Fusion 4.x Query Workbench
  - Fusion 5.x Query Workbench
  The workbenches are essential tools in the Fusion workflow.
- Open App Studio to create a user interface for searching this collection.
  App Studio is only available in Fusion 4.1 and 4.2.

Onsite late-stage development and test deployments. Ideally, an onsite deployment for late-stage development and testing should have the same architecture as the production deployment, though it does not need to be scaled to provide the same level of service.
Production deployments. Fusion is designed for flexible, distributed deployment. Any of its components can be distributed across your network, and some can be clustered. A production deployment requires multiple Fusion nodes, each of which runs some or all Fusion services (including Solr and ZooKeeper).

Cluster Arrangements

You can deploy Fusion across multiple nodes in a Fusion cluster and use a ZooKeeper cluster as the centralized, synchronized store for both application configurations and user data. Regarding Solr, if you already have SolrCloud clusters managing your data, you can Integrate Fusion 4.x with an Existing Solr Deployment.

Integrate Fusion 4.x with an Existing Solr Deployment

If you have already implemented Solr as a standalone instance or as a SolrCloud cluster, you can add Fusion to your existing Solr deployment (if the Solr version is supported) and import your Solr collections into Fusion. Each Fusion collection can import one Solr collection.

If your existing Solr instance is running in SolrCloud mode, you can use Fusion’s UI to modify configuration files (such as schema.xml or solrconfig.xml) and create Solr collections.
If your existing Solr instance is running in standalone mode, you can still connect it to Fusion. Fusion can send documents to a standalone Solr instance and query the instance. But you will not be able to use Fusion’s UI to create Solr collections (Solr cores) or to modify Solr configuration files.

Prerequisites

Failures in the Fusion install or startup may occur if the Fusion installation directory contains a space.

You have already installed Fusion.
You have already installed Solr, which must meet these Solr requirements.
You have already installed ZooKeeper, which must meet these ZooKeeper requirements.
We recommend that you create an external ZooKeeper cluster (external to both Fusion and SolrCloud).
Your Solr deployment must contain one or more collections (cores).
In SolrCloud mode, Solr must be configured to use ZooKeeper.

Configure Fusion to use an existing Solr deployment

Use the Fusion UI to integrate Fusion with an existing Solr deployment.

Use the Fusion UI

Create a Fusion search cluster:
1. In the Fusion UI, navigate to System > Solr Clusters and click New Solr Cluster.
2. Enter this information:
- A cluster ID of your choice
- Whether SolrCloud is enabled
- The connect string (to tell Fusion how to connect to the SolrCloud cluster or Solr instance)
  - For SolrCloud, this is the ZooKeeper connect string.
  - For a standalone Solr instance, this is the URL of the Solr instance.
3. Verify that the connection is working by clicking Cores in the new cluster and inspecting the contents.
Create a Fusion collection that points to your Solr cluster and collection:
1. In the UI, navigate to Collections and click Add a Collection.
2. Enter a name for the new collection.
3. Click Advanced.
4. Select your SolrCloud cluster or Solr instance from the dropdown.
5. Enter the name of the Solr collection to import.

Sending Documents to Solr through Fusion

You can use the Fusion connectors to crawl documents and index them to your existing Solr deployment.

Follow the steps above to create and configure a search cluster and a collection that points to Solr.
Define an index pipeline that ends with a Solr Indexer stage that sends the documents to Solr.
Use one of these methods to ingest your data:
- In the collection that points to your Solr collection, define a datasource using the connector of choice.
- Send prepared documents directly to the index pipeline for processing. See Importing Data with the REST API.
- It is also possible to use a different indexing process besides a connector, such as a script that sends documents through the index pipeline.

When documents are sent to Solr, a buffering SolrServer is used. Buffering the updates reduces the number of HTTP requests made from Fusion to Solr, which can significantly affect processing time. For example, when processing simple documents, you should always try to buffer as many documents as possible to increase throughput. When processing complex documents, you should use small batch sizes. You should only turn buffering off if you are using an older version of Solr and you want Fusion to catch and document indexing errors.
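To illustrate why buffering reduces the number of HTTP requests, the sketch below groups documents into batches and counts how many update requests would be needed at different batch sizes. It is a generic illustration, not Fusion's buffering implementation; the function names and the sample documents are made up for the example.

```python
def batches(documents, batch_size):
    """Yield consecutive lists of at most batch_size documents."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch = []
    for doc in documents:
        batch.append(doc)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial batch


def count_requests(documents, batch_size):
    """Number of update requests needed if each batch is sent in one request."""
    return sum(1 for _ in batches(documents, batch_size))


docs = [{"id": str(i)} for i in range(1000)]
print(count_requests(docs, 1))    # one request per document: 1000
print(count_requests(docs, 250))  # buffered in batches of 250: 4
```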

Querying Solr via Fusion requests

Indexed documents are stored in Solr indexes. You can query for these documents by using query pipelines. Query pipelines let you define your query parameters, such as how many records to return, the fields you would like, how to structure facets, and so on. You can also add JavaScript to the response processing, and define landing pages or specific boost levels depending on the user’s query. See Query Pipelines.

If you prefer, you can also use the Solr API and SolrAdmin API to query Solr directly.

To satisfy processing requirements, install Fusion, ZooKeeper, and Solr on specific nodes. These are the possibilities:

Nodes running core Fusion services and Solr also run ZooKeeper.
In this cluster arrangement, a ZooKeeper cluster runs on the same nodes that run core Fusion services and Solr. This arrangement works well for a small cluster with low usage and can reduce cost when compared with placing ZooKeeper on separate nodes.
Nodes running ZooKeeper are not running core Fusion services or Solr.
In this cluster arrangement, the ZooKeeper cluster runs on nodes in the Fusion cluster on which core Fusion services and Solr are not running. This arrangement is good for larger or more active clusters and can help reduce write latency. This also lets Fusion nodes scale horizontally without impacting the ZooKeeper nodes.

Learn more

Install a Fusion 4.x Cluster (Unix)

This article describes how to install a Fusion cluster on multiple Unix nodes. Instructions are given for each of the cluster arrangements described in Deployment Types.

Preliminary steps

Before proceeding to one of the sections that follow, perform these steps:

How to prepare for setting up a Fusion cluster

Prepare your firewall so that the Fusion nodes can communicate with each other. The default ports list contains a list of all ports used by Fusion. From this list, it is important that the ZooKeeper ports, Apache Ignite ports, and the Spark ports (if you are using Spark) are open between the different nodes for cross-cluster communication.
Fusion for Unix is distributed as a compressed archive file (.tar.gz). Move this file to each node that will run Fusion.
To leverage the copies of Solr and/or ZooKeeper that are distributed with Fusion on nodes that will not run Fusion (as a simple means of obtaining compatible versions of the other software), also download the Fusion compressed archive file to each of those nodes. Below, you will edit configuration files so that Fusion does not run on those nodes.
On each node, change your working directory to the directory in which you placed the Fusion tar/zip file and unpack the archive, for example:
```
$ cd /opt/lucidworks
$ tar -xf fusion-version.x.tar.gz
```
Failures in the Fusion install or startup may occur if the Fusion installation directory name contains a space.
The resulting directory is named fusion/latest.x. You can rename this if you wish. This directory is considered your Fusion home directory. See Directories, files, and ports for the contents of the fusion/latest.x directory.
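Once the firewall is prepared and services are running, a simple TCP probe can help confirm that the required ports are reachable between nodes. This is a minimal sketch with hypothetical function names; it only tests whether a TCP connection can be opened and says nothing about whether the service behind the port is working correctly.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def unreachable(hosts, ports, timeout=2.0):
    """Return the (host, port) pairs that did not accept a connection."""
    return [(h, p) for h in hosts for p in ports if not port_open(h, p, timeout)]
```

For example, calling unreachable with your node addresses and the ZooKeeper ports used in this article (such as 9983, 2888, and 3888) from each node lists the pairs that still need attention. A port shows as unreachable if nothing is listening on it yet, so run the check after the corresponding service has started.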

In the sections that follow, for every step on multiple nodes, complete the step on all nodes before going to the next step. It is especially important that you do not start Fusion on any node until the instructions say to do so.

In the steps below, the port numbers reflect default port numbers and one common choice (port 2181 for nodes in an external ZooKeeper cluster). Port numbers for your nodes might differ.

Nodes running core Fusion services and Solr also run ZooKeeper

In this cluster arrangement, a ZooKeeper cluster runs on the same nodes that run core Fusion services and Solr.

Perform the steps in the section Preliminary steps, and then perform these steps:

Assign a number to each Fusion node, starting at 1. We refer to the number we assign to each node as the ZooKeeper myid.
On each Fusion node, create a fusion/latest.x/data/zookeeper directory, and a file called myid in that directory. Edit the file and save the ZooKeeper myid assigned for this node as the only contents.
On each Fusion node, open the fusion/latest.x/conf/zookeeper/zoo.cfg file in a text editor and add the following after the clientPort line (change the hostnames or IP addresses to the correct ones for your servers):
```
server.1=[Hostname or IP for ZooKeeper with myid 1]:2888:3888
server.2=[Hostname or IP for ZooKeeper with myid 2]:2888:3888
server.3=[Hostname or IP for ZooKeeper with myid 3]:2888:3888
```
For example:
```
server.1=10.10.31.130:2888:3888
server.2=10.10.31.178:2888:3888
server.3=10.10.31.166:2888:3888
```
Do not use localhost or 127.0.0.1 as the hostname/IP. Specify the hostname/IP that other nodes will use when communicating with the current node.
On each Fusion node, edit default.zk.connect in fusion/latest.x/conf/fusion.cors (fusion.properties in Fusion 4.x) to point to the ZooKeeper hosts:
```
default.zk.connect=[ZK host 1]:9983,[ZK host 2]:9983,[ZK host 3]:9983
```
On each node, start ZooKeeper with bin/zookeeper start. ZooKeeper should start without errors. If a ZooKeeper instance fails to start, check the log at fusion/latest.x/var/log/zookeeper/zookeeper.log.
On each node, start the rest of Fusion using bin/fusion start.
Create an admin password and log in to Fusion at http://FIRST_NODE_IP:8764, where FIRST_NODE_IP is the IP address of your first Fusion node.
Verify the Solr cluster is healthy by looking at http://ANY_NODE_IP:8983/solr/#/~cloud, where ANY_NODE_IP is the IP address of a Solr node. All of the nodes should appear green.
If necessary, prepare high availability by setting up a load balancer in front of Fusion so that it load balances between the Fusion UI URLs at http://NODE_IP:8764.
Consult your load balancer’s documentation for instructions.
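The ZooKeeper-related values in the steps above (the myid numbers, the server lines for zoo.cfg, and the default.zk.connect value) all derive from the same list of node addresses. The sketch below generates them from one list so they stay consistent. The function names are hypothetical, the addresses are the example addresses from this section, and the ports are the defaults shown in the steps.

```python
def assign_myids(hosts):
    """Number the nodes starting at 1, in the order given."""
    return {host: index for index, host in enumerate(hosts, start=1)}


def zoo_cfg_server_lines(hosts, quorum_port=2888, election_port=3888):
    """Build the server.N lines to add after the clientPort line in zoo.cfg."""
    for host in hosts:
        if host in ("localhost", "127.0.0.1"):
            raise ValueError("use an address that other nodes can reach")
    return [
        f"server.{myid}={host}:{quorum_port}:{election_port}"
        for host, myid in assign_myids(hosts).items()
    ]


def zk_connect_string(hosts, client_port=9983):
    """Build the value for default.zk.connect."""
    return ",".join(f"{host}:{client_port}" for host in hosts)


hosts = ["10.10.31.130", "10.10.31.178", "10.10.31.166"]
print("\n".join(zoo_cfg_server_lines(hosts)))
print("default.zk.connect=" + zk_connect_string(hosts))
```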

Nodes running ZooKeeper are not running core Fusion services or Solr

In this cluster arrangement, the ZooKeeper cluster runs on nodes in the Fusion cluster on which core Fusion services and Solr are not running.

Each node in the Fusion cluster has Fusion and Solr installed. ZooKeeper runs on Fusion cluster nodes on which neither Fusion nor Solr is running.

How to set up a Fusion cluster

Perform the steps in the section Preliminary steps, and then perform these steps:

Assign a number to each Fusion node, starting at 1. We refer to the number we assign to each node as the ZooKeeper myid.
On each Fusion node, create a fusion/latest.x/data/zookeeper directory, and a file called myid in that directory. Edit the file and save the ZooKeeper myid assigned for this node as the only contents.

On each Fusion node, open the fusion/latest.x/conf/zookeeper/zoo.cfg file in a text editor and add the following after the clientPort line (change the hostnames or IP addresses to the correct ones for your servers):

server.1=[Hostname or IP for ZooKeeper with myid 1]:2888:3888
server.2=[Hostname or IP for ZooKeeper with myid 2]:2888:3888
server.3=[Hostname or IP for ZooKeeper with myid 3]:2888:3888

For example:

server.1=10.10.31.130:2888:3888
server.2=10.10.31.178:2888:3888
server.3=10.10.31.166:2888:3888

Edit conf/fusion.cors (fusion.properties in Fusion 4.x) and remove zookeeper from the group.default list. This prevents ZooKeeper from starting when you start Fusion.
On each Fusion node, edit default.zk.connect in fusion/latest.x/conf/fusion.cors (fusion.properties in Fusion 4.x) to point to the ZooKeeper hosts:
```
default.zk.connect=[ZK host 1]:2181,[ZK host 2]:2181,[ZK host 3]:2181
```
On each node, start ZooKeeper with bin/zookeeper start. ZooKeeper should start without errors. If a ZooKeeper instance fails to start, check the log at fusion/latest.x/var/log/zookeeper/zookeeper.log.
On each node, start the rest of Fusion using bin/fusion start.
Create an admin password and log in to Fusion at http://FIRST_NODE_IP:8764, where FIRST_NODE_IP is the IP address of your first Fusion node.
Verify the Solr cluster is healthy by looking at http://ANY_NODE_IP:8983/solr/#/~cloud, where ANY_NODE_IP is the IP address of a Solr node. All of the nodes should appear green.
If necessary, prepare high availability by setting up a load balancer in front of Fusion so that it load balances between the Fusion UI URLs at http://NODE_IP:8764. Consult your load balancer’s documentation for instructions.
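Removing zookeeper from group.default is a one-line edit, but on many nodes it can be convenient to script. The following sketch rewrites the group line in a properties text and leaves every other line untouched. It assumes the simple "key = value" form shown in this documentation, and the function name is hypothetical.

```python
def remove_from_group(properties_text, service, group="default"):
    """Return the properties text with `service` removed from group.<group>."""
    key = f"group.{group}"
    out = []
    for line in properties_text.splitlines():
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            kept = [s.strip() for s in value.split(",")]
            kept = [s for s in kept if s and s != service]
            line = f"{key} = {', '.join(kept)}"
        out.append(line)
    return "\n".join(out)


before = (
    "group.default = zookeeper, solr, api, connectors-classic, "
    "connectors-rpc, proxy, webapps, admin-ui, log-shipper\n"
    "group.connectors = connectors-classic, connectors-rpc"
)
print(remove_from_group(before, "zookeeper"))
```

Keep a backup of the configuration file before writing any generated text back to it.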

Known issues

Metrics collection failure

When the Java virtual machine (JVM) is started, the /tmp/.java_pid<pid> file is created and is the socket used:

To attach a debugger
By the agent to connect to the service that collects Java Management Extension (JMX) metrics

A known issue in Java 8 is that the timestamp of the file is not updated, so the cleanup of old temporary files on standard Linux distributions deletes it. For example, the /tmp/.java_pid<pid> file is deleted after ten days on a standard Amazon Linux in EC2.

When the JVM code the agent uses cannot locate the file, it:

Sends a -QUIT message to the JVM
Triggers a thread dump to be printed to standard out

The standard out:

Is logged to the agent log
Generates the “No metrics can be gathered” exception
Prints a complete thread dump
Sends the thread dump to system logs

Choose one of the two workarounds:

Exclude the agent.log from the logstash configuration. Log shipping is then turned off for that file. The disadvantage of this option is that the metrics are missing.
Change the cron job in the Linux distribution that deletes /tmp files older than “x” days so that it does not delete the /tmp/.java_pid<pid> files. If your system is running the Linux systemd software suite on EC2, the setting is typically located in the /usr/lib/tmpfiles.d/tmp.conf file. For Dial On Demand (DOD), remove the call that configures the JMX Metrics requirement for the debugger attachment to the Java service.
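To see whether a running JVM has already lost its socket file, you can check for the /tmp/.java_pid<pid> path directly. The sketch below does this for a list of process IDs. The function names are hypothetical, and you would supply the PIDs of your own Fusion JVMs; the demonstration at the end uses a scratch directory and made-up PIDs.

```python
import os
import tempfile


def attach_socket_path(pid, tmp_dir="/tmp"):
    """Path of the socket file for a JVM process ID."""
    return os.path.join(tmp_dir, f".java_pid{pid}")


def missing_attach_sockets(pids, tmp_dir="/tmp"):
    """Return the PIDs whose .java_pid<pid> file is no longer present."""
    return [
        pid for pid in pids
        if not os.path.exists(attach_socket_path(pid, tmp_dir))
    ]


# Demonstration in a scratch directory, with made-up PIDs.
with tempfile.TemporaryDirectory() as scratch:
    open(attach_socket_path(1234, scratch), "w").close()
    print(missing_attach_sockets([1234, 5678], scratch))
```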

Install a Fusion 4.x Cluster (Windows)

This article describes how to install a Fusion cluster on multiple Windows nodes. Instructions are given for each of the cluster arrangements described in Deployment Types.

Preliminary steps

Failures in the Fusion install or startup may occur if the Fusion installation directory contains a space.

Before proceeding to one of the sections that follow, perform these steps:

How to prepare for setting up a Fusion cluster

Prepare your firewall so that the Fusion nodes can communicate with each other. The default ports list contains a list of all ports used by Fusion. From this list, it is important that the ZooKeeper ports, Apache Ignite ports, and the Spark ports (if you are using Spark) are open between the different nodes for cross-cluster communication.
Fusion for Windows is distributed as a compressed archive file (.zip). Download the Fusion zip file for the latest version of Fusion to each node that will run Fusion, and move the file to where you would like Fusion to reside in your filesystem. It will appear as a compressed folder.
To leverage the copies of Solr and/or ZooKeeper that are distributed with Fusion on nodes that will not run Fusion (as a simple means of obtaining compatible versions of the other software), also download the Fusion zip file to each of those nodes. Below, you will edit configuration files so that Fusion does not run on those nodes.
Unpack the archive. In most cases, you need only right-click and choose “Extract all…”. If you do not see this option, check that you have permissions to extract folders on your system.
The resulting directory is named fusion\latest.x. This directory is considered your Fusion home directory. You can rename it if you wish. See Directories, Files, and Ports for the contents of the Fusion home directory.

In the steps below, the port numbers reflect default port numbers and one common choice (port 2181 for nodes in an external ZooKeeper cluster). Port numbers for your nodes might differ.
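
Once services are running, one way to confirm that the firewall lets nodes reach each other is to attempt a TCP connection to each port from every other node. The following is a minimal sketch, not part of Fusion. The function names are hypothetical, and the port table only includes the default ports named in this article. Add the Apache Ignite and Spark ports from the default ports list for your version. A port reports as closed both when the firewall blocks it and when nothing is listening on it yet.

```python
import socket

# Default ports named in this article. This table is an assumption about
# your deployment; adjust it to match your own port numbers.
DEFAULT_PORTS = {
    "ZooKeeper client (bundled)": 9983,
    "ZooKeeper peer": 2888,
    "ZooKeeper leader election": 3888,
    "Solr": 8983,
    "Fusion UI": 8764,
}


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_node(host, ports=DEFAULT_PORTS, timeout=2.0):
    """Map each service label to True/False reachability for one node."""
    return {label: port_is_open(host, port, timeout) for label, port in ports.items()}


# Example usage, run from one cluster node against another:
#   for label, ok in check_node("10.10.31.178").items():
#       print(label, "open" if ok else "closed or filtered")
```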

Nodes running core Fusion services and Solr also run ZooKeeper

In this cluster arrangement, a ZooKeeper cluster runs on the same nodes that run core Fusion services and Solr.

How to set up a Fusion cluster

Perform the steps in the section Preliminary steps, and then perform these steps:

Assign a number to each Fusion node, starting at 1. We refer to the number we assign to each node as the ZooKeeper myid.
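On each Fusion node, create a fusion\latest.x\data\zookeeper directory, and a file called myid in that directory. Edit the file and save the ZooKeeper myid assigned for this node as the only contents.
On each Fusion node, list every member of the ZooKeeper ensemble in the ZooKeeper configuration file (zoo.cfg), one server entry per myid: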

server.1=[Hostname or IP for ZooKeeper with myid 1]:2888:3888
server.2=[Hostname or IP for ZooKeeper with myid 2]:2888:3888
server.3=[Hostname or IP for ZooKeeper with myid 3]:2888:3888

For example:

server.1=10.10.31.130:2888:3888
server.2=10.10.31.178:2888:3888
server.3=10.10.31.166:2888:3888

Do not use localhost or 127.0.0.1 as the hostname/IP. Specify the hostname/IP that other nodes will use when communicating with the current node.

On each Fusion node, edit default.zk.connect in fusion\latest.x\conf\fusion.cors (fusion.properties in Fusion 4.x) to point to the ZooKeeper hosts:

default.zk.connect=[ZK host 1]:9983,[ZK host 2]:9983,[ZK host 3]:9983

On each node, start ZooKeeper with bin\zookeeper start. ZooKeeper should start without errors. If a ZooKeeper instance fails to start, check the log at fusion\latest.x\var\log\zookeeper\zookeeper.log.
On each node, start the rest of Fusion using bin\fusion start.
Create an admin password and log in to Fusion at http://FIRST_NODE_IP:8764, where FIRST_NODE_IP is the IP address of your first Fusion node.
Verify the Solr cluster is healthy by looking at http://ANY_NODE_IP:8983/solr/#/~cloud, where ANY_NODE_IP is the IP address of a Solr node. All of the nodes should appear green.
If necessary, prepare high availability by setting up a load balancer in front of Fusion so that it load balances between the Fusion UI URLs at http://NODE_IP:8764. Consult your load balancer’s documentation for instructions.
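
The cloud view in the Solr UI can also be approximated from a script by calling the CLUSTERSTATUS action of the Solr Collections API and looking for replicas that are not active. The following is a sketch under assumptions: the function names are hypothetical, the Solr node is reachable without authentication, and "all nodes green" is treated as roughly equivalent to "every replica is active on a live node".

```python
import json
from urllib.request import urlopen


def fetch_cluster_status(base_url, timeout=10):
    """Fetch CLUSTERSTATUS from a Solr node, e.g. base_url='http://ANY_NODE_IP:8983'."""
    url = base_url.rstrip("/") + "/solr/admin/collections?action=CLUSTERSTATUS&wt=json"
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def unhealthy_replicas(status):
    """Return (collection, shard, replica, state) for replicas that are not
    active, or whose node is missing from live_nodes."""
    cluster = status.get("cluster", {})
    live = set(cluster.get("live_nodes", []))
    problems = []
    for coll, cdata in cluster.get("collections", {}).items():
        for shard, sdata in cdata.get("shards", {}).items():
            for replica, rdata in sdata.get("replicas", {}).items():
                state = rdata.get("state")
                if state != "active" or rdata.get("node_name") not in live:
                    problems.append((coll, shard, replica, state))
    return problems


# Example usage:
#   status = fetch_cluster_status("http://ANY_NODE_IP:8983")
#   for problem in unhealthy_replicas(status):
#       print("needs attention:", problem)
```

An empty result from unhealthy_replicas suggests the cluster is healthy. It is not a substitute for checking the UI if anything looks wrong.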

Nodes running ZooKeeper are not running core Fusion services or Solr

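In this cluster arrangement, the ZooKeeper cluster runs on nodes that do not run core Fusion services or Solr.

How to set up a Fusion cluster

Perform the steps in the section Preliminary steps, and then perform these steps: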

Assign a number to each ZooKeeper node, starting at 1. We refer to the number we assign to each node as the ZooKeeper myid.
On each ZooKeeper node, create a fusion\latest.x\data\zookeeper directory, and a file called myid in that directory. Edit the file and save the ZooKeeper myid assigned for this node as the only contents.
On each ZooKeeper node, list every member of the ZooKeeper ensemble in the ZooKeeper configuration file (zoo.cfg), one server entry per myid:

server.1=[Hostname or IP for ZooKeeper with myid 1]:2888:3888
server.2=[Hostname or IP for ZooKeeper with myid 2]:2888:3888
server.3=[Hostname or IP for ZooKeeper with myid 3]:2888:3888

For example:

server.1=10.10.31.130:2888:3888
server.2=10.10.31.178:2888:3888
server.3=10.10.31.166:2888:3888

On each Fusion node, edit conf\fusion.cors (fusion.properties in Fusion 4.x) and remove zookeeper from the group.default list, so that ZooKeeper does not start when you start Fusion.
On each Fusion node, edit default.zk.connect in fusion\latest.x\conf\fusion.cors (fusion.properties in Fusion 4.x) to point to the ZooKeeper hosts:
```
default.zk.connect=[ZK host 1]:2181,[ZK host 2]:2181,[ZK host 3]:2181
```
On each ZooKeeper node, start ZooKeeper with bin\zookeeper start. ZooKeeper should start without errors. If a ZooKeeper instance fails to start, check the log at fusion\latest.x\var\log\zookeeper\zookeeper.log.
On each Fusion node, start Fusion using bin\fusion start.
Create an admin password and log in to Fusion at http://FIRST_NODE_IP:8764, where FIRST_NODE_IP is the IP address of your first Fusion node.
Verify the Solr cluster is healthy by looking at http://ANY_NODE_IP:8983/solr/#/~cloud, where ANY_NODE_IP is the IP address of a Solr node. All of the nodes should appear green.
If necessary, prepare high availability by setting up a load balancer in front of Fusion so that it load balances between the Fusion UI URLs at http://NODE_IP:8764. Consult your load balancer’s documentation for instructions.
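
In both arrangements, the ZooKeeper settings follow mechanically from one ordered list of ZooKeeper hosts, so generating them can help keep every node consistent. The sketch below uses hypothetical helper names and only produces text. It does not edit any Fusion file. The group.default value in the usage comment is an illustrative example, not necessarily the default list in your installation.

```python
def zoo_server_lines(hosts, peer_port=2888, election_port=3888):
    """Return one 'server.N=host:peer:election' line per host.

    N is the host's 1-based position in the list, which is also its myid.
    """
    for host in hosts:
        # Other nodes must be able to reach this address, so loopback is rejected.
        if host in ("localhost", "127.0.0.1"):
            raise ValueError("use an address other nodes can reach, not " + host)
    return [
        f"server.{myid}={host}:{peer_port}:{election_port}"
        for myid, host in enumerate(hosts, start=1)
    ]


def zk_connect_string(hosts, client_port):
    """Return the value for default.zk.connect, e.g. 'h1:2181,h2:2181'."""
    return ",".join(f"{host}:{client_port}" for host in hosts)


def remove_service(group_line, service):
    """Drop one service from a 'group.default = a, b, c' property line."""
    key, _, value = group_line.partition("=")
    kept = [s.strip() for s in value.split(",") if s.strip() and s.strip() != service]
    return f"{key.strip()} = {', '.join(kept)}"


# Example usage with the sample addresses from this article:
#   hosts = ["10.10.31.130", "10.10.31.178", "10.10.31.166"]
#   print("\n".join(zoo_server_lines(hosts)))
#   print("default.zk.connect=" + zk_connect_string(hosts, 2181))
#   print(remove_service("group.default = zookeeper, solr, api, connectors, ui", "zookeeper"))
```

Use port 9983 in zk_connect_string when ZooKeeper runs on the Fusion nodes, and the client port of your external ensemble (2181 in the example above) otherwise.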

Secure Communication with a SolrCloud Cluster

You can configure Fusion and an external SolrCloud cluster so that communication between Fusion and the SolrCloud cluster is secured. Use either Kerberos or basic authentication. You can secure communication for both the default search cluster and for other SolrCloud clusters.

Note: Securing communication between Fusion and a bundled default search cluster is not supported.

The required steps differ. These are the high-level steps. Detailed steps follow.

Default search cluster. Define configuration parameters for bootstrapping Fusion, and then bootstrap Fusion.
Other SolrCloud clusters. In the Fusion UI, add the external SolrCloud cluster.

Default Search Cluster

If your default search cluster is in an external SolrCloud cluster, then you can secure the cluster with Kerberos or basic authentication, and then configure Fusion to communicate securely with the cluster.

Prerequisite

Secure the default search cluster. Use either Solr’s Basic Authentication Plugin or Kerberos Authentication Plugin.

Don’t start Fusion yet. Below, you will define bootstrap properties, and then bootstrap Fusion.

Configure and Bootstrap Fusion

Create a .properties file for the initial bootstrap of Fusion. Place the file outside of the Fusion installation, for example, in /tmp. You will delete the file at the end of this procedure:
```
touch /tmp/fusion-bootstrap.properties
```
Edit the fusion-bootstrap.properties file to define Fusion initial-bootstrap configuration properties. The values shown below are examples. Replace them with your own values.

- Kerberos authentication: Specify the authentication type (kerberos), the Kerberos principal, and the Kerberos keytab file. Consult with your Kerberos administrator about the correct configuration properties.
```
default-search-cluster.auth-type=kerberos
default-search-cluster.auth-principal=fusion@MYORG.ORG
default-search-cluster.auth-keytab=/path-to-file/keytab.kt
```
- Basic authentication: Specify the authentication type (basic), the username of the Solr user to use for authentication, and the password of that user:
```
default-search-cluster.auth-type=basic
default-search-cluster.auth-user=admin
default-search-cluster.auth-password=admin-password
```
The Solr user must be the admin user or a different user with full administrative privileges.

Fusion doesn’t support Solr authorization plugins.
Edit the fusion.cors file:
1. Uncomment and change the value of this property to point to an external ZooKeeper:
  # default.zk.connect = localhost:9983
2. Uncomment and change the value of this property to use an external SolrCloud cluster:
  # default.solrZk.connect = localhost:2181/solr-zk-namespace
3. Remove zookeeper and solr from the group.default property:
  group.default = api, connectors, ui
4. Add a configuration property for the path to the initial-bootstrap properties file:
  initial-bootstrap-properties-path = /tmp/fusion-bootstrap.properties
Change your working directory to the directory that contains the Fusion binaries, for example:
```
$ cd /opt/fusion/latest.x/bin
```
Bootstrap Fusion:
```
./fusion start
```
After Fusion starts:
1. Delete the initial-bootstrap properties file:
  $ rm /tmp/fusion-bootstrap.properties
2. Edit the fusion.cors (fusion.properties in Fusion 4.x) file to remove the entry for the initial-bootstrap properties file:
  initial-bootstrap-properties-path = /tmp/fusion-bootstrap.properties
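
Because the bootstrap file holds a password or a keytab path, it may be worth creating it with owner-only permissions and scripting its removal. The following is a minimal sketch for the basic-authentication case on Unix. The helper names are hypothetical, and values are written verbatim, so a password that contains characters with special meaning in .properties files (such as a backslash) would need escaping that this sketch does not do.

```python
import os


def basic_auth_properties(user, password):
    """Build the basic-authentication bootstrap properties from this article."""
    return {
        "default-search-cluster.auth-type": "basic",
        "default-search-cluster.auth-user": user,
        "default-search-cluster.auth-password": password,
    }


def write_bootstrap_properties(path, props):
    """Write key=value lines to path, creating it with mode 0600 (owner only)."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        for key, value in props.items():
            f.write(f"{key}={value}\n")


def remove_bootstrap_properties(path):
    """Delete the bootstrap file after Fusion has started; ignore if already gone."""
    try:
        os.remove(path)
    except FileNotFoundError:
        pass


# Example usage:
#   write_bootstrap_properties("/tmp/fusion-bootstrap.properties",
#                              basic_auth_properties("admin", "admin-password"))
#   ... bootstrap Fusion, then:
#   remove_bootstrap_properties("/tmp/fusion-bootstrap.properties")
```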

Other SolrCloud Cluster

You can secure an external SolrCloud cluster with Kerberos or basic authentication, and then configure Fusion to communicate securely with the cluster.

Prerequisite

Secure the external SolrCloud cluster. Use either Solr’s Basic Authentication Plugin or Kerberos Authentication Plugin.
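
Before adding a basic-auth cluster in the Fusion UI, it can be useful to confirm that the Solr credentials work at all. The sketch below builds an HTTP request with a Basic Authorization header using only the Python standard library. The function name is hypothetical, and the system-info URL in the usage comment is just one Solr endpoint that can serve as a probe.

```python
import base64
from urllib.request import Request


def basic_auth_request(url, user, password):
    """Return a urllib Request for url carrying a Basic Authorization header."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    req = Request(url)
    req.add_header("Authorization", f"Basic {token}")
    return req


# Example usage (urlopen raises urllib.error.HTTPError with code 401
# if the credentials are rejected):
#   from urllib.request import urlopen
#   req = basic_auth_request("http://SOLR_HOST:8983/solr/admin/info/system",
#                            "admin", "admin-password")
#   with urlopen(req, timeout=10) as resp:
#       print(resp.status)
```

Basic authentication sends credentials in an easily decoded form, so prefer an HTTPS URL when the cluster supports it.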

Add the secure SolrCloud cluster in the Fusion UI (Basic Auth)

Log in to the Fusion UI as the user admin.
Click System > Home > System > Solr Clusters > New Solr Cluster.
Click Advanced.
Specify the required values ID and Connect String. Under Solr Cluster Authentication, check include. Choose Authentication Type basic, and specify a username and password for authentication.
Click Save new.

Add the secure SolrCloud cluster in the Fusion UI (Kerberos)

Log in to the Fusion UI as the user admin.
Click System > Home > System > Solr Clusters > New Solr Cluster.
Click Advanced.
Specify the required values ID and Connect String. Under Solr Cluster Authentication, check include. Choose Authentication Type kerberos, and specify a Kerberos keytab file and Kerberos principal for authentication.
Click Save new.
