The Installation Guide will show you how to install Solr 5.5.1, content connectors, and Banana.

System Requirements

In general, the requirements to run HDP 2.5.x are sufficient to run Solr and all of the Hadoop ecosystem integrations, with a few exceptions. Before installation, please be sure to review the following requirements and supported versions.

Operating Systems

HDP Search is supported on the following operating systems:

  • 64-bit CentOS 6 & 7

  • 64-bit Red Hat Enterprise Linux (RHEL) 6 & 7

  • 64-bit Oracle Linux 6 & 7

  • 64-bit SUSE Linux Enterprise Server (SLES) 11, SP3/SP4

  • 64-bit Debian 7

  • 64-bit Ubuntu 12 & 14

Java

HDP Search requires Oracle or OpenJDK Java 1.7 or higher.
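One way to check this requirement on a node is to parse the text printed by `java -version`. The sketch below is illustrative only: the function names are hypothetical, and it assumes the usual `version "x.y..."` line that Oracle and OpenJDK builds print.

```python
import re


def parse_java_version(version_output):
    """Extract (major, minor) from text such as: java version "1.7.0_80"."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("no version string found in the given text")
    major = int(match.group(1))
    minor = int(match.group(2) or 0)  # newer JDKs may print only "17"
    return major, minor


def meets_java_requirement(version_output, minimum=(1, 7)):
    """Return True when the reported version is at least `minimum`."""
    return parse_java_version(version_output) >= minimum
```

To use it against a real installation, capture the output of `java -version` (the JVM writes it to standard error) and pass that text to `meets_java_requirement`.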

HDP 2.5.x

This documentation assumes you have already installed HDP 2.5.x, or are installing it at the same time.

Solr & Hadoop JARs

Solr

As mentioned above, Solr 5.5.1 requires Java 1.7 or higher.

Additionally, you will need enough RAM and disk space to store the indexes. Memory and disk requirements will vary depending on your use case, but should be considered carefully prior to production deployment.

By default, Solr uses port 8983 on each node where it is installed. The port can be changed if necessary.
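Before installing, it may help to confirm that nothing else is already listening on the port Solr will use. The following is a minimal sketch, not part of HDP Search: `port_in_use` is a hypothetical helper, and checking `127.0.0.1` assumes you run it on the node where Solr will be installed.

```python
import socket


def port_in_use(port=8983, host="127.0.0.1", timeout=1.0):
    """Return True if a process accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise
        return sock.connect_ex((host, port)) == 0
```

If the function returns True for 8983 before Solr is installed, another service holds the port and a different port should be chosen for Solr.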

Apache ZooKeeper

HDP 2.5.x and Solr both use Apache ZooKeeper to manage services for the cluster. The ZooKeeper ensemble that you are using for HDP 2.5.x can also be used by Solr. Installation of Solr with Ambari will automatically connect Solr to the running ZooKeeper ensemble.

When using a shared ZooKeeper ensemble, it’s recommended to isolate Solr’s znodes from those of other services. When starting Solr, we’ll pass a chroot to ZooKeeper so that Solr’s configuration files are stored under their own path. No additional setup or modification of your ZooKeeper ensemble should be required.
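A ZooKeeper connection string with a chroot is a comma-separated list of host:port pairs followed by a single path suffix. The sketch below shows how such a string could be assembled. The function name, the host names, and the `/solr` chroot are assumptions made for illustration; use the values that apply to your own ensemble.

```python
def zk_connect_string(hosts, chroot="/solr", default_port=2181):
    """Join ZooKeeper hosts and append one chroot path at the end."""
    if not hosts:
        raise ValueError("at least one ZooKeeper host is required")
    # Add the default client port to any host given without one
    endpoints = [h if ":" in h else "{}:{}".format(h, default_port) for h in hosts]
    path = "/" + chroot.strip("/")
    if path == "/":
        raise ValueError("chroot must name a znode, for example /solr")
    return ",".join(endpoints) + path


# Hypothetical host names, shown only to illustrate the format
print(zk_connect_string(["zk1.example.com", "zk2.example.com", "zk3.example.com"]))
```

Note that the chroot appears once, after the last host, rather than after each host.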

Hadoop JARs

The JARs for indexing content from HDFS, Pig, Hive, Storm and Spark do not have any additional system requirements.

In some cases, the JARs will need to be moved to HDFS or loaded to the classpath in order to be used. See the documentation for each JAR for detailed information.

HDP Search Package Contents

The HDP Search package includes:

  • Solr 5.5.1

  • Banana 1.6.0

  • JARs for integration with Hadoop, Hive, HBase and Pig

  • Software development kits for Storm and Spark

Solr should be installed on each node that runs HDFS. In many cases the connector JARs are loaded to the classpath of their respective applications and usually do not need to be copied to multiple nodes of the cluster.

HDP Search Directory Layout

After installation, HDP Search files and directories are located in /opt/lucidworks-hdpsearch. You will see several directories and files:

  • docs: HDP Search documentation as HTML files.

  • hbase-indexer: This directory includes the hbase-indexer, which provides near real-time indexing of content from HBase to Solr.

  • hive: This directory includes the Hive SerDe, to support integration of Hive and Solr.

  • job: This directory includes a Hadoop job JAR that contains code to index content from a Hadoop filesystem to Solr.

  • pig: This directory includes Pig functions for using Pig scripts to index content to Solr.

  • solr: This directory includes the full distribution of Solr 5.5.1. Banana is pre-installed with Solr.

  • spark: This directory includes a clone of the https://github.com/Lucidworks/spark-solr repository on GitHub, for building a Spark application.

  • storm: This directory includes a clone of the https://github.com/Lucidworks/storm-solr repository on GitHub, for building a Storm bolt and topology.
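The layout listed above can be verified after installation with a short script. This is a sketch rather than a supported tool: `missing_directories` is a hypothetical helper, and it checks only that the listed directory names exist, not what they contain.

```python
import os

# Directory names taken from the layout described above
EXPECTED_DIRS = ("docs", "hbase-indexer", "hive", "job", "pig", "solr", "spark", "storm")


def missing_directories(root="/opt/lucidworks-hdpsearch"):
    """Return the expected directory names that are absent under `root`."""
    return [
        name for name in EXPECTED_DIRS
        if not os.path.isdir(os.path.join(root, name))
    ]
```

An empty list means every expected directory was found under the installation root.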

Tested & Supported Versions

The following versions were used during testing of HDP Search.

Application                  Version
Hortonworks Data Platform    2.5.0
Apache Hadoop                2.7.1
Apache HBase                 1.1.2
Apache Hive                  1.2.1
Apache Pig                   0.15.0
Apache Solr                  5.5.1
Apache Spark                 1.6.1
Apache Storm                 0.9.4

Installation

HDP Search can be installed with Ambari or manually.

Ambari Installation

Ambari provides a wizard-style installation process that lets you install HDP Search on one or more nodes of your cluster. For more details, see the Ambari Installation Guide.

Manual Installation

If you do not use Ambari, you can install HDP Search manually, but you will need to do so on each node of your cluster. For detailed steps, see the Manual Installation Guide.

Running Banana

Banana is a tool for visualizing data stored in Solr through saved dashboards. Banana is included with the HDP Search package and runs inside the Solr JVM.

Installing Banana

Banana runs inside Solr’s JVM and uses the same Jetty process, so it is accessed at the same port used by Solr. This also means that if Solr is offline, Banana will also be offline.

During installation with the HDP Search RPM package, Banana is automatically installed to the /opt/lucidworks-hdpsearch/solr/server/solr-webapp/webapp/banana directory.

Accessing Banana Dashboards

Banana runs via Solr’s webapp, so you access the Banana dashboards at the same host and port as the Solr Admin UI (the default is localhost:8983). In a default Solr installation (that is, with the default port), the URL to access Banana is http://localhost:8983/solr/banana/index.html#/dashboard.
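If Solr runs on a different host or port, the Banana URL changes accordingly. The sketch below derives it from the same host and port used for Solr. `banana_url` is a hypothetical helper, and the path is the one given above for a default installation.

```python
def banana_url(host="localhost", port=8983):
    """Build the Banana dashboard URL for a Solr node."""
    # Banana is served by Solr's own webapp, so it shares Solr's host and port
    return "http://{}:{}/solr/banana/index.html#/dashboard".format(host, port)


print(banana_url())
# Hypothetical host and port, for a node where Solr was moved off the default
print(banana_url("solr1.example.com", 8984))
```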