Machine Learning

Machine learning with Spark

Apache Spark is an open source cluster-computing framework. It serves as a fast, general-purpose execution engine for large-scale data processing jobs that can be decomposed into stepwise tasks, which are then distributed across a cluster of networked computers.
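As a rough illustration of that decomposition, the following plain-Python sketch (not Spark code) splits a word-count job into one task per partition, runs the tasks on a thread pool that stands in for cluster nodes, and merges the partial results. The names `partition`, `count_words`, and `run_job`, along with the sample signal strings, are hypothetical and exist only for this example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(records, num_partitions):
    """Split the input into roughly equal slices, one per task."""
    return [records[i::num_partitions] for i in range(num_partitions)]


def count_words(lines):
    """Task run independently on one partition: count words locally."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def run_job(records, num_partitions=3):
    """Run one task per partition, then merge the partial counts."""
    parts = partition(records, num_partitions)
    # Assumption: the thread pool is only a stand-in for worker nodes.
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        partials = list(pool.map(count_words, parts))
    total = Counter()
    for partial in partials:
        total.update(partial)  # adds counts from each partial result
    return total


if __name__ == "__main__":
    signals = ["click laptop", "click phone", "cart laptop", "click laptop"]
    print(run_job(signals))
```

Each task touches only its own slice of the data, which is what allows a real cluster to schedule the tasks on different machines.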

Spark improves on previous MapReduce implementations by using resilient distributed datasets (RDDs), a distributed memory abstraction that lets programmers perform in-memory computations on large clusters in a fault-tolerant manner.
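The idea behind RDD fault tolerance can be sketched with a toy model: a dataset remembers the sequence of transformations that produced it (its lineage), so a partition lost from memory can be recomputed from the source data instead of being restored from a replica. The `ToyRDD` class below is a hypothetical plain-Python illustration, not the Spark API. It also simplifies heavily: it caches every computed partition, whereas Spark keeps data in memory only when asked to.

```python
class ToyRDD:
    """Toy stand-in for an RDD: source partitions plus the lineage to rebuild them."""

    def __init__(self, source_partitions, transforms=()):
        self.source_partitions = source_partitions  # durable input data
        self.transforms = tuple(transforms)         # lineage: ordered functions
        self.cache = {}                             # in-memory results by partition index

    def map(self, fn):
        """Return a new dataset with one more lineage step; nothing runs yet."""
        return ToyRDD(self.source_partitions, self.transforms + (fn,))

    def compute(self, index):
        """Return one partition, from memory if cached, else rebuilt from lineage."""
        if index not in self.cache:
            data = self.source_partitions[index]
            for fn in self.transforms:
                data = [fn(x) for x in data]
            self.cache[index] = data
        return self.cache[index]

    def collect(self):
        """Gather all partitions into a single list."""
        return [
            x
            for i in range(len(self.source_partitions))
            for x in self.compute(i)
        ]


if __name__ == "__main__":
    rdd = ToyRDD([[1, 2], [3, 4]]).map(lambda x: x * 10).map(lambda x: x + 1)
    first = rdd.collect()
    del rdd.cache[1]               # simulate losing one in-memory partition
    assert rdd.collect() == first  # the lost partition is rebuilt from lineage
    print(first)
```

Only the missing partition is recomputed. Partitions still held in memory are reused as they are.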

Fusion manages a Spark cluster that is used for all signal aggregation processes.

With a Fusion license, you can also use the Spark cluster to train and compile machine learning models, as well as to run experiments via the Fusion UI or the Spark Jobs API.

See Machine Learning Jobs for details about each pre-defined machine learning job in Fusion.

More information

Lucidworks offers free training to help you get started.

The Intro to Machine Learning in Fusion course focuses on using machine learning to infer the goals of customers and users in order to deliver a more sophisticated search experience.

Visit the LucidAcademy to see the full training catalog.