Apache Spark is an open-source cluster-computing framework that serves as a fast, general-purpose execution engine for large-scale data processing. It runs jobs that can be decomposed into stepwise tasks, which it distributes across a cluster of networked computers.
Spark improves on previous MapReduce implementations by using resilient distributed datasets (RDDs), a distributed memory abstraction that lets programmers perform in-memory computations on large clusters in a fault-tolerant manner.
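To make the RDD idea concrete, here is a minimal sketch in plain Python of a partitioned dataset that records its transformations lazily and rebuilds a lost in-memory partition by replaying that record (its lineage). The `ToyRDD` class and its `lose_partition` method are hypothetical names invented for this illustration. They are not part of Spark's API, and the real RDD implementation is considerably more involved.

```python
class ToyRDD:
    """Toy stand-in for an RDD: partitions plus the lineage needed to rebuild them.

    Hypothetical class for illustration only; it is not Spark's API.
    """

    def __init__(self, source_partitions, lineage=()):
        self._source = [list(p) for p in source_partitions]  # durable input data
        self._lineage = tuple(lineage)  # ordered per-partition transformations
        self._cache = {}  # partition index -> computed records held in memory

    def map(self, fn):
        # Transformations are lazy: record the step, compute nothing yet.
        step = lambda part: [fn(x) for x in part]
        return ToyRDD(self._source, self._lineage + (step,))

    def filter(self, pred):
        step = lambda part: [x for x in part if pred(x)]
        return ToyRDD(self._source, self._lineage + (step,))

    def _compute(self, index):
        # Return the cached partition, or rebuild it by replaying the lineage.
        if index not in self._cache:
            part = self._source[index]
            for step in self._lineage:
                part = step(part)
            self._cache[index] = part
        return self._cache[index]

    def lose_partition(self, index):
        # Simulate a worker failure that drops one in-memory partition.
        self._cache.pop(index, None)

    def collect(self):
        # An action: computes every partition and gathers the records.
        return [x for i in range(len(self._source)) for x in self._compute(i)]


numbers = ToyRDD([[1, 2, 3], [4, 5, 6]])
even_squares = numbers.map(lambda x: x * x).filter(lambda x: x % 2 == 0)

before = even_squares.collect()   # computes and caches both partitions
even_squares.lose_partition(1)    # "fail" the node holding partition 1
after = even_squares.collect()    # partition 1 is recomputed from its lineage
```

In this sketch only the lost partition is recomputed. The other partition is served from the in-memory cache, which is the property that lets a lineage-based design recover from a failure without replicating every intermediate result.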
Fusion manages a Spark cluster that is used for all signal aggregation processes.
See Machine Learning Jobs for details about each pre-defined machine learning job in Fusion.
To schedule and run jobs on the nodes in the cluster, Spark uses Akka, a toolkit and runtime for building highly concurrent, distributed, and resilient message-driven applications on the JVM.
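The message-driven style can be illustrated with a small sketch in plain Python, in which a driver sends task messages to worker mailboxes and reads result messages back. This is only an analogy built on threads and queues. Akka is a JVM toolkit with its own API, and the names `WorkerActor` and `run_job` are hypothetical and do not reflect how Spark or Akka are implemented.

```python
import queue
import threading


class WorkerActor(threading.Thread):
    """Hypothetical actor: owns a mailbox and handles one message at a time."""

    def __init__(self, name, results):
        super().__init__(name=name)
        self.mailbox = queue.Queue()  # incoming task messages
        self._results = results       # thread-safe queue for replies to the driver

    def run(self):
        while True:
            message = self.mailbox.get()
            if message is None:       # sentinel message: shut down
                break
            task_id, payload = message
            # "Process" the task and reply with a result message.
            self._results.put((task_id, sum(payload)))


def run_job(partitions, num_workers=2):
    """Driver: split a job into tasks and dispatch them round-robin as messages."""
    results = queue.Queue()
    workers = [WorkerActor(f"worker-{i}", results) for i in range(num_workers)]
    for worker in workers:
        worker.start()
    for task_id, part in enumerate(partitions):
        workers[task_id % num_workers].mailbox.put((task_id, part))
    for worker in workers:
        worker.mailbox.put(None)      # ask each worker to stop after its tasks
    for worker in workers:
        worker.join()
    collected = {}
    while not results.empty():        # safe here: all workers have finished
        task_id, value = results.get()
        collected[task_id] = value
    return [collected[i] for i in range(len(partitions))]


partial_sums = run_job([[1, 2], [3, 4], [5, 6]])
total = sum(partial_sums)
```

Because the workers share no state and communicate only through messages, the driver can add workers or reassign tasks without changing the task-handling code.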