# Machine Learning

## Machine learning with Spark

[Apache Spark](http://spark.apache.org/) is an open source cluster-computing framework. It serves as a fast, general-purpose execution engine for large-scale data processing: each job is decomposed into stepwise tasks, which are distributed across a cluster of networked computers.

Spark improves on previous MapReduce implementations by using resilient distributed datasets (RDDs), a distributed memory abstraction that lets programmers perform in-memory computations on large clusters in a fault-tolerant manner.
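To make the RDD idea concrete, here is a minimal plain-Python sketch of a partitioned dataset that keeps computed partitions in memory and rebuilds a lost partition from its recorded lineage. It is a toy model, not Spark's API: `ToyRDD`, `from_list`, `lose_partition`, and the round-robin partitioning are hypothetical names and choices made only for illustration.

```python
from typing import Callable, Dict, List


class ToyRDD:
    """Toy model of an RDD (illustrative only, not Spark's implementation)."""

    def __init__(self, num_partitions: int, compute: Callable[[int], List[int]]):
        self.num_partitions = num_partitions
        self._compute = compute                   # lineage: how to derive partition i
        self._memory: Dict[int, List[int]] = {}   # partitions currently held in memory

    @classmethod
    def from_list(cls, data: List[int], num_partitions: int) -> "ToyRDD":
        # Assumption: simple round-robin split of the input across partitions.
        chunks = [data[i::num_partitions] for i in range(num_partitions)]
        return cls(num_partitions, lambda i: list(chunks[i]))

    def map(self, fn: Callable[[int], int]) -> "ToyRDD":
        # Only record the transformation; nothing is computed until it is needed.
        return ToyRDD(self.num_partitions,
                      lambda i: [fn(x) for x in self.partition(i)])

    def partition(self, i: int) -> List[int]:
        # Serve from memory if present; otherwise recompute from lineage.
        if i not in self._memory:
            self._memory[i] = self._compute(i)
        return self._memory[i]

    def lose_partition(self, i: int) -> None:
        # Simulate a node failure that drops one in-memory partition.
        self._memory.pop(i, None)

    def collect(self) -> List[int]:
        return [x for i in range(self.num_partitions) for x in self.partition(i)]


numbers = ToyRDD.from_list(list(range(6)), num_partitions=2)
squares = numbers.map(lambda x: x * x)

before = squares.collect()   # computes both partitions and keeps them in memory
squares.lose_partition(1)    # simulated failure
after = squares.collect()    # partition 1 is rebuilt from lineage
```

In this sketch the second `collect()` returns the same values as the first because the lost partition is recomputed from its parent rather than restored from a replica, which is the lineage-based recovery that the RDD abstraction describes.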

Fusion manages a Spark cluster that is used for all [signal aggregation](/docs/5/fusion/reference/config-ref/jobs/aggregations/overview) processes.

With a Fusion license, you can also use the Spark cluster to [train and compile machine learning models](/docs/5/fusion/intro/machine-learning/machine-learning-models) and to run experiments through the [Fusion UI](/docs/5/fusion/getting-data-out/data-analytics/experiments/overview) or the [Spark Jobs API](/api-reference/spark-job-controller-api/list-all-job-runs).

See [Machine Learning Jobs](/docs/5/fusion/intro/machine-learning/ml-jobs) for details about each pre-defined machine learning job in Fusion.

## More information

* [Apache Spark Key Terms, Explained](https://databricks.com/blog/2016/06/22/apache-spark-key-terms-explained.html)
* [Apache Spark on Wikipedia](https://en.wikipedia.org/wiki/Apache_Spark)
* [Machine Learning Models in Fusion](/docs/5/fusion/intro/machine-learning/machine-learning-models)
* [Machine Learning Jobs](/docs/5/fusion/intro/machine-learning/ml-jobs)
* [Spark Administration in Kubernetes](/docs/5/fusion/operations/survival-guide/spark-kubernetes-overview)

<Card title="Intro to Machine Learning in Fusion" class="note-image" href="https://academy.lucidworks.com/intro-to-machine-learning-in-fusion" cta="Take this course on the LucidAcademy." icon="graduation-cap" iconType="duotone">
  The **Intro to Machine Learning in Fusion** course focuses on using machine learning to infer the goals of customers and users in order to deliver a more sophisticated search experience.
</Card>
