Machine Learning Jobs

Fusion AI provides these job types to perform machine learning tasks.

Signals analysis

These jobs analyze a collection of signals in order to perform query rewriting, signals aggregation, or experiment analysis.

Ground Truth
Estimate ground truth queries using click signals and query signals, with document relevance per query determined using a click/skip formula.

Query rewriting

These jobs produce data that can be used for query rewriting or to inform updates to the synonyms.txt file.

Head/Tail Analysis
Perform head/tail analysis of queries from collections of raw or aggregated signals to identify underperforming queries and the reasons they underperform. This information is valuable for tuning Solr configurations, auto-suggest, product catalogs, and SEO/SEM strategies in order to improve conversion rates.
Phrase Extraction
Identify multi-word phrases in signals.
Synonym and Similar Queries Detection Jobs
Use this job to generate pairs of synonyms and pairs of similar queries. Two words are considered potential synonyms when they are used in a similar context in similar queries.
Token and Phrase Spell Correction
Detect misspellings in queries or documents using the numbers of occurrences of words and phrases.
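
As an illustration of how occurrence counts can drive misspelling detection, here is a minimal sketch. It is not the algorithm the Token and Phrase Spell Correction job uses; it assumes a simple heuristic in which a rare token is flagged when a much more frequent token lies within a small edit distance. The function names and the `max_distance` and `min_ratio` parameters are hypothetical.

```python
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # delete a character from a
                cur[j - 1] + 1,            # insert a character into a
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = cur
    return prev[-1]


def suggest_corrections(queries, max_distance=1, min_ratio=10):
    """Map each rare token to a much more frequent, similarly spelled token.

    Assumed heuristic: a token is a likely misspelling when another token
    occurs at least `min_ratio` times as often and is within `max_distance`
    edits of it.
    """
    counts = Counter(tok for q in queries for tok in q.lower().split())
    suggestions = {}
    for rare, rare_count in counts.items():
        best = None
        for common, common_count in counts.items():
            if common == rare or common_count < rare_count * min_ratio:
                continue
            if edit_distance(rare, common) > max_distance:
                continue
            if best is None or common_count > counts[best]:
                best = common
        if best is not None:
            suggestions[rare] = best
    return suggestions


# Hypothetical query signals: "latop" is rare and one edit away from "laptop".
queries = ["laptop"] * 20 + ["latop"] + ["tablet"] * 5
corrections = suggest_corrections(queries)
print(corrections)
```

Requiring a frequency ratio in addition to a small edit distance keeps legitimate but similar words of comparable popularity from being flagged as misspellings of each other.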

Signals aggregation

Parameterized SQL Aggregation
A Spark SQL aggregation job where user-defined parameters are injected into a built-in SQL template at runtime.
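
The idea of injecting parameters into a SQL template at runtime can be sketched as follows. This is only an illustration using Python's standard `string.Template`; the template text, the table and field names, and the `render_aggregation_sql` helper are all assumptions, since the job's built-in template is not shown here.

```python
from string import Template

# Hypothetical template; the job's real built-in SQL template is not shown
# in this document.
SQL_TEMPLATE = Template(
    "SELECT query, doc_id, COUNT(*) AS ${count_field} "
    "FROM ${signals_table} "
    "WHERE type = '${signal_type}' "
    "GROUP BY query, doc_id"
)


def render_aggregation_sql(params: dict) -> str:
    """Substitute user-defined parameters into the template.

    Template.substitute raises KeyError when a placeholder has no value,
    so a missing parameter fails loudly instead of yielding broken SQL.
    Values are inserted verbatim (no escaping), which is acceptable only
    for this illustration.
    """
    return SQL_TEMPLATE.substitute(params)


sql = render_aggregation_sql({
    "count_field": "click_count",
    "signals_table": "signals",
    "signal_type": "click",
})
print(sql)
```

Keeping the SQL in a template and the tunable values in a parameter map means the same aggregation logic can be reused with different inputs without editing the query itself.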

Experiment analysis

Ranking Metrics
Calculate relevance metrics (nDCG and so on) by replaying ground truth queries against catalog data using variants from an experiment.
SQL-Based Experiment Metric (deprecated)
This job is created by an experiment in order to calculate an objective.
The SQL-Based Experiment Metric job is deprecated as of Fusion AI 4.0.2.
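
For readers unfamiliar with nDCG, the metric named under Ranking Metrics, the following sketch shows one common way to compute it for a single query. It is not the job's implementation: it assumes linear gain (the relevance grade itself) with a log2 rank discount, and the `dcg` and `ndcg` function names and the sample grades are hypothetical.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain: each grade is divided by log2(rank + 1)."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances, start=1)
    )


def ndcg(ranked_relevances, judged_relevances, k=None):
    """Normalized DCG for one query.

    ranked_relevances: ground-truth grades of the documents in the order
        a variant returned them.
    judged_relevances: all ground-truth grades known for the query, used
        to build the ideal (best possible) ordering.
    k: optional cutoff; only the top k positions are scored.
    """
    ranked = list(ranked_relevances)
    ideal = sorted(judged_relevances, reverse=True)
    if k is not None:
        ranked, ideal = ranked[:k], ideal[:k]
    ideal_dcg = dcg(ideal)
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical grades for one ground-truth query under two variants.
judged = [3, 2, 1, 0]
variant_a = ndcg([3, 2, 1, 0], judged)  # returns documents in ideal order
variant_b = ndcg([0, 1, 2, 3], judged)  # returns documents in reverse order
print(variant_a, variant_b)
```

Averaging this per-query score over all ground-truth queries gives a single number per variant, which is what makes variants comparable.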

Collaborative recommenders

These jobs analyze signals and generate matrices used to provide collaborative recommendations.

ALS Recommender
Use this job when you want to compute user recommendations or item similarities using a collaborative filtering recommender. You can also implement a user-to-item recommender in the advanced section of this job’s configuration UI. This job uses SparkML’s Alternating Least Squares (ALS).
Query-to-Query Similarity
Train a collaborative filtering matrix decomposition recommender using SparkML’s Alternating Least Squares (ALS) to batch-compute query-query similarities. This can be used for items-for-query recommendations as well as queries-for-query recommendations.

Content-based recommenders

Content-based recommenders create matrices of similar items based on their content.

Content analysis

Cluster Labeling
Use this job when you already have clusters or well-defined document categories, and you want to discover and attach keywords to see representative words within those existing clusters. (If you want to create new clusters, use the Document Clustering job.)
Collection Analysis
Use this job when you want to compute basic metrics about your collection, like average word length, phrase percentages, and outlier documents (with very many or very few words).
Document Clustering
Cluster a set of documents and attach cluster labels.
Logistic Regression Classifier Training
Train a regularized logistic regression model for text classification.
Outlier Detection
Use this job when you want to find outliers from a set of documents and attach labels for each outlier group.
Random Forest Classifier Training (deprecated)
Train a random forest classifier for text classification.
Word2Vec Model Training (deprecated)
Train a shallow neural model, and project each document onto this vector embedding space.

Data ingest

Parallel Bulk Loader
The Parallel Bulk Loader (PBL) job enables bulk ingestion of structured and semi-structured data from big data systems, NoSQL databases, and common file formats like Parquet and Avro.

Legacy machine learning jobs

Legacy Item Recommender
Compute user recommendations based on a pre-computed item similarity model.
Legacy Item Similarity
Use this job when you only want to compute item-to-item similarities. This method is more lightweight than the generic Recommendations job.
The Legacy Item Similarity job is deprecated as of Fusion AI 4.1.0. Use the ALS Recommender job instead.
