The Fusion machine learning indexing stage uses a trained machine learning model to analyze one or more fields of a PipelineDocument and stores the results in a new field of either the PipelineDocument or the Context object.
To use the Machine Learning stage, you must first train a machine learning model. There are two ways to train a model:
- Use a Fusion AI job that trains a model, such as Logistic Regression or Random Forest.
- Train a model using Spark's MLlib API outside of Fusion, and upload the model to Fusion's blob store. Complete details are available in Machine Learning Models in Fusion.
Tip: When specifying field names, you can list multiple fields, each with a weight, in this format: field1:weight,field2:weight,field3:weight
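To make the weighted field format concrete, the following is a minimal Python sketch of how a string such as field1:weight,field2:weight could be split into field names and numeric weights. It is illustrative only and is not Fusion's own parser. The function name `parse_field_weights` and the field names `title_t` and `body_t` are hypothetical, and rejecting entries without a weight is an assumption, since the format above does not say whether the weight is optional.

```python
from typing import Dict


def parse_field_weights(spec: str) -> Dict[str, float]:
    """Parse 'field1:weight,field2:weight' into {field_name: weight}.

    Hypothetical helper for illustration; not part of Fusion.
    """
    weights: Dict[str, float] = {}
    for entry in spec.split(","):
        entry = entry.strip()
        if not entry:
            continue  # ignore empty segments, e.g. from a trailing comma
        # Split on the last colon so the weight is always the final part.
        name, sep, weight = entry.rpartition(":")
        if not sep or not name:
            # Assumption: every field must carry an explicit weight.
            raise ValueError(f"expected 'field:weight', got {entry!r}")
        weights[name] = float(weight)
    return weights


if __name__ == "__main__":
    # Hypothetical field names, with the title weighted twice as heavily.
    print(parse_field_weights("title_t:2.0,body_t:1.0"))
    # {'title_t': 2.0, 'body_t': 1.0}
```

A check like this can be useful for validating a field specification before entering it in the stage configuration, for example to catch a missing colon or a non-numeric weight.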