Train a Smart Answers cold start model
The Smart Answers Cold Start Training job is deprecated in Fusion 5.12.
Smart Answers comes with two pre-trained cold-start models. If your data does not have many domain-specific words, then consider using a pre-trained model.
Configure the training job
- In Fusion, navigate to Collections > Jobs.
- Select Add > Smart Answer Coldstart Training.
- In the Training Collection field, specify the collection that contains the content that can be used to answer questions.
- Enter the name of the Field which contains the content documents.
- Enter a Model Deployment Name. The new machine learning model is saved in the blob store with this name. You will reference it later when you configure your pipelines.
- Configure the Model base. There are several pre-trained word and BPE embeddings for different languages, as well as a few pre-trained BERT models.
  If you want to train custom embeddings, select word_custom or bpe_custom. This trains Word2vec on the data and fields specified in Training Collection and Field which contains the content documents. Custom embeddings can be useful when your content includes unusual or domain-specific vocabulary. When you use the pre-trained embeddings, the log shows the percentage of processed vocabulary words. If this value is high, then try using custom embeddings. During training, the job analyzes the content data to select weights for each of the words. The resulting model computes the weighted average of the word embeddings to obtain a single dense vector for the content.
- Click Save.
If you use Solr as the training data source, ensure that the source collection contains the random_* dynamic field defined in its managed-schema.xml. This field is required for sampling the data. If it is not present, add the entry <dynamicField name="random_*" type="random"/> to managed-schema.xml alongside the other dynamic fields, and add <fieldType class="solr.RandomSortField" indexed="true" name="random"/> alongside the other field types.
- Click Run > Start.
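Before running the job against a Solr collection, it can help to confirm that the schema already contains the two entries needed for sampling. The following is a minimal sketch, not part of Fusion: the function name missing_random_field_entries and the SAMPLE_SCHEMA string are hypothetical, and the sketch assumes you have the contents of managed-schema.xml available as a string.

```python
import xml.etree.ElementTree as ET
from typing import List

# Entries the training job needs for sampling, as given in the note above.
DYNAMIC_FIELD = '<dynamicField name="random_*" type="random"/>'
FIELD_TYPE = '<fieldType class="solr.RandomSortField" indexed="true" name="random"/>'


def missing_random_field_entries(schema_xml: str) -> List[str]:
    """Return the required schema entries that are absent from schema_xml."""
    root = ET.fromstring(schema_xml)
    missing = []

    # Look for the random_* dynamic field anywhere in the schema.
    has_dynamic_field = any(
        el.get("name") == "random_*" for el in root.iter("dynamicField")
    )
    if not has_dynamic_field:
        missing.append(DYNAMIC_FIELD)

    # Look for a field type named "random" backed by solr.RandomSortField.
    has_field_type = any(
        el.get("name") == "random" and el.get("class") == "solr.RandomSortField"
        for el in root.iter("fieldType")
    )
    if not has_field_type:
        missing.append(FIELD_TYPE)

    return missing


# Hypothetical schema fragment used only for illustration.
SAMPLE_SCHEMA = """<schema name="example" version="1.6">
  <fieldType name="string" class="solr.StrField"/>
  <dynamicField name="*_s" type="string"/>
</schema>"""

if __name__ == "__main__":
    for entry in missing_random_field_entries(SAMPLE_SCHEMA):
        print("Add to managed-schema.xml:", entry)
```

The check only reports what is missing. Adding the entries is still done by editing managed-schema.xml as described above.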
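To illustrate the weighted average of word embeddings mentioned in the Model base step, here is a small sketch in plain Python. It is an illustration of the general idea only: the source does not describe how the job chooses weights or handles unknown words, so the function weighted_average_embedding, the toy vectors, the toy weights, the default weight of 1.0, and the choice to skip out-of-vocabulary tokens are all assumptions.

```python
from typing import Dict, List


def weighted_average_embedding(
    tokens: List[str],
    embeddings: Dict[str, List[float]],
    weights: Dict[str, float],
) -> List[float]:
    """Combine per-word vectors into one dense vector using per-word weights."""
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    weight_sum = 0.0

    for token in tokens:
        vector = embeddings.get(token)
        if vector is None:
            continue  # assumption: words without an embedding are skipped
        weight = weights.get(token, 1.0)  # assumption: default weight is 1.0
        for i, value in enumerate(vector):
            total[i] += weight * value
        weight_sum += weight

    if weight_sum == 0.0:
        return total  # no known words: return the zero vector
    return [value / weight_sum for value in total]


# Toy two-dimensional embeddings and weights, invented for illustration.
toy_embeddings = {"solr": [1.0, 0.0], "index": [0.0, 1.0]}
toy_weights = {"solr": 3.0, "index": 1.0}

if __name__ == "__main__":
    doc_vector = weighted_average_embedding(
        ["solr", "index"], toy_embeddings, toy_weights
    )
    print(doc_vector)  # → [0.75, 0.25]
```

A word with a larger weight pulls the document vector toward its own embedding, which is why the learned weights matter for content with domain-specific vocabulary.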
Next steps
- Configure The Smart Answers Pipelines
- Evaluate a Smart Answers Query Pipeline