Cold Start Solution

When no FAQ exists for training, the cold start solution uses the word vector (Word2vec) training module in our Docker image to learn the vocabulary of the search results. It then uses the provided query pipeline to combine Solr and document vector similarity scores at query time.

We suggest capturing signals from document clicks, likes, and downloads. These signals form Q&A pairs. Once you have accumulated at least 3,000 Q&A pairs, that feedback can be used as training data for the FAQ solution.
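
As a rough illustration of how signals might be turned into Q&A pairs, the sketch below groups raw signal records by query and document and checks whether enough unique pairs exist. The record fields (`query`, `doc_id`, `type`) and the function names are hypothetical and are not part of the product's signal schema; only the 3,000-pair threshold comes from the text above.

```python
from collections import Counter

# Threshold suggested in the text above.
MIN_PAIRS = 3000

# Signal types treated as positive feedback. The type names are assumptions.
ACCEPTED_TYPES = ("click", "like", "download")


def build_qa_pairs(signals, accepted_types=ACCEPTED_TYPES):
    """Collapse raw signals into unique (query, doc_id) pairs with counts.

    Each signal is assumed to be a dict with 'query', 'doc_id', and 'type'
    keys. This is a hypothetical shape used only for illustration.
    """
    counts = Counter()
    for signal in signals:
        if signal["type"] in accepted_types:
            # Normalize the query so trivially different spellings merge.
            key = (signal["query"].strip().lower(), signal["doc_id"])
            counts[key] += 1
    return counts


def ready_for_faq_training(pairs, minimum=MIN_PAIRS):
    """Return True once enough unique Q&A pairs have been collected."""
    return len(pairs) >= minimum
```

The counts can double as a crude confidence weight: a pair backed by many clicks or downloads is more likely to be a genuine question and answer match than one seen only once.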

Like the FAQ solution, the cold start solution has two parts:

1. Prepare a model

There are two ways to do this:

2. Configure the pipelines

The trained model should be used at both index time and query time to perform dense vector search.

  • At index time, we provide the {app_name}_question_answering index pipeline to help generate a dense vector representation of answers.

  • At query time, we provide the {app_name}_question_answering query pipeline to conduct run-time neural search. This pipeline transforms the incoming query into a dense vector using the trained model, then compares it with the indexed answer vectors by computing the cosine distance between them. You can also use a query stage to combine Solr and document vector similarity scores at query time.
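
The query-time comparison can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the pipeline's actual implementation: the function names, the linear weighting scheme, and the default weights are all hypothetical, and the Solr scores are assumed to be already normalized to the range 0 to 1. Cosine distance is simply 1 minus the cosine similarity computed here.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def combine_scores(solr_score, vector_score, solr_weight=0.3, vector_weight=0.7):
    """Weighted sum of the two scores. The weights are illustrative assumptions."""
    return solr_weight * solr_score + vector_weight * vector_score


def rank(query_vector, candidates):
    """Rank candidates by combined score, highest first.

    Each candidate is a (doc_id, solr_score, answer_vector) tuple, with
    solr_score assumed to be normalized to [0, 1].
    """
    scored = [
        (doc_id, combine_scores(solr_score, cosine_similarity(query_vector, vec)))
        for doc_id, solr_score, vec in candidates
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

With a higher vector weight, a document whose answer vector closely matches the query can outrank one that only scores well on keyword relevance; shifting weight toward the Solr score has the opposite effect.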

See Configure The Smart Answers Pipelines. Once your pipelines are configured, see Evaluate a Smart Answers Query Pipeline to test its effectiveness so that you can fine-tune the configuration.

Short answer extraction

By default, the question-answering query pipelines return complete documents that answer questions. Optionally, you can extract just a paragraph, a sentence, or a few words that answer the question. See Configure Short Answer Extraction.