Neural Hybrid Search
Neural Hybrid Search is a capability that combines lexical and semantic dense vector search to produce more accurate and relevant search results.
This feature is currently only available to clients who have contracted with Lucidworks for features related to Neural Hybrid Search and Lucidworks AI.
This feature is only available in Fusion 5.9.5 and later versions of Fusion 5.9.
Overview
Lexical search works by looking for literal matches of keywords. For example, a query for chips would return potato chips and tortilla chips, but it could also return chocolate chips. Semantic vector search, however, captures meaning. A semantic search for the same query could return potato chips as well as other salty snacks like dried seaweed or cheddar crackers. Both methods have their advantages, and often you'll want one or the other depending on your use case or search query. Neural Hybrid Search lets you use both: it combines the precision of lexical search with the nuance of semantic search.
Hybrid Scoring
The lexical and semantic scores are combined using this function:
(vector_weight*vector_score + lexical_weight*scaled(lexical_score))
Because lexical scores can be arbitrarily large due to the use of TF-IDF and BM25, scaled() rescales the lexical scores to fall between 0 and 1 so that they align with the bounded vector scores. The scaling is achieved by dividing every lexical score by the largest lexical score, so the top lexical result receives a scaled score of 1.
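The scoring function and max-scaling described above can be sketched in a few lines of Python. This is an illustration only, not Fusion's implementation: the function name hybrid_scores, the dictionary-based inputs, and the choice to treat a document missing from one result list as scoring 0 on that side are all assumptions made for the example.

```python
from typing import Dict


def hybrid_scores(
    lexical: Dict[str, float],
    vector: Dict[str, float],
    lexical_weight: float,
    vector_weight: float,
) -> Dict[str, float]:
    """Combine per-document lexical and vector scores.

    Lexical scores are divided by the largest lexical score so they fall
    in the 0 to 1 range before being weighted (max-scaling).
    """
    max_lexical = max(lexical.values(), default=0.0)
    combined: Dict[str, float] = {}
    for doc_id in set(lexical) | set(vector):
        # Assumption: a document absent from one list contributes 0 there.
        raw_lexical = lexical.get(doc_id, 0.0)
        scaled_lexical = raw_lexical / max_lexical if max_lexical > 0 else 0.0
        vector_score = vector.get(doc_id, 0.0)
        combined[doc_id] = (
            vector_weight * vector_score + lexical_weight * scaled_lexical
        )
    return combined


if __name__ == "__main__":
    # Illustrative scores only; these are not real query results.
    scores = hybrid_scores(
        lexical={"a": 12.0, "b": 6.0},
        vector={"a": 0.5, "b": 0.9, "c": 0.8},
        lexical_weight=0.3,
        vector_weight=0.7,
    )
    for doc_id, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(doc_id, round(score, 3))
```

In this example, document b ranks first even though document a has the highest raw lexical score, because the vector weight dominates once the lexical scores are scaled.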
- For highly tuned lexical and semantic search, the ratio will be closer to 0.3 lexical weight and 0.7 semantic weight.
- When using the Boost with Signals stage, use bq, not boost, and enable Scale Boosts to control how much the signals can impact the overall hybrid score. Lucidworks recommends keeping the scale boost values low, since semantic vector search (SVS) scores are scaled to a maximum of 1.
For more information, see Semantic vector search test guidelines.
KNN Solr Scoring
Solr supports three different similarity score metrics: euclidean, dot_product, or cosine. In Fusion, the default is cosine. It's important to note that Lucene bounds cosine scores to the range 0 to 1, so they differ from standard cosine similarity, which ranges from -1 to 1. For more information, refer to the Lucene documentation on scoring formula and the Solr documentation on Dense Vector Search.
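To make the difference concrete, the sketch below compares standard cosine similarity with a bounded score. It assumes Lucene maps cosine to the 0 to 1 range as (1 + cosine) / 2; verify this against the Lucene documentation mentioned above for the version you run. The function names are hypothetical and are not part of any Solr or Fusion API.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Standard cosine similarity, ranging from -1 to 1."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def bounded_cosine_score(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine shifted into the 0 to 1 range.

    Assumption: this mirrors Lucene's (1 + cosine) / 2 mapping.
    """
    return (1.0 + cosine_similarity(a, b)) / 2.0


if __name__ == "__main__":
    pairs = {
        "identical": ([1.0, 0.0], [1.0, 0.0]),
        "orthogonal": ([1.0, 0.0], [0.0, 1.0]),
        "opposite": ([1.0, 0.0], [-1.0, 0.0]),
    }
    for label, (a, b) in pairs.items():
        print(label, cosine_similarity(a, b), bounded_cosine_score(a, b))
```

Under this mapping, unrelated (orthogonal) vectors score 0.5 rather than 0, which is worth keeping in mind when choosing a vector similarity cutoff.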
Replica choice
Lucidworks recommends using PULL and TLOG replicas. These replica types copy the index of the leader replica, which results in the same HNSW graph on every replica. When querying, the HNSW approximation query will be consistent given a static index.
In contrast, NRT replicas have their own index, so they will also have their own HNSW graph. HNSW is an Approximate Nearest Neighbor (ANN) algorithm, so it will not return exactly the same results for differently constructed graphs. This means that the same query can return different results depending on which HNSW graph serves it (one graph per NRT replica in a shard), which can lead to noticeable result shifts. When using NRT replicas, the shifts can be made less noticeable by increasing the topK parameter. Variation will still occur, but it should appear lower in the ranked documents. Another way to mitigate shifts is to use Neural Hybrid Search with a vector similarity cutoff.
For more information, refer to Solr Types of Replicas.
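As an illustration of the replica recommendation, the sketch below builds a Solr Collections API CREATE request that uses only TLOG and PULL replicas. It assumes direct access to Solr's Collections API; in a Fusion deployment, collections are often created through Fusion itself, so treat this as a sketch of the underlying Solr parameters. The base URL, collection name, and replica counts are placeholders.

```python
from urllib.parse import urlencode


def create_collection_url(
    solr_base_url: str,
    name: str,
    num_shards: int,
    tlog_replicas: int,
    pull_replicas: int,
) -> str:
    """Build a Collections API CREATE URL with no NRT replicas."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        # Explicitly request zero NRT replicas so that every replica
        # copies the leader's index (and therefore its HNSW graph).
        "nrtReplicas": 0,
        "tlogReplicas": tlog_replicas,
        "pullReplicas": pull_replicas,
    }
    return f"{solr_base_url.rstrip('/')}/admin/collections?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder host and collection name.
    print(create_collection_url("http://localhost:8983/solr", "products", 2, 2, 1))
```

The function only builds the URL; sending the request and handling the response are left out to keep the example focused on the replica parameters.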
Considerations for multi-sharded collections
- The Fusion UI will show vector floats encapsulated by “ ”. This is expected behavior.
- With sharding, topK pulls K results from each shard, for a total of topK*Shard_count.
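The per-shard behavior of topK can be illustrated with a small sketch. It builds a query string for Solr's knn query parser and computes how many candidates are gathered across shards. The field name body_vector and the helper names are hypothetical, and the short vector is a placeholder for a real embedding.

```python
from typing import Sequence


def knn_query(field: str, vector: Sequence[float], top_k: int) -> str:
    """Build a Solr knn query parser string for the given vector field."""
    vector_text = ", ".join(str(float(v)) for v in vector)
    return f"{{!knn f={field} topK={top_k}}}[{vector_text}]"


def total_candidates(top_k: int, shard_count: int) -> int:
    """Each shard returns its own top K, so candidates scale with shards."""
    return top_k * shard_count


if __name__ == "__main__":
    # Hypothetical field name and a toy two-dimensional vector.
    print(knn_query("body_vector", [0.1, 0.2], top_k=10))
    print(total_candidates(top_k=10, shard_count=4))
```

With topK set to 10 on a four-shard collection, up to 40 candidates are gathered, so a topK value tuned on a single shard may behave differently once the collection is sharded.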