Migrate Milvus to Solr vectors
Milvus is being deprecated. Follow these steps to move from Milvus to Solr vectors.
1. Add Solr vector schema
The first step in the migration is to make sure the Solr schema for each collection includes the vector field definitions. Collections created in Fusion 5.9.5 or later include these definitions automatically.
Vector field definitions to add:
<!-- Vector search fields -->
<dynamicField docValues="false" indexed="true" multiValued="false" name="*_64v" stored="true" type="knn_64_vector"/>
<dynamicField docValues="false" indexed="true" multiValued="false" name="*_128v" stored="true" type="knn_128_vector"/>
<dynamicField docValues="false" indexed="true" multiValued="false" name="*_256v" stored="true" type="knn_256_vector"/>
<dynamicField docValues="false" indexed="true" multiValued="false" name="*_384v" stored="true" type="knn_384_vector"/>
<dynamicField docValues="false" indexed="true" multiValued="false" name="*_512v" stored="true" type="knn_512_vector"/>
<dynamicField docValues="false" indexed="true" multiValued="false" name="*_768v" stored="true" type="knn_768_vector"/>
<dynamicField docValues="false" indexed="true" multiValued="false" name="*_1024v" stored="true" type="knn_1024_vector"/>
<!-- Field Types to support vector search -->
<fieldType class="solr.DenseVectorField" hnswBeamWidth="200" hnswMaxConnections="45" knnAlgorithm="hnsw" name="knn_64_vector" similarityFunction="cosine" vectorDimension="64"/>
<fieldType class="solr.DenseVectorField" hnswBeamWidth="200" hnswMaxConnections="45" knnAlgorithm="hnsw" name="knn_128_vector" similarityFunction="cosine" vectorDimension="128"/>
<fieldType class="solr.DenseVectorField" hnswBeamWidth="200" hnswMaxConnections="45" knnAlgorithm="hnsw" name="knn_256_vector" similarityFunction="cosine" vectorDimension="256"/>
<fieldType class="solr.DenseVectorField" hnswBeamWidth="200" hnswMaxConnections="45" knnAlgorithm="hnsw" name="knn_384_vector" similarityFunction="cosine" vectorDimension="384"/>
<fieldType class="solr.DenseVectorField" hnswBeamWidth="200" hnswMaxConnections="45" knnAlgorithm="hnsw" name="knn_512_vector" similarityFunction="cosine" vectorDimension="512"/>
<fieldType class="solr.DenseVectorField" hnswBeamWidth="200" hnswMaxConnections="45" knnAlgorithm="hnsw" name="knn_768_vector" similarityFunction="cosine" vectorDimension="768"/>
<fieldType class="solr.DenseVectorField" hnswBeamWidth="200" hnswMaxConnections="45" knnAlgorithm="hnsw" name="knn_1024_vector" similarityFunction="cosine" vectorDimension="1024"/>
If your Create Collections in Milvus job uses a vector dimension that is not listed above, add a matching dynamicField and fieldType definition for that dimension.
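As an illustration of adding a dimension that is not in the list, the following sketch defines a dynamic field and field type for a 300-dimension vector. The dimension 300 and the names *_300v and knn_300_vector are assumptions chosen only for this example; they follow the naming pattern of the definitions above and reuse the same HNSW and similarity settings. Substitute the dimension your model actually returns.

```xml
<!-- Hypothetical example: 300 is an illustrative dimension, not one taken from the list above -->
<!-- Dynamic field: any field name ending in _300v is indexed as a 300-dimension vector -->
<dynamicField docValues="false" indexed="true" multiValued="false" name="*_300v" stored="true" type="knn_300_vector"/>
<!-- Field type: vectorDimension must equal the length of the vector the model returns -->
<fieldType class="solr.DenseVectorField" hnswBeamWidth="200" hnswMaxConnections="45" knnAlgorithm="hnsw" name="knn_300_vector" similarityFunction="cosine" vectorDimension="300"/>
```

The suffix of the dynamic field and the vectorDimension of the field type should always agree, because Solr rejects vectors whose length differs from the dimension declared on the field type.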
2. Add Ray/Seldon Vectorize stage
In the index pipeline, replace the Encode into Milvus stage with the Ray/Seldon Vectorize Field stage (in Fusion 5.9.11 and earlier, this stage is called Seldon Vectorize Field). The Milvus stage field Encode into Milvus maps to the Vectorize stage field Source Field, Encoder Output Vector maps to Model Output Vector Field, and the other fields map to the fields of the same name. The last key field is Destination Field, which is the name of the Solr vector field. You can name this field anything, as long as its suffix matches the Solr vector field definition for the dimension size of the vector your model returns. For example, for a Milvus collection dimension size of 384, the Destination Field should use the suffix _384v.
At this point you can start indexing.
On the query side, add a Ray/Seldon Vectorize Field stage to the query pipeline and configure it mostly the same way as on the index side. The only difference is that you do not need to enter the vector field name, because that is done in the next step.
3. Add hybrid stage
The final step before the migration to Solr vectors is complete is to add a hybrid query stage: either the Fusion Neural Hybrid Query stage or the Managed Fusion Hybrid Query stage (known as the Neural Hybrid Query stage in Managed Fusion 5.9.10 and later). In this stage, all you need to do is set Vector Query Field to the vector field you defined in your index pipeline.
Make sure the hybrid stage is placed before any boosting, Apply Rules, or Security Trimming stages.
At this point you can query. After confirming that queries work, you can tune your setup before removing your Milvus collections and pipelines.