- Number of shards. Documents are distributed across this number of partitions.
- Document routing strategy. How documents are assigned to shards.
- Replication factor. How many copies of each document in the collection.
- Replica placement strategy. Where to place replicas in the cluster.
Collection names are case-insensitive, but Fusion preserves case when displaying collection names.
Auxiliary Collections
Every primary collection is associated with a set of auxiliary collections that contain related data, such as signals, aggregations, and more. Some auxiliary collections are created for every primary collection. Others are created only for the app’s default collection, one per app. Auxiliary collections are described below:APP_NAME_job_reports | Output from Fusion experiments, Ranking Metrics jobs, and Head/Tail Analysis jobs. | 1 per app |
APP_NAME_query_rewrite | A collection of documents to use for rewriting queries, optimized for high-volume traffic. These documents originate from the COLLECTION_NAME_query_rewrite_staging collection. Certain Fusion query pipeline stages read from this collection: ● Text Tagger ● Apply Rules ● Modify Response with Rules | 1 per app |
APP_NAME_query_rewrite_staging | A collection of documents created by the Rules Editor or by certain Fusion jobs, not optimized for production traffic. Documents move from this collection to the COLLECTION_NAME_query_rewrite collection as follows: ● Job output documents with high confidence contain a review=auto field and are moved to the COLLECTION_NAME_query_rewrite collection automatically.● Job output documents with low confidence contain a review=pending field. When these are approved by a Fusion user, Fusion copies them to the COLLECTION_NAME_query_rewrite collection. | 1 per app |
COLLECTION_NAME_signals | A search query logs and signals collection. | 1 per collection |
COLLECTION_NAME_signals_aggr | A collection for aggregated signals. | 1 per collection |
APP_NAME_user_prefs | A collection of data to support App Studio’s social features, such as user-generated tags, bookmarks, comments, ratings, and so on. | 1 per app |
Don’t create primary collections with names that end in the suffixes above; these are reserved for Fusion auxiliary collections, which are created and managed by Fusion directly.
Don’t create primary collections named “logs” or beginning with “system_”.
These names are reserved for Fusion system collections.
- Datasources
- Pipelines
- Profiles
- Signals and aggregations
- Analytics dashboards
System Collections
Fusion automatically creates some collections that are used for internal purposes and shared across all apps:- system_autocomplete stores the content that the Fusion UI displays when you use the search bar.
- system_blobs stores blobs in Solr. This is used to store model files for the NLP components and other binary files used by Fusion components.
- system_history keeps a record of configuration changes, start and stop times for services and experiments, and more.
- system_jobs_history keeps a record of Fusion jobs, including start/stop times and status.
Collection Configuration Properties
Collections have three properties that you can configure only when you are creating a collection using the Collections API.Property | Description | Default behavior |
---|---|---|
signals* | The signals property determines whether to create auxiliary collections with suffixes _signals and _signals_aggr . | When you create a collection in the Fusion UI, signals defaults to true. When you create a collection using the Fusion API, this property defaults to false. |
searchLogs | The searchLogs property determines whether to create an auxiliary search query logs collection with suffix _logs . | When you create a collection in the Fusion UI, this property defaults to true. When you create a collection using the Fusion API, this property defaults to false. |
Using profiles to associate collections with pipelines
Index pipelines and query pipelines aren’t connected to a specific collection by default. Index profiles and query profiles are configurations that create consistent endpoints for indexing and querying, each with a specific pipeline and collection.- Index Profiles work with index pipelines for getting content into the system.
- Query Profiles work with query pipelines for user queries.
Field Editor UI
The Fields Editor UI allows you to create and configure the schema file directly from Fusion. For instructions, see Fields Editor UI.Additional resources
LucidAcademyLucidworks offers free training to help you get started.The Course for Fusion Applications and Collections focuses on how Fusion transforms your siloed data into personalized insights unique to each user:Visit the LucidAcademy to see the full training catalog.