Part Three - Better Search
- Working With A Query Pipeline
- Search Tuning
- Embedding Search
- Lessons Learned
Before you begin, be sure to complete Part One and Part Two.
In Part Three, we’ll modify the default query pipeline to exploit our enriched documents by manipulating the relative weights of their fields. The overall relevancy ranking for a document is derived from the per-field search scores. We’ll use Fusion to modify the formula used to combine these constituent per-field scores in order to up- or down-weight the contribution of a per-field score to the overall result.
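To make the idea of combining per-field scores concrete, here is a minimal sketch in Python. It assumes a DisMax-style combination (take the best boosted field score, then add a fraction of the others); whether this pipeline uses exactly that formula is an assumption, and the scores, the `tie` parameter name, and the function itself are invented for illustration. The field names are borrowed from later in this tutorial.

```python
# Toy illustration only: scores and boosts are invented, and the formula is
# an assumed DisMax-style combination, not Fusion's documented behavior.

def combined_score(field_scores, boosts, tie=0.0):
    """Combine per-field scores into one document score.

    Each field score is multiplied by its boost (1.0 if none is given).
    The best boosted score wins; the rest contribute `tie` times their sum.
    """
    boosted = [score * boosts.get(field, 1.0)
               for field, score in field_scores.items()]
    if not boosted:
        return 0.0
    best = max(boosted)
    return best + tie * (sum(boosted) - best)


# Hypothetical per-field scores for two documents.
doc_a = {"title_txt": 2.0, "abstractShort_txt": 0.5}
doc_b = {"title_txt": 0.8, "abstractShort_txt": 1.5}

no_boost = {}
abstract_boost = {"abstractShort_txt": 2.0}

# Without a boost, doc_a outranks doc_b; boosting the abstract field flips that.
print(combined_score(doc_a, no_boost), combined_score(doc_b, no_boost))
print(combined_score(doc_a, abstract_boost), combined_score(doc_b, abstract_boost))
```

The point of the sketch is only that changing one field's weight can reorder documents without changing the documents or the query.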
Working With A Query Pipeline
A search form is the usual interface between your search application and Fusion. The inputs from the search form are submitted to a Fusion query pipeline in the form of an HTTP request which returns a payload containing a structured search result.
Like an index pipeline, a query pipeline is composed of processing stages. The stages in a query pipeline transform a set of inputs into a Solr query that runs against the Solr collection and returns the result.
Navigate to Home > Query Pipelines.
Click the default query pipeline for this collection, "cinema_1-default".
This opens the Query Pipeline configuration panel, showing the initial configuration for "cinema_1-default":
A default query pipeline consists of three stages:
A Query Fields query stage defines common Solr query parameters.
A Facets query stage contains no specified facet fields by default.
A Solr Query stage sends the fully-configured query request to Solr.
We’ll explore all three of these stages.
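As a mental model, the three stages can be pictured as functions that each add to a set of request parameters before the last one hands the request to Solr. The sketch below is conceptual Python: the function names, stage names, and parameter values are hypothetical stand-ins, not Fusion's implementation or API.

```python
# Conceptual model only: these functions are hypothetical stand-ins for
# pipeline stages and do not call Fusion or Solr.

def query_fields_stage(request):
    """Add the fields to search (qf) and the fields to return (fl)."""
    request["qf"] = "title_txt"
    request["fl"] = "id,title_txt"
    return request


def facets_stage(request):
    """Switch faceting on; with no facet fields configured it adds nothing else."""
    request["facet"] = "true"
    return request


def solr_query_stage(request):
    """Stand-in for sending the request to Solr: return the final parameters."""
    return request


def run_pipeline(stages, request, skipped=()):
    """Pass a copy of the request through each stage in order, omitting skipped ones."""
    request = dict(request)
    for name, stage in stages:
        if name in skipped:
            continue
        request = stage(request)
    return request


stages = [
    ("query-fields", query_fields_stage),
    ("facets", facets_stage),
    ("solr-query", solr_query_stage),
]

# Skipping a stage simply means its changes never reach the final request.
params = run_pipeline(stages, {"q": "star wars"}, skipped={"facets"})
print(params)
```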
Query Fields query stage
The Query Fields query stage is used to specify which fields are used for search and which fields are returned as part of a search results document:
In order to run free-text search queries over a field, that field must be indexed as a text field. The Fusion field naming convention uses the suffix "_txt" to indicate that a field should be indexed as a text field.
Collection "cinema_1" has two text fields available for free-text search:
We’ll start by configuring the Query Fields query stage to specify that both text fields (title_txt and the short abstract field) should be used for search.
In the Query Pipeline configuration panel, click Query Fields.
Under Query Fields, click the green add (+) button and add "shortAbstract_txt".
For now, leave the Field Boost unspecified, for this field and the next one.
Click the green add (+) button again and add "title_txt".
Under Return Fields, add the following fields:
For the return fields we specify all input fields and the title field, plus two identifying fields:
The default ID created by the CSV processor
Solr’s internal "version" id
Facets query stage
Faceting provides an open-ended way of slicing and dicing a set of search results based on category information available from certain fields in the documents in the results set. Any field which encodes information about item attributes, such as type, category, location, price, size, shape, date, and so on, can be used for faceting. Because this powerful feature is commonly used, it’s included as part of the default pipeline associated with every collection.
However, the dataset for this tutorial doesn’t have any fields which contain this kind of information, so instead of configuring the facet field stage, we’ll disable it for faster query processing.
Click the Facets stage.
Click the Skip This Stage checkbox.
Click the Save button.
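Although faceting is disabled for this dataset, the idea behind it can be sketched in a few lines of Python: count how many results carry each value of a category field, then narrow the results to one value. The documents and the "genre" field below are hypothetical; this tutorial's collection has no such field, and a real search engine computes these counts itself.

```python
from collections import Counter

# Hypothetical result set; the tutorial's dataset has no "genre" field.
results = [
    {"title": "Film A", "genre": "drama"},
    {"title": "Film B", "genre": "comedy"},
    {"title": "Film C", "genre": "drama"},
]


def facet_counts(docs, field):
    """Count how many documents carry each value of `field`."""
    return Counter(doc[field] for doc in docs if field in doc)


def filter_by(docs, field, value):
    """Narrow the result set to documents with one facet value."""
    return [doc for doc in docs if doc.get(field) == value]


counts = facet_counts(results, "genre")
print(counts.most_common())
print([doc["title"] for doc in filter_by(results, "genre", "drama")])
```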
Solr query stage
This stage submits a query to Solr. No special configuration is needed for this stage:
To see how this pipeline works, we return to the Search UI.
In the Query Pipelines configuration panel, click the plus (+) icon in the upper right.
The Home panel appears on the right.
In the Home panel, click Search.
The Search panel opens next to the Query Pipelines panel. Now we can work with both panels.
Our search works as expected. Later we’ll compare these results with the ones we get after search tuning.
Next we’ll test our query pipeline with a longer, more open-ended free-text search.
Search for "film starring Matt Damon".
Our results include titles with "Matt" and "Damon".
In the Query Pipelines panel, under Query Fields, give abstractShort_txt a Field Boost value of "2" and click Save.
Our search results improve a bit. Let’s see what happens when we boost that field again.
Give abstractShort_txt a Field Boost value of "3" and click Save.
Compare these results to the ones we got before search tuning.
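In Solr's query syntax, a field boost is written as the field name followed by a caret and the boost value in the qf (query fields) parameter. The sketch below assumes that the Field Boost values entered in the pipeline end up in that form; the mapping is an assumption about what the stage does, and build_qf is a hypothetical helper, not part of Fusion or Solr.

```python
def build_qf(field_boosts):
    """Format fields as a Solr qf value.

    `field_boosts` maps field name to boost; a boost of None leaves the
    field unboosted. Fields keep the order in which they were added.
    """
    parts = []
    for field, boost in field_boosts.items():
        parts.append(field if boost is None else f"{field}^{boost}")
    return " ".join(parts)


# The three configurations tried above: no boost, then 2, then 3.
print(build_qf({"abstractShort_txt": None, "title_txt": None}))
print(build_qf({"abstractShort_txt": 2, "title_txt": None}))
print(build_qf({"abstractShort_txt": 3, "title_txt": None}))
```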
Embedding Search
In order to embed search in an application, we need the query URL.
In the Search panel, click the gear icon to open the configuration window.
Select Display Query URL.
Now the search URL is displayed near the top of the panel. This is what you will embed in your app.
However, the URL displayed here includes extra parameters that are required by the controls on the Search UI. For example:
Copy the URL and paste it into another browser tab. Remove all parameters except q=star+wars, like this:
Our results are in XML format because this is the default writer type, controlled by the wt parameter. Add wt=json to the query URL and load the results again:
Your application can request search results in an appropriate format. See the complete list of values for the wt parameter.
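An application would typically build this URL in code and parse the JSON it gets back. The following Python sketch shows both halves without making a network call. The base URL is a placeholder to be replaced with the one shown by Display Query URL, the sample response is hand-written in the usual Solr JSON layout rather than real output from this collection, and both helper functions are hypothetical.

```python
import json
from urllib.parse import urlencode


def build_search_url(base_url, query, writer_type="json"):
    """Append the q and wt parameters to a query pipeline URL."""
    return f"{base_url}?{urlencode({'q': query, 'wt': writer_type})}"


def extract_titles(response_text, title_field="title_txt"):
    """Return the hit count and the titles from a Solr-style JSON response."""
    body = json.loads(response_text)["response"]
    return body["numFound"], [doc.get(title_field) for doc in body["docs"]]


# Placeholder base URL: substitute the query URL displayed in the Search panel.
url = build_search_url("http://localhost:8764/PLACEHOLDER", "star wars")
print(url)

# Hand-written sample response, not real output from the tutorial collection.
sample = (
    '{"response": {"numFound": 1, "start": 0, '
    '"docs": [{"id": "1", "title_txt": "Star Wars"}]}}'
)
print(extract_titles(sample))
```

Encoding the query with urlencode produces the same q=star+wars form used above, so spaces and special characters typed by users are handled consistently.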
Lessons Learned
There are simple, principled ways to modify field boost based on what we already know about fields and their contents.
We can use the search query itself to improve search results.
For basic searching, use fields and field boosting to provide the best results, and back off when users are unsure.
Richer data makes for richer search experiences.