Getting Started with Fusion:
Part Two - Getting Data Out

In Part 1, we used the Index Workbench to get data into Fusion by previewing the dataset, configuring the index pipeline, and then indexing the data.

In Part 2, we’ll explore the Query Workbench and learn how to configure Fusion’s output, including faceting.

Facets are the ubiquitous, dynamic lists of categories or features offered as filters within a search results page. Facets provide a simple way for users to explore and filter their search results without having to construct complicated queries.

The data we indexed in Part 1 has two fields that are natural choices for faceting: genres_ss and year_ti. With facets on these two fields, an end user could find science fiction of the 1940s in just two clicks.

The genres_ss field is ready for faceting as-is, but the year_ti field will be more usable if we configure range faceting. Range faceting is a way to group values together, for example by decade.

Before you begin

To proceed with this part of the tutorial, you must first complete Part 1, which gives you an indexed dataset for the Query Workbench to read.

The dataset has three fields that end users of our search application might find relevant:

  • genres_ss - a list of one or more genre labels.

  • title_txt - the name of the movie.

  • year_ti - the movie’s year of release.

The field suffixes indicate the type of data stored in each field:

  • Fields with suffix _ss contain one or more string values ("multi-valued string field").

    String fields require an exact match between the query string and the string value stored in that field.

  • Fields with suffix _txt contain text.

    Text fields allow for free text search over the field contents. For example, because the movie titles are stored in a text field, a search on the word "Star" will match movies titled "Star", "A Star is Born", all movies in the Star Wars and Star Trek franchises, as well as "Dark Star", "Lone Star", and "Star Kid".

  • Fields with suffix _ti contain integer values ("trie int fields").

    Numeric fields allow range matches as well as exact matches, and "trie" fields allow for efficient comparisons between the field value and the search criteria.

The different field types allow for different kinds of searches. The query pipeline configuration determines how fields are searched.
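As a quick illustration of the suffix convention described above, the following Python sketch maps a field name to its type by suffix. The table restates the list above; the function name describe_field is hypothetical and not part of Fusion.

```python
# Suffix-to-type table restating the list above.
# The names FIELD_TYPES and describe_field are hypothetical, for illustration only.
FIELD_TYPES = {
    "_ss": "multi-valued string (exact match)",
    "_txt": "text (free text search)",
    "_ti": "trie int (exact and range match)",
}


def describe_field(name: str) -> str:
    """Return a type description for a field name, based on its suffix."""
    for suffix, description in FIELD_TYPES.items():
        if name.endswith(suffix):
            return description
    return "unknown"


for field in ("genres_ss", "title_txt", "year_ti"):
    print(f"{field}: {describe_field(field)}")
```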

1. Explore the default search results

The Query Workbench allows you to interactively configure a query pipeline while previewing the search results it produces. A query pipeline converts a free text query submitted to your search application into a structured query for Solr. Facets are configured as part of a query pipeline.

  1. Log in to Fusion, click Search, and make sure the ml-movies collection is selected.

  2. Navigate to Home > Query Workbench.

  3. Try searching the data to see the default output.

    The output is configured by the default query pipeline, named collection-name-default. A default query pipeline consists of these stages:

    • Boost with Signals - Tune search for specific use cases.

    • Query Fields - Specify the set of fields over which to search.

    • Field Facet - Specify the fields to use for faceting.

    • Solr Query - Perform the query and return the results.

      This is the only stage that is always required in order to perform a query and receive results.

  4. Turn off the Solr Query stage.

    All search results disappear from the preview pane because no query is sent to Fusion’s Solr core. This stage must be enabled in order to get search results.

  5. Turn the Solr Query stage on and turn all other stages off.

    Now the search results look much like they did before. At this point, the disabled stages do not affect the output because they are not yet configured.
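The behavior seen in steps 4 and 5 can be pictured with a toy model in Python. This is only a conceptual sketch, not Fusion's implementation: the stage names come from the list above, while run_pipeline, default_stages, and the stand-in stage functions are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

Params = Dict[str, List[str]]
Stage = Tuple[str, bool, Callable[[Params], None]]  # (name, enabled, action)


def unconfigured(params: Params) -> None:
    """A stage with no configuration leaves the request untouched."""
    return None


def solr_query(params: Params) -> None:
    """Stand-in for the Solr Query stage; a real stage would call Solr here."""
    return None


def run_pipeline(query: str, stages: List[Stage]) -> Optional[Params]:
    """Apply enabled stages in order; return None if no query would be sent."""
    params: Params = {"q": [query]}
    sent = False
    for name, enabled, action in stages:
        if not enabled:
            continue  # disabled stages are skipped entirely
        action(params)
        if name == "Solr Query":
            sent = True
    return params if sent else None


def default_stages(enabled: Set[str]) -> List[Stage]:
    """Build the four default stages, switching on only the named ones."""
    names = ["Boost with Signals", "Query Fields", "Field Facet", "Solr Query"]
    return [
        (n, n in enabled, solr_query if n == "Solr Query" else unconfigured)
        for n in names
    ]


all_names = {"Boost with Signals", "Query Fields", "Field Facet", "Solr Query"}
all_on = run_pipeline("*:*", default_stages(all_names))
only_solr = run_pipeline("*:*", default_stages({"Solr Query"}))
no_solr = run_pipeline("*:*", default_stages(all_names - {"Solr Query"}))

print(all_on == only_solr)  # unconfigured stages change nothing
print(no_solr)              # no Solr Query stage, so no results
```

In this model, disabling unconfigured stages leaves the request unchanged, while disabling the Solr Query stage means nothing is sent at all.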

2. Configure basic faceting

The default search is the wildcard search, which returns all documents in the collection. We’ll enter a different search query to get started with facet configuration.

  1. Enter the query string "star".

    This returns all movies that have the word "star" in the title.

  2. Click Add a field facet and select the genres_ss field.

  3. Click Sci-Fi.

    The "+ Added to page" icons indicate that this set of search results is ordered differently than the previous search results.
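For readers curious what these steps might look like as a request, here is a minimal Python sketch that builds standard Solr faceting parameters (q, facet, facet.field, fq). Whether Fusion sends exactly these parameters for each click is an assumption; the helper name field_facet_params is hypothetical.

```python
from urllib.parse import urlencode


def field_facet_params(query, facet_field, selected_value=None):
    """Build Solr-style parameters for a keyword search with one field facet."""
    params = [("q", query), ("facet", "true"), ("facet.field", facet_field)]
    if selected_value is not None:
        # Assumption: selecting a facet value adds a filter query.
        # The value is quoted so that the string field is matched exactly.
        params.append(("fq", f'{facet_field}:"{selected_value}"'))
    return params


# Step 2: facet on genres_ss. Step 3: narrow to the Sci-Fi value.
print(urlencode(field_facet_params("star", "genres_ss")))
print(urlencode(field_facet_params("star", "genres_ss", "Sci-Fi")))
```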

Next we’ll configure a more sophisticated form of faceting.

3. Configure range faceting

Range faceting is a way of grouping values together so that the user can select a value range instead of one specific value. For example, range facets are commonly used with pricing ("$50-$100") or ratings ("4 stars or higher").

Range faceting requires sending an additional query parameter to Fusion’s Solr core. We can configure this with the Additional Query Parameters stage. In this case, we’ll use several of Solr’s range facet query parameters.

  1. Click Add a field facet and select the year_ti field.

    By default, facets are listed individually.

    We can see that this is not very useful; one facet value per year is too fine-grained. Faceting by decade would provide a simpler user experience.

  2. Click the X next to the year_ti facet to remove it from the facets column.

    When we selected it for faceting, Fusion added it to the Field Facet stage of the query pipeline. Instead, we will add it to the Additional Query Parameters stage for range faceting.

  3. Click Add a stage.

  4. Scroll down and select Additional Query Parameters.

    The Additional Query Parameters configuration panel appears.

  5. Under Parameters and Values, add the following parameter names and values:

    • facet.range=year_ti

    • facet.range.start=1900

    • facet.range.end=2020

    • facet.range.gap=10

    • facet.range.include=outer

    In this case, you do not need to modify the Update Policy field; the default value of "append" is fine.

  6. Click Apply.

    The year facets are now grouped by decade.

Tip
In your final application, you can still provide an affordance for users to search for specific values in the year_ti field, using a text field, dropdown list, and so on.
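To see how the range parameters from step 5 of this section divide the years into buckets, here is a short Python sketch. The parameter names and values are the ones entered above; the helper decade_buckets is hypothetical, and boundary handling (which facet.range.include controls) is not modeled.

```python
from urllib.parse import urlencode

# The parameters entered in the Additional Query Parameters stage.
RANGE_PARAMS = [
    ("facet.range", "year_ti"),
    ("facet.range.start", "1900"),
    ("facet.range.end", "2020"),
    ("facet.range.gap", "10"),
    ("facet.range.include", "outer"),
]


def decade_buckets(start: int, end: int, gap: int):
    """Return (lower, upper) bounds for each range bucket.

    The last bucket is capped at `end`; with these values the gap divides
    the range evenly, so the cap never takes effect.
    """
    return [(lo, min(lo + gap, end)) for lo in range(start, end, gap)]


buckets = decade_buckets(1900, 2020, 10)
print(len(buckets))   # 12 buckets, one per decade
print(buckets[4])     # (1940, 1950), the 1940s
print(urlencode(RANGE_PARAMS))
```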

4. Configure the query fields

Next we’ll see why it’s useful to specify which fields Fusion should use to match a query.

  1. Search for "2001".

    The results are not what an end user might expect: "2001: A Space Odyssey" is not the top search result.

  2. Under "Lethal Weapon 2", click show fields.

    Here’s the reason: our search query matches the id field. But of course our end users don’t care about this field.

    We’ll use the Query Fields stage to specify the fields that end users really care about.

  3. Click the Query Fields stage of the query pipeline.

    The Query Fields configuration panel appears.

  4. Under Search Fields, click the Add icon.

  5. Enter "title_txt".

  6. Click the Add icon again.

  7. Enter "year_ti".

  8. Click Apply.

    Now "2001: A Space Odyssey" rises to the top of our search results, followed by films made in the year 2001.
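The effect of restricting the search fields can be illustrated with a toy matcher in Python. This is a sketch under stated assumptions: the three records are illustrative sample values rather than output from a live index, the functions matches and search are hypothetical, and the model covers only which documents match, not how Solr scores or orders them.

```python
# Illustrative records; ids, titles, and years are sample values only.
DOCS = [
    {"id": "2001", "title_txt": "Lethal Weapon 2", "year_ti": 1989},
    {"id": "924", "title_txt": "2001: A Space Odyssey", "year_ti": 1968},
    {"id": "4306", "title_txt": "Shrek", "year_ti": 2001},
]


def matches(doc, term, fields):
    """True if the term equals a whole token in any of the given fields."""
    for field in fields:
        tokens = str(doc[field]).lower().replace(":", " ").split()
        if term.lower() in tokens:
            return True
    return False


def search(term, fields):
    """Return the titles of all documents matching the term in the given fields."""
    return [d["title_txt"] for d in DOCS if matches(d, term, fields)]


# Searching every field also matches on id, which end users do not care about.
print(search("2001", ["id", "title_txt", "year_ti"]))
# Restricting to the fields chosen in steps 5 and 7 drops the id match.
print(search("2001", ["title_txt", "year_ti"]))
```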

5. Save the query pipeline configuration

  1. Click Save.

    The Save Pipeline window appears. By default, you’ll overwrite the default pipeline for this datasource.

  2. Click Save pipeline.

  3. On the left side of the Query Workbench, click the Field Facet query pipeline stage.

    Notice that the genres_ss field facet is shown here. We used the Add a field facet link to configure this query pipeline stage, but you can also do it in this stage configuration panel.

  4. Click Cancel to close the stage configuration panel.

What’s next

With just two facet fields combined with keyword search, this prototype is already beginning to feel like a real search application.

In Part 3, we’ll enable signals, generate some signal data, aggregate it, and search it to see what it looks like. Signals can be used for recommendations or boosting.