Experiments
When making changes to a query pipeline or query parameters that will affect users' search experience, it is often a good idea to run an experiment in order to verify that the results are what you intended. Fusion AI lets you create and run experiments that take care of dividing traffic between variants and calculating the results of each variant with respect to configurable objectives such as purchases, click-through rate or search relevance.
There are two ways that a search application can interact with an experiment: through a query profile, or through an Experiment query pipeline stage.
If a query profile is configured to use an experiment, then a search app sends queries and signals to the query profile endpoint. If the experiment is active, then Fusion routes each query through one of the experiment variants. The search app will also send subsequent signal data relating to that query — clicks, purchases, "likes", or whatever is relevant to the application — to that same query profile, and Fusion will record it along with information about the experiment variant that the user was exposed to. Fusion generates and stores the data that metrics calculations use. Metrics jobs periodically calculate the metrics. After metrics have been calculated, they are available in App Insights.
This topic explains the experiment workflow and basic concepts. These additional topics provide details about how to implement experiments to improve the user experience:
A/B/n experiments
Fusion AI lets you create and run experiments to compare different search experiences with respect to some objectives such as purchases or click-through rate (CTR).
The experiment features in Fusion AI support A/B/n experiments (also called A/B experiments).
Example
The following experiment is an example of an A/B/n experiment with three variants:
- Variant 1 (control) – Use the default query pipeline with no modifications. Each experiment should have a "control" variant as the first variant; the other variants will be compared against this one.
- Variant 2 (content-based filtering with a Recommend More Like This stage) – Content-based filtering uses data about a user’s search results, browsing history, and/or purchase history to determine which content to serve to the user. The filtering is non-collaborative.
- Variant 3 (collaborative filtering with a Recommend Items for User stage) – Collaborative filtering takes advantage of knowledge about the behavior of many individuals. It makes serendipitous discovery possible: a user is presented with items that other users deem relevant, for example, socks when buying shoes.
High-level workflow
In an experiment:
- A Fusion administrator defines the experiment. An experiment has variants with differences in query pipelines, query pipeline stages, collections, and/or query parameters.
- The Fusion administrator assigns the experiment to a query profile.
- A user searches using that query profile.
- If the experiment is running, Fusion assigns the user to one of the experiment variants, in accordance with traffic weights. Assignment to a variant is persistent. The next time the user searches, Fusion assigns the same variant.
- Different experiment variants return different search results to users.
- Users interact with the search results, for example, viewing them, possibly clicking on specific results, possibly buying things, and so forth.
- Based on the interactions, the search app backend sends signals to the signals endpoint of the query profile for the experiment.
- Using signal data, a Metrics Spark job periodically computes metrics for each experiment variant and writes the metrics to the _signals_aggr collection.
- In the Fusion UI, an administrator can use App Insights to view reports about the experiment.
- Once the results of the experiment are conclusive, the Fusion administrator can stop the experiment and change the query profile to use the winning variant, or start a new experiment.
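The weighted, persistent assignment described in the workflow can be illustrated with a minimal sketch. This is not Fusion's actual implementation, which this topic does not describe. It assumes a common approach: hash the user ID together with an experiment ID and map the hash onto the cumulative traffic weights. The function name `assign_variant`, the experiment ID, and the variant names are hypothetical.

```python
import hashlib
from bisect import bisect_right
from itertools import accumulate
from typing import Dict


def assign_variant(user_id: str, experiment_id: str, weights: Dict[str, float]) -> str:
    """Deterministically map a user to a variant in proportion to traffic weights.

    Illustrative only: the hashing scheme is an assumption, not Fusion's algorithm.
    """
    names = list(weights)
    cumulative = list(accumulate(weights[name] for name in names))
    total = cumulative[-1]
    if total <= 0:
        raise ValueError("traffic weights must sum to a positive number")

    # Hashing the experiment and user together means the same user always
    # gets the same variant, with no per-user state to store.
    digest = hashlib.sha256(f"{experiment_id}:{user_id}".encode("utf-8")).digest()

    # Scale the first 8 bytes of the hash to a point in [0, total).
    point = int.from_bytes(digest[:8], "big") / 2**64 * total

    # Find the first variant whose cumulative weight exceeds the point.
    index = bisect_right(cumulative, point)
    return names[min(index, len(names) - 1)]


# Hypothetical three-variant experiment with equal traffic weights.
weights = {"control": 1.0, "content-based": 1.0, "collaborative": 1.0}
first = assign_variant("123", "recommendations-test", weights)
second = assign_variant("123", "recommendations-test", weights)
print(first == second)  # the same user is assigned the same variant again
```

Because the assignment is a pure function of the user ID and the experiment ID, repeated searches by the same user see the same variant, which matches the persistence described above.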
Information flow
This diagram illustrates information flow through an experiment. Numbers correspond to explanations below the diagram.
- A user searches in a search app. For example, the user might search for shirt.
- The search app backend appends a userId parameter that identifies the user, for example, userId=123, to the query and sends the query to the query profile endpoint for the experiment.
- Using information in the query profile and the value of the userId, Fusion routes the query through one of the experiment’s variants. In this example, Fusion routes the query through query pipeline 1.
- A query pipeline adds an x-fusion-query-id to the response header, for example, x-fusion-query-id=abc.
- Based on the query, Fusion obtains a search result from the index, which is stored in the primary collection. Fusion sends the search result back to the search app.
- Fusion sends a response signal to the signals collection.
- A different user might be routed through the other experiment variant shown here, and through query pipeline 2. This query pipeline has an enabled Boost with Signals stage, unlike query pipeline 1.
- The search user interacts with the search results, viewing them, possibly clicking on specific results, possibly buying things, and so forth. For example, the user might click the document with docId=757.
- Based on the interactions, the search app backend sends click signals to the signals endpoint for the query profile. Signals include the same query ID so Fusion can associate the signals with the experiment.
- Using information in the query profile, Fusion routes the signals to the _signals_ingest pipeline.
- The _signals_ingest pipeline stores signals in the _signals collection. Signals include the collection ID of the primary collection and experiment tracking information.
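From the search app backend's point of view, the flow above comes down to three steps: append userId to the query, read x-fusion-query-id from the response header, and include that query ID in later signals. The following sketch shows those steps without making network calls. The endpoint URL, the `q` parameter name, and the signal payload field names are assumptions for illustration and are not taken from this topic; check the Fusion API reference for the actual request and signal formats.

```python
from typing import Dict, Optional
from urllib.parse import urlencode

QUERY_ID_HEADER = "x-fusion-query-id"


def build_query_url(profile_endpoint: str, query: str, user_id: str) -> str:
    """Append the search terms and the userId parameter to a query profile endpoint.

    The `q` parameter name is an assumption; `userId` comes from the text above.
    """
    return f"{profile_endpoint}?{urlencode({'q': query, 'userId': user_id})}"


def extract_query_id(response_headers: Dict[str, str]) -> Optional[str]:
    """Return the query ID from the response headers, or None if it is absent."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    for name, value in response_headers.items():
        if name.lower() == QUERY_ID_HEADER:
            return value
    return None


def build_click_signal(query_id: str, user_id: str, query: str, doc_id: str) -> dict:
    """Build a click signal that carries the query ID of the originating query.

    The field names in this payload are hypothetical.
    """
    return {
        "type": "click",
        "params": {
            "fusion_query_id": query_id,
            "userId": user_id,
            "query": query,
            "docId": doc_id,
        },
    }


# Hypothetical endpoint; substitute the query profile endpoint of your deployment.
url = build_query_url("https://fusion.example.com/query-profile", "shirt", "123")
query_id = extract_query_id({"X-Fusion-Query-Id": "abc"})
signal = build_click_signal(query_id, "123", "shirt", "757")
print(url)
print(signal)
```

Carrying the query ID on every signal is what lets the click on docId=757 be tied back to the variant that served the original query.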
Metrics generation
This diagram illustrates metrics generation:
- A Fusion administrator can configure which metrics are relevant for a given experiment and the frequency with which experiment metrics are generated. They can also generate metrics on demand.
- Using signal data, a Metrics Spark job periodically runs in the background. It obtains signal data from the _signals collection, computes metrics for each experiment variant, and writes the metrics to the _signals_aggr collection.
- In the Fusion UI, a Fusion administrator can view experiment metrics.
- App Insights uses these calculated metrics and displays reports about the experiment.
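To make the per-variant calculation concrete, here is a small sketch of one possible metric, click-through rate, computed from in-memory signal records. It is not the Metrics Spark job. It assumes that CTR means the fraction of queries that received at least one click, and the record field names (`type`, `variant`, `query_id`) are hypothetical stand-ins for the experiment tracking information stored with each signal.

```python
from collections import defaultdict
from typing import Dict, Iterable


def click_through_rate_by_variant(signals: Iterable[dict]) -> Dict[str, float]:
    """Compute, per variant, the fraction of queries with at least one click.

    Assumes each signal is a dict with hypothetical keys:
    "type" ("response" or "click"), "variant", and "query_id".
    """
    queries = defaultdict(set)  # variant -> IDs of queries that were served
    clicked = defaultdict(set)  # variant -> IDs of queries that were clicked

    for signal in signals:
        variant = signal["variant"]
        query_id = signal["query_id"]
        if signal["type"] == "response":
            queries[variant].add(query_id)
        elif signal["type"] == "click":
            clicked[variant].add(query_id)

    # Count only clicks whose query was actually served in the same variant.
    return {
        variant: len(clicked[variant] & served) / len(served)
        for variant, served in queries.items()
        if served
    }


signals = [
    {"type": "response", "variant": "control", "query_id": "q1"},
    {"type": "click", "variant": "control", "query_id": "q1"},
    {"type": "response", "variant": "control", "query_id": "q2"},
    {"type": "response", "variant": "variant-2", "query_id": "q3"},
    {"type": "click", "variant": "variant-2", "query_id": "q3"},
]
print(click_through_rate_by_variant(signals))
# {'control': 0.5, 'variant-2': 1.0}
```

Grouping by query ID rather than counting raw clicks keeps a single query with many clicks from inflating a variant's rate; other definitions of CTR are equally reasonable.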