Component | Version |
---|---|
Solr | fusion-solr 5.8.0 (based on Solr 9.1.1) |
ZooKeeper | 3.7.1 |
Spark | 3.2.2 |
Kubernetes | GKE, AKS, and EKS 1.24; Rancher (RKE) and OpenShift 4 compatible with Kubernetes 1.24. OpenStack and customized Kubernetes installs are not supported. See Kubernetes support for end-of-support dates. |
Ingress Controllers | Nginx, Ambassador (Envoy), and the GKE Ingress Controller. Istio is not supported. |
Looking to upgrade? See Fusion 5 Upgrades for detailed instructions.
Rosette Entity Extractor (REX) and Rosette Base Linguistics (RBL), used by the Advanced Linguistics with Babel Street feature, are not compatible with the version of Solr 9 included in this version of Fusion. If you rely on the Babel Street language module, do not upgrade until this compatibility issue is resolved.
Use Advanced Linguistics with Babel Street
The Fusion Advanced Linguistics Package embeds Babel Street’s (formerly Basistech) Rosette natural language processing tools for multilingual text analysis. To improve search recall, Rosette Base Linguistics (RBL) handles the unique linguistic phenomena of more than 30 Asian and European languages. Rosette Entity Extractor (REX) identifies named entities such as people, locations, and organizations, allowing you to quickly refine your search, remove noise, and increase search relevance.

If there is a particular entity you want to make sure is extracted or rejected, or if you wish to create a custom entity type, REX also supports gazetteers and regular expressions.
Using Named Entities (REX)
REX extracts named entities in multiple languages, including English, Chinese (traditional and simplified), and German. In English, it extracts multiple entity types and subtypes, including the following entity types (along with their associated subtypes):

- PERSON
- LOCATION
- ORGANIZATION
- PRODUCT
- TITLE
- NATIONALITY
- RELIGION
Create Application
To begin, create a new application called “entities”.

Configuration
Edit Solr Configuration
We will begin by adding the Basis library elements to the `solrconfig.xml` file. We will also add a new update processor to perform the entity extraction.

- Navigate to System > Solr Config to edit the `solrconfig.xml` file.
- Fusion 5.8 and earlier: In the `<lib/>` directive section, add the Basis library directives. Fusion 5.9 and later already contain these lines. For Fusion 4.x users, the `dir` paths are the local REX installation path.
- In the `<updateRequestProcessorChain/>` section, add a new entity-extraction processor chain after the existing processor chains. Note that the chain references a field called `text_eng`; we will create this field through the Fusion UI in the next step.
- Save your changes to `solrconfig.xml`.
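As a sketch, the processor chain added above might look like the following. The processor factory class name is a placeholder for the REX update processor class shipped with the Advanced Linguistics package, and the `rootDirectory` path is an assumption; only the chain name (`rex`), the target field (`text_eng`), and the `rootDirectory`/`fields` option names come from this tutorial.

```xml
<!-- Entity-extraction chain; referenced later via update.chain=rex -->
<updateRequestProcessorChain name="rex">
  <!-- Placeholder class: use the REX processor factory from your package -->
  <processor class="com.basistech.rosette.solr.RexUpdateProcessorFactory">
    <str name="rootDirectory">/path/to/rex</str>
    <!-- Extract entities from the text_eng field -->
    <str name="fields">text_eng</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```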
Define Fields
The data file we will use, eng_docs.csv, contains two fields:

- `title`: an article headline
- `article_text`: the text content of the article

To create new fields, navigate to Collections > Fields and click Add a Field. Create the following fields:
Field name | Field type | Other options |
---|---|---|
title | string | Use default options. |
text_eng | text_en | Use default options. |
text_eng_REX_* | string | Create this field as a dynamic field by clicking the Dynamic checkbox. Click the Multivalued checkbox. Leave other options as defaults. |

Be sure to save each field after creating it.
Indexing Data
Create Indexing Pipeline
- Navigate to Indexing > Indexing Pipelines.
- Click Add and create a new pipeline called `test-entities`.
- Select the Field Mapping stage.
- In the Field Translations section, add a new row with source `article_text` and target `text_eng`. Set the Operation to `move`.
- Select the Solr Indexer stage.
- In the Additional Update Request Parameters section, add a new row with parameter name `update.chain` and value `rex`.
- Save the new pipeline.
Create Datasource
In this step, we will upload and index our documents from the data file.

- Navigate to Indexing > Datasources.
- Click Add and select File Upload V2 from the dropdown menu.
- Enter `eng_docs` for the Datasource ID, or use a name you prefer.
- Select `test-entities` for the Pipeline ID.
- In the File Upload field, choose the sample file `eng_docs.csv` and click Upload File. The File ID field is automatically populated. Leave all other values at their defaults.
- Save the new datasource. The form refreshes, adding a set of buttons at the top.
- Click Run, then Start. When the job finishes, you will see “Success” in the popup form.
Querying Data
- Navigate to Querying > Query Workbench. The default query is `*:*`, which should bring up three documents.
- For the document with title “SpaceX Successfully Launches its First Crewed Spaceflight”, select Show fields. You will see a number of entities listed under the `text_eng_REX_*` field names.
- Search on these multivalued fields. For example, set your query to `text_eng_REX_LOCATION:"New York"` to return the article that mentions New York.
Customization (Advanced)
When setting up the Solr configuration, you specified the `rootDirectory` and `fields` options in your processor chain. REX provides a number of other configuration options you can set to control how entities are extracted. For example, if you are finding false positives, you can set parameters instructing REX to return only entities above a confidence threshold. The confidence threshold is a value between 0 and 1 and applies to entities extracted by the statistical model. We recommend starting with a low value, around 0.2. In your `solrconfig.xml` file, add the options `calculateConfidence` and `confidenceThreshold` to your processor chain definition. Save the changes, re-index your data, and perform the same query on `*:*`. Note that for the SpaceX article, “Falcon” is now correctly omitted from the list of LOCATION entities.
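Following the `<str>` option format used for the other REX options in this section, the two options might be added to the processor definition like this (the `true` value for `calculateConfidence` is an assumption):

```xml
<!-- Only keep statistical-model entities above the confidence threshold -->
<str name="calculateConfidence">true</str>
<str name="confidenceThreshold">0.2</str>
```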
Gazetteers
A gazetteer is a UTF-8 text file in which the first line is the entity type, followed by the names of the entities you wish to extract, one per line, in the language of your documents. Comments can be prefixed by the # symbol. Create a file `spacecraft_gaz.txt` listing the spacecraft names you want to extract.

Regular expressions

REX uses the Tcl regex format. Create a file `zulu_time_regex.xml` defining an entity type `ZULU_TIME` that extracts all spans consisting of a 4-digit military time unit followed by the time zone designator UTC or GMT.

Example
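As a sketch, the two files might contain the following. The spacecraft names are illustrative assumptions, and the XML wrapper shown for the regex file is a placeholder; check the REX documentation for the exact regex-file schema.

```
# spacecraft_gaz.txt — the first line is the entity type
SPACECRAFT
Falcon 9
Crew Dragon
Soyuz
```

```xml
<!-- zulu_time_regex.xml — placeholder schema; see the REX docs -->
<regexps>
  <!-- ZULU_TIME: a 4-digit military time followed by UTC or GMT -->
  <regexp type="ZULU_TIME">[0-2][0-9][0-5][0-9] ?(UTC|GMT)</regexp>
</regexps>
```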
To instruct REX to use the gazetteer and regex file, edit your `solrconfig.xml` file. The `addGazetteers` option takes four parameters:

- language
- file
- accept (`True`) or reject (`False`)
- case-sensitive (`True` or `False`)
<str name="addGazetteers">eng,/path/to/spacecraft_gaz.txt,True,True</str>
The parameters in this example are:

language | file | accept | case-sensitive |
---|---|---|---|
eng | /path/to/spacecraft_gaz.txt | True | True |
The `addRegularExpressions` option takes two parameters:

- file
- accept (`True`) or reject (`False`)
<str name="addRegularExpressions">/path/to/zulu_time_regex.xml,True</str>
The parameters in this example are:

file | accept |
---|---|
/path/to/zulu_time_regex.xml | True |
Save your changes to the `solrconfig.xml` file and re-index your data. When you re-run the query `*:*`, the SpaceX document will have new entities listed in the `text_eng_REX_SPACECRAFT` and `text_eng_REX_ZULU_TIME` dynamic fields.

Additional Fusion deployment configurations are needed to use the REX gazetteer and regex options.
Using Multilingual Search (RBL)
RBL provides a set of linguistic tools to prepare your data for analysis. Language-specific models provide base forms (lemmas) of words, parts-of-speech tagging, compound components, normalized tokens, stems, and roots. In this tutorial, we will index and query headlines in English, Chinese, and German to demonstrate the linguistics capabilities of RBL: lemmatization, tokenization, and decompounding.

Create Application

To begin, create a new application called “multilingual”.

Configuration
Edit Solr Configuration
We will begin by adding the Basis library elements to the `solrconfig.xml` file.

- Navigate to System > Solr Config to edit the `solrconfig.xml` file.
- In the `<lib/>` directive section, add the Basis library directives. For Fusion 4.x users, the `dir` path is the local RBL installation path.
- Save your changes to `solrconfig.xml`.
Edit Schema
Add a `fieldType` element for each language to be processed by the application. The `fieldType` element includes two analyzers: one for indexing documents and one for querying documents. Each analyzer contains a tokenizer and a token filter. The `language` attribute is set to the language code, equal to the ISO 639-3 code in most cases. The `rootDirectory` points to the RBL directory.

- Navigate to System > Solr Config to edit the `managed-schema.xml` file.
- In the fieldType section, add three new field types: `basis_english`, `basis_chinese`, and `basis_german`.
You can incorporate any additional Solr filters you need, such as the Solr lowercase filter. However, filters should be added into the chain after the Base Linguistics token filter. If you modify the token stream too significantly before RBL, you degrade its ability to analyze the text.
- Save your changes to `managed-schema.xml`.
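As a sketch, the `basis_english` field type might look like the following. The factory class names should be verified against the classes shipped with your Advanced Linguistics package, and the `rootDirectory` path is an assumption; only the overall structure (two analyzers, each with a tokenizer and token filter, plus the `language` and `rootDirectory` attributes) comes from this tutorial.

```xml
<fieldType name="basis_english" class="solr.TextField" positionIncrementGap="100">
  <!-- language is the ISO 639-3 code; rootDirectory points to the RBL directory -->
  <analyzer type="index">
    <tokenizer class="com.basistech.rosette.lucene.BaseLinguisticsTokenizerFactory"
               language="eng" rootDirectory="/path/to/rbl"/>
    <filter class="com.basistech.rosette.lucene.BaseLinguisticsTokenFilterFactory"
            language="eng" rootDirectory="/path/to/rbl"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="com.basistech.rosette.lucene.BaseLinguisticsTokenizerFactory"
               language="eng" rootDirectory="/path/to/rbl"/>
    <filter class="com.basistech.rosette.lucene.BaseLinguisticsTokenFilterFactory"
            language="eng" rootDirectory="/path/to/rbl"/>
  </analyzer>
</fieldType>
```

The `basis_chinese` and `basis_german` field types follow the same pattern with `language="zho"` and `language="deu"`.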
Define Fields
The data file we will use, multilingual_headlines.csv, contains fields for headlines in three languages: `eng_headline`, `zho_headline`, and `deu_headline`. The analysis chain requires a field definition with a `type` attribute that maps to the `fieldType` you defined in the schema.

To create new fields, navigate to Collections > Fields and click Add a Field. Create the following fields:

Field name | Field type | Other options |
---|---|---|
text_eng | basis_english | Use default options. |
text_zho | basis_chinese | Use default options. |
text_deu | basis_german | Use default options. |
Be sure to save each field after creating it.
Indexing Data
Create Indexing Pipeline
- Navigate to Indexing > Indexing Pipelines.
- Click Add and create a new pipeline called `test-multilingual`.
- Select the Field Mapping stage.
- In the Field Translations section, add three new rows:

Source field | Target field | Operation |
---|---|---|
eng_headline | text_eng | move |
zho_headline | text_zho | move |
deu_headline | text_deu | move |

- Save the new pipeline.
Create Datasource
In this step, we will upload and index our documents from the data file.

- Navigate to Indexing > Datasources.
- Click Add and select File Upload from the dropdown menu.
- Enter `multilingual_headlines` for the Datasource ID, or use a name you prefer.
- Select `test-multilingual` for the Pipeline ID.
- In the File Upload field, choose the sample file `multilingual_headlines.csv` and click Upload File. The File ID field is automatically populated. Leave all other values at their defaults.
- Save the new datasource. The form refreshes, adding a set of buttons at the top.
- Click Run, then Start. When the job finishes, you will see “Success” in the popup form.
Querying Data
- Navigate to Querying > Query Workbench. The default query is `*:*`, which should bring up ten documents.
- Follow the examples in the subsections below to see how Fusion’s Advanced Linguistics capabilities can improve your search results.
Lemmatization
A “lemma” is the canonical form of a word: the version you find in the dictionary. For example, the lemma of “mice” is “mouse”, and the words “speaks”, “speaking”, “spoke”, and “spoken” all share the same lemma, “speak”. With RBL, you can search by lemma, increasing your search results. This example demonstrates the practice with the words “knife” and “knives”.

- For ease of viewing results, select the Display Fields dropdown and enter `text_eng` in the Description field.
- Enter the query `text_eng:knife` in the search box.

One of the results contains the exact token `knife`. With a standard Solr text field type, this would be the only result returned. However, the `basis_english` type we configured allows the search engine to recognize “knives” as a form of “knife”, so the article “The Best Ways to Sharpen Kitchen Knives at Home” is also returned. RBL can significantly reduce your dependence on creating, maintaining, and using large synonym lists.

Tokenization
Tokenization is the process of separating a piece of text into smaller units called “tokens”. Tokens can be words, characters, or subwords, depending on how they are defined and analyzed. The RBL tokenizer first determines sentence boundaries, then segments each sentence into individual tokens. The most useful tokens are often words, though they may also be numbers or other characters.

In some languages, such as Chinese and Japanese, word tokens are not separated by whitespace, and words can consist of one, two, or more characters. For example, the tokens in 我喜歡貓 (I like cats) are 我 (I), 喜歡 (like), and 貓 (cats). RBL uses statistical models to identify token boundaries, allowing for more accurate search results.

- For ease of viewing results, select the Display Fields dropdown and enter `text_zho` in the Description field.
- Enter the query `text_zho:美國` (United States) in the search box.

With naive single-character matching, a query on 美 (beautiful) would trigger a false-positive match, even though it is not a word in this context. With the advanced analysis we have configured here, the query `text_zho:美` correctly returns zero results.

Compounds
RBL can decompose Chinese, Danish, Dutch, German, Hungarian, Japanese, Korean, Norwegian, and Swedish compounds, returning the lemmas of each of the components. The lemmas may differ from their surface forms in the compound, such that the concatenation of the components is not the same as the original compound (or its lemma). Components are often connected by elements that are present only in the compound form. RBL allows Solr to index and query on these components, increasing the recall of search results.

- For ease of viewing results, select the Display Fields dropdown and enter `text_deu` in the Description field.
- Enter the query `text_deu:Land` in the search box.

A headline containing only a compound of `Land` (country) would not trigger a match with a standard Solr text field type. However, because RBL performs decompounding with lemmatization, searching on `Heimat` or `Land` will return a result.

Customization (Advanced)
When setting up the Solr configuration, you specified the `language` and `rootDirectory` options in your field type definition. This is sufficient for most use cases, but RBL provides more options to control the behavior of the tokenizer and analyzer. For example, the default tokenization does not consider URLs, so https://lucidworks.com is tokenized as `https`, `lucidworks`, and `com`. If you wish to recognize URLs, add the option `urls="true"` to the tokenizer in your field type definition; this instructs RBL to treat a URL such as https://lucidworks.com as a single token.
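As a sketch, a tokenizer line with this option added might look like the following (the factory class name and `rootDirectory` path are placeholders for the values already in your schema):

```xml
<!-- urls="true" keeps URLs such as https://lucidworks.com as one token -->
<tokenizer class="com.basistech.rosette.lucene.BaseLinguisticsTokenizerFactory"
           language="eng" rootDirectory="/path/to/rbl" urls="true"/>
```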
To see a list of all options, consult the full RBL documentation.

New Features
- Managed Fusion only: Added a new feature called Dynamic Pricing which improves scalability for custom pricing. This feature lets B2B organizations with large product and pricing inventories sort, facet, boost, and filter on custom prices and entitlements.
LucidAcademy: Lucidworks offers free training to help you get started. The course for Dynamic Pricing focuses on how Dynamic Pricing maximizes custom pricing strategies. Visit the LucidAcademy to see the full training catalog.
- Managed Fusion only: Fusion now supports Reverse Search, which lets you set up monitoring queries that automatically include new documents. Instead of running a query multiple times to see if new documents have been added, this feature matches incoming documents to existing relevant queries, improving content awareness and productivity.
- Tika Asynchronous Parsing improves document crawl speeds and prevents memory and stability issues during connector processes. You can use Tika Asynchronous Parsing to separate document crawling from document parsing, which is useful for large sets of complex documents. For more information, see Asynchronous Tika Parsing.
Use Tika Asynchronous Parsing
This document describes how to set up your application to use Tika asynchronous parsing. Unlike synchronous Tika parsing, which uses a parser stage, asynchronous Tika parsing is configured in the datasource and index pipeline. For more information, see Asynchronous Tika Parsing.

Field names change with asynchronous Tika parsing. In contrast to synchronous parsing, asynchronous Tika parsing prepends `parser_` to fields added to a document. System fields, which start with `_lw_`, are not prepended with `parser_`. If you are migrating to asynchronous Tika parsing and your search application configuration relies on specific field names, update your search application to use the new fields.

Configure the connectors datasource
- Navigate to your datasource.
- Enable the Advanced view.
- Enable the Async Parsing option.

The asynchronous parsing service performs Tika parsing using Apache Tika Server. In Fusion 5.8 through 5.9.10, other parsers, such as HTML and JSON, are not supported by the asynchronous parsing service, and enabling asynchronous parsing causes the parser configuration linked to your datasource to be ignored. In Fusion 5.9.11 and later, other parsers, such as HTML and JSON, are supported, and the parser configuration linked to your datasource is used.
- Save the datasource configuration.
Configure the parser stage
You must do this step in Fusion 5.9.11 and later.
- Navigate to Parsers.
- Select the parser, or create a new parser.
- From the Add a parser stage menu, select Apache Tika Container Parser.
- (Optional) Enter a label for this stage. This label changes the stage name from Apache Tika Container Parser to the value you enter in this field.
- If the Apache Tika Container Parser stage is not already the first stage, drag and drop the stage to the top of the stage list so it is the first stage that runs.
Configure the index pipeline
- Go to the Index Pipeline screen.
- Add the Solr Partial Update Indexer stage.
- Turn off the Reject Update if Solr Document is not Present option and turn on the Process All Pipeline Doc Fields option.
- Include an extra update field in the stage configuration using any update type and field name. In this example, an incremental field `docs_counter_i` with an increment value of `1` is added.
- Enable the Allow reserved fields option.
- Click Save.
- Turn off or remove the Solr Indexer stage, and move the Solr Partial Update Indexer stage to be the last stage in the pipeline.
The Apache Tika and Forked Tika stages are now deprecated. Follow the migration steps to begin using asynchronous parsing.
Improvements
- Improved recoverability for on-prem connectors in high network traffic environments.
- Fusion’s custom Solr image has been updated to fusion-solr 5.8.0. This upgrade includes the benefits and new features of Solr 9, along with custom plugins to support Dynamic Pricing, Reverse Search, and autoscaling.
- Developed new authentication methods for the MongoDB connector.
Bug Fixes
Fusion
- Fixed a bug where the indexing service failed to load some classes from some JDBC drivers.
- Updated the Helm charts used when deploying Prometheus, Grafana, Loki, and Promtail for monitoring.
- Fixed an error with permissions required for the Upload Model Parameters To Cloud job.
- The Graph Security Trimming stage now works when collections have multiple shards and replicas.
- Fixed a bug where having the same document updated twice in the same job could cause the job to hang.
- Fixed an issue where the Solr API was unable to pass through raw requests using the proxy.
- Updated the query pipeline and indexing container base images to use Java 11 so they are more secure.
- Removed UI link to view logs dashboard as its target screen is no longer available.
- Fixed a UI bug where zone display fields could not be manually removed.
- Fusion panel text editors can now scroll as expected in Firefox.
Predictive Merchandiser
- Fixed a bug in Predictive Merchandiser where higher-precedence templates using a specified trigger phrase and facet did not appear when that phrase was searched with that facet selected.
Deprecations
- Field Parser Index Stage is no longer used by Fusion connectors. It is officially deprecated in this release and will be removed entirely in a later release.
- Streaming documents to the `/index` and `/reindex` endpoints of the Index Pipelines API is deprecated and will eventually stop working altogether in the continuing switch to asynchronous parsing.
- Tika Server Parser is deprecated and will be replaced by Tika Asynchronous Parser.
- Apache Tika Parser stage is deprecated and will be removed in a later release.
- The Forked Apache Tika Parser stage is deprecated and will be removed completely in a later release.
Known issues
- New Kerberos security realms cannot be configured successfully in this version of Fusion.
- When using the JavaScript query stage to query Solr, you must provide parameters, including `rows`. Previously, `rows` accepted an integer, but it must now be entered as a string, as in `("rows", "1")`.