Couchbase V1

Compatible with Fusion version: 4.0.0 through 4.2.6

Deprecation and removal notice

This connector is deprecated as of Fusion 4.2 and is removed or expected to be removed as of Fusion 5.0. Due to a dependency issue, the Couchbase V1 connector does not work as expected in Fusion 4.2. Use the Couchbase V2 connector instead.

For more information about deprecations and removals, including possible alternatives, see Deprecations and Removals.

This connector has been tested for compatibility with Couchbase Server 2.5.1 Enterprise Edition.

Learn more

Configure Field Mapping for Couchbase

The Couchbase V1 connector uses the Cross-Datacenter Replication (XDCR) feature of Couchbase to retrieve data stored in Couchbase continuously and in real time.

The Couchbase connector has built-in field mapping, which maps Couchbase fields to fields in your schema. Each mapping defines a field from your schema and an XPath-style path to the field in the Couchbase JSON document.

A mapping can use a wildcard (*) or a double wildcard (**) to map fields automatically. Wildcards are allowed only at the end of the path definition.

field_name="" and field_path=/docs/* - maps all the fields under /docs to the same names as given in the JSON.
field_name="" and field_path=/docs/** - maps all the fields under /docs and their child fields to the same names as given in the JSON.
field_name=searchField and field_path=/docs/* - maps all the fields under /docs to a single field named searchField.
field_name=searchField and field_path=/docs/** - maps all the fields under /docs and their child fields to a single field named searchField.
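The wildcard rules above can be sketched in Python. This only illustrates the matching semantics described in the list and is not the connector's implementation. The helper names leaf_paths and path_matches and the sample document are hypothetical, and the sketch assumes that only scalar leaf values count as fields and that array elements share their parent's path.

```python
def leaf_paths(node, prefix=""):
    """Yield (path, value) for every scalar leaf in a JSON-like structure.

    Assumption: elements of a list share the path of the list itself.
    """
    if isinstance(node, dict):
        for key, value in node.items():
            yield from leaf_paths(value, f"{prefix}/{key}")
    elif isinstance(node, list):
        for item in node:
            yield from leaf_paths(item, prefix)
    else:
        yield prefix, node


def path_matches(field_path, path):
    """Return True if a concrete path is selected by a mapping's field_path."""
    if field_path.endswith("/**"):
        # Double wildcard: any descendant of the base path.
        base = field_path[:-3]
        return path.startswith(base + "/")
    if field_path.endswith("/*"):
        # Single wildcard: direct children of the base path only.
        base = field_path[:-2]
        rest = path[len(base) + 1:]
        return path.startswith(base + "/") and "/" not in rest
    # No wildcard: the path must match exactly.
    return path == field_path


# Hypothetical sample document, used only for illustration.
doc = {"docs": {"title": "Intro", "author": {"name": "Ann"}}}
paths = [path for path, _ in leaf_paths(doc)]

single = [p for p in paths if path_matches("/docs/*", p)]
double = [p for p in paths if path_matches("/docs/**", p)]
```

With this sample, single contains only /docs/title, while double also contains /docs/author/name.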

If no mapping is defined, a default mapping is assigned in the format of the second example above, that is, field_name="" and field_path=/docs/**.

This example shows some simple field mapping, using a single document such as this:

{
  "first": "John",
  "last": "Doe",
  "grade": 8,
  "exams": [
    {
      "subject": "Maths",
      "test": "term1",
      "marks": 90
    },
    {
      "subject": "Biology",
      "test": "term1",
      "marks": 86
    }
  ]
}

When we configure the datasource, we can define our field mapping as follows:

"field_mapping": [
  {
    "field_name": "points_i",
    "field_path": "/exams/marks"
  },
  {
    "field_name": "",
    "field_path": "/**"
  }
]

Two mappings are defined. The first maps the /exams/marks field from Couchbase to the points_i field in Solr. The second maps all top-level and child fields from Couchbase to either the same field name in Solr or to a dynamic field rule.

After retrieval, the document should look like this:

{
  "first_s": "John",
  "last_s": "Doe",
  "grade_i": 8,
  "exams": [
    {
      "subject_s": "Maths",
      "test_s": "term1",
      "points_i": 90
    },
    {
      "subject_s": "Biology",
      "test_s": "term1",
      "points_i": 86
    }
  ]
}

The marks field from the original document has been mapped to the points_i field; most of the other fields have been mapped to appropriate dynamic field rules.

Note that the representation above shows the document after it has been retrieved from Couchbase, but before it has been processed by the index pipelines. Because the index pipelines contain several stage types that can further transform the document, such as the Apache Tika Parser stage and the Field Mapping stage, the document that is indexed to Solr may differ from this representation. Run a few small test crawls to confirm that documents are indexed as required.

Split Couchbase Documents

Because Couchbase has a flexible data model, documents may have a nested JSON structure. Nested documents can be split with the splitpath property, which takes an XPath-style path to the element to split on. These paths do not accept wildcards.

For example, if you have a document that looks like this:

{
  "first": "John",
  "last": "Doe",
  "grade": 8,
  "exams": [
    {
      "subject": "Maths",
      "test": "term1",
      "marks": 90
    },
    {
      "subject": "Biology",
      "test": "term1",
      "marks": 86
    }
  ]
}

To split this document on the exams element and create two documents, each with a different subject, define "splitpath":"/exams" in the datasource definition.

When using the Fusion UI to configure the datasource, enter the path without quotes.

The output from retrieving the document should look like this:

{
  "first": "John",
  "last": "Doe",
  "grade": 8,
  "exams": [
    {
      "subject": "Maths",
      "test": "term1",
      "marks": 90
    }
  ]
},
{
  "first": "John",
  "last": "Doe",
  "grade": 8,
  "exams": [
    {
      "subject": "Biology",
      "test": "term1",
      "marks": 86
    }
  ]
}
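The split shown above can be sketched in Python. This is a minimal illustration of the behavior described in this section, not the connector's code. The function name split_document is hypothetical, and the sketch assumes that every path element before the last one is a JSON object and that a document is left unsplit when the target element is not an array.

```python
import copy


def split_document(doc, splitpath):
    """Return one copy of doc per element of the array found at splitpath."""
    keys = [key for key in splitpath.split("/") if key]

    # Walk to the object that holds the element to split on.
    parent = doc
    for key in keys[:-1]:
        parent = parent[key]
    items = parent[keys[-1]]

    # Assumption: a non-array target means there is nothing to split.
    if not isinstance(items, list):
        return [copy.deepcopy(doc)]

    results = []
    for index in range(len(items)):
        clone = copy.deepcopy(doc)
        target = clone
        for key in keys[:-1]:
            target = target[key]
        # Keep only the element at this position in the cloned array.
        target[keys[-1]] = [target[keys[-1]][index]]
        results.append(clone)
    return results


student = {
    "first": "John",
    "last": "Doe",
    "grade": 8,
    "exams": [
        {"subject": "Maths", "test": "term1", "marks": 90},
        {"subject": "Biology", "test": "term1", "marks": 86},
    ],
}

parts = split_document(student, "/exams")
```

Here parts holds two documents, one with only the Maths exam and one with only the Biology exam, and the original student dictionary is left unchanged.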

Tune a Couchbase Datasource

Because the Couchbase connector retrieves data continuously, two properties are available to control the frequency of commits to Solr, which is what makes the documents available for user queries. The properties define the maximum number of documents to queue for a commit (50,000 by default) and the maximum amount of time to wait between commits (120 seconds, or 2 minutes, by default).

Documents are committed as soon as either threshold is reached. If 2 minutes have passed and only 20,000 documents are queued, a commit occurs. Similarly, if only 1 minute has passed and 50,000 documents are queued, a commit occurs. Adjust these properties to suit your own requirements.
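The commit rule can be expressed as a small Python check. This is a sketch of the logic as described here, not the connector's code; the function and parameter names are hypothetical, and only the default values (50,000 documents, 120 seconds) come from the text.

```python
def should_commit(queued_docs, seconds_since_commit,
                  max_docs=50_000, max_seconds=120):
    """Return True when either commit threshold has been reached."""
    return queued_docs >= max_docs or seconds_since_commit >= max_seconds


# The two cases from the text, plus one where neither threshold is met.
time_limit_reached = should_commit(queued_docs=20_000, seconds_since_commit=120)
doc_limit_reached = should_commit(queued_docs=50_000, seconds_since_commit=60)
neither_reached = should_commit(queued_docs=20_000, seconds_since_commit=60)
```

The first two calls return True and the third returns False. Lowering either limit makes documents searchable sooner at the cost of more frequent commits.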

Configuration

When entering configuration values in the UI, use unescaped characters, such as \t for the tab character. When entering configuration values in the API, use escaped characters, such as \\t for the tab character.
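One way to read this rule is that an API request body is JSON, where a literal backslash must itself be escaped. The Python sketch below illustrates that reading under stated assumptions: the property name delimiter is hypothetical and does not refer to an actual connector property.

```python
import json

# The value as it would be typed in the UI: a backslash followed by "t".
ui_value = r"\t"

# Serializing to JSON escapes the backslash, so the API body carries \\t.
api_body = json.dumps({"delimiter": ui_value})
print(api_body)  # {"delimiter": "\\t"}

# Parsing the body recovers the original two-character value.
round_trip = json.loads(api_body)["delimiter"]
```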
