Fusion supports uni-directional, multi-region replication of Solr updates between data centers using Solr’s CrossDC (Cross Datacenter) framework. This feature simplifies geo-redundancy and failover by providing a preconfigured Solr plugin (fusion-crossdc-producer) and a dedicated consumer application (fusion-crossdc-consumer) for replaying updates on the target cluster, helping to ensure high availability and business continuity in distributed or hybrid cloud environments.
This feature is supported in Fusion 5.9.13 and up.
With uni-directional CrossDC, Solr update requests (including indexing, collection, and configset changes) from a primary cluster are mirrored to a secondary cluster using Apache Kafka. You can configure which collections and actions are mirrored.
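As an illustration, the producer side plugs into Solr's update processing chain so that successful updates are forwarded to Kafka. The snippet below is a minimal sketch modeled on the Apache Solr CrossDC producer; the processor class name, Kafka address, and topic name are placeholder assumptions, so verify the exact values shipped with fusion-crossdc-producer for your release.

```xml
<!-- Sketch (solrconfig.xml): add CrossDC mirroring to the update chain.
     Class, host, and topic names below are placeholders. -->
<updateRequestProcessorChain name="mirrorUpdateChain" default="true">
  <processor class="org.apache.solr.update.processor.MirroringUpdateRequestProcessorFactory">
    <str name="bootstrapServers">kafka-source.example.com:9092</str>
    <str name="topicName">fusion-crossdc-updates</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```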
Be sure to review the Limitations section to understand what to expect before you enable this feature.

CrossDC and ConfigSync

Fusion supports ConfigSync in addition to CrossDC. While CrossDC is designed for data replication, ConfigSync is designed for configuration synchronization.
When you use CrossDC, ConfigSync must also be enabled in order to mirror Solr configset data that is modified by Fusion directly in ZooKeeper. CrossDC only mirrors the changes applied using Solr APIs.
Here’s a comparison of the two features:
Feature | CrossDC | ConfigSync
Data replication | ✓ |
Solr collection synchronization | ✓ |
Rules synchronization | ✓ |
Configuration synchronization | | ✓
Blob synchronization | | ✓
Version control (Git) | | ✓
ZooKeeper data | | ✓
Disaster recovery | For search data | For configuration only
Latency reduction | Across geo-distributed users | Not applicable
Typical use case | Global failover, data center redundancy | DevOps config promotion, disaster recovery for Fusion config
For complete details about what ConfigSync manages, see Supported objects in the ConfigSync documentation. These are the objects managed by Solr CrossDC:
  • Solr collections, which can optionally include creating and deleting collections
  • Rules stored in *_query_rewriter and *_query_rewrite_staging collections
  • Any other new Solr collection data that you configure to be synchronized, as explained below

Before you begin

Before you enable Solr CrossDC, your Solr collections must already be in a synchronized state. After you enable CrossDC, synchronization happens automatically. The instructions below explain how to synchronize your collections before enabling this feature.

Configure CrossDC

CrossDC configuration is done in two data centers: the source and target. Follow the detailed steps below.
  1. In the source data center, configure Solr and Kafka.
  2. In the target data center, configure Solr, Kafka, and the Consumer.
  3. Configure Kafka MirrorMaker to connect to Kafka in both data centers.
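Step 3 can be sketched with a Kafka MirrorMaker 2 properties file. The example below is a minimal, hypothetical configuration; the cluster aliases, bootstrap server addresses, topic name, and replication factor are assumptions that must match your environment.

```properties
# Sketch of connect-mirror-maker.properties: replicate the CrossDC topic
# one way, from the source data center's Kafka to the target's.
clusters = source, target
source.bootstrap.servers = kafka-source.example.com:9092
target.bootstrap.servers = kafka-target.example.com:9092

# Uni-directional: only source -> target is enabled.
source->target.enabled = true
source->target.topics = fusion-crossdc-updates
target->source.enabled = false

replication.factor = 3
sync.topic.configs.enabled = true
```

MirrorMaker 2 is started with this file, for example via Kafka's bin/connect-mirror-maker.sh script.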

Metrics and monitoring

Both fusion-crossdc-producer and fusion-crossdc-consumer expose metrics that can be monitored.

Producer metrics

The fusion-crossdc-producer module exposes the following metrics for each source replica in a collection, under the Solr /metrics API endpoint:
Metric name | Description
crossdc.producer.local | Counter representing the number of local documents processed successfully.
crossdc.producer.submitted | Counter representing the number of documents submitted to the Kafka topic.
crossdc.producer.documentSize | Histogram of the processed document size.
crossdc.producer.errors.local | Counter representing the number of local documents processed with an error.
crossdc.producer.errors.submit | Counter representing the number of documents that were not submitted to the Kafka topic because of an exception during execution.
crossdc.producer.errors.documentTooLarge | Counter representing the number of documents that were too large to send to the Kafka topic.
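For example, comparing the local and submitted counters reveals documents that were indexed locally but never made it to Kafka. The snippet below is a sketch; the sample response excerpt, registry key, and core name are illustrative assumptions, since the exact JSON shape of the Solr metrics API response depends on your Solr version.

```python
# Hypothetical excerpt of a /solr/admin/metrics response for one core.
sample = {
    "metrics": {
        "solr.core.products.shard1.replica_n1": {
            "crossdc.producer.local": {"count": 1000},
            "crossdc.producer.submitted": {"count": 998},
            "crossdc.producer.errors.submit": {"count": 2},
        }
    }
}

def submit_backlog(core_metrics):
    """Documents processed locally but not submitted to the Kafka topic."""
    def count(name):
        return core_metrics.get(name, {}).get("count", 0)
    return count("crossdc.producer.local") - count("crossdc.producer.submitted")

core = sample["metrics"]["solr.core.products.shard1.replica_n1"]
print(submit_backlog(core))  # prints 2
```

A persistently growing backlog here, together with a nonzero errors.submit counter, is a signal that mirrored and local collections may be diverging.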

Consumer metrics

The fusion-crossdc-consumer application exposes the following metrics under its /metrics API endpoint. Metrics are returned as JSON with hierarchical keys, where <TYPE> is one of UPDATE, ADMIN, or CONFIGSET:

Counters

Metric name | Description
counters.<TYPE>.input | Number of input messages retrieved from Kafka.
counters.<TYPE>.add | Number of input Add documents (one input message may contain multiple Add documents).
counters.<TYPE>.dbi | Number of input Delete-By-Id commands (one input message may contain multiple DBI commands).
counters.<TYPE>.dbq | Number of input Delete-By-Query commands (one input message may contain multiple DBQ commands).
counters.<TYPE>.collapsed | Number of input requests that were added to other requests to minimize the number of requests sent to Solr.
counters.<TYPE>.handled | Total number of successfully processed output requests sent to Solr.
counters.<TYPE>.failed-resubmit | Number of requests resubmitted to the input queue for retrying (on intermittent failures).
counters.<TYPE>.failed-dlq | Number of requests submitted to the Dead-Letter Queue after failures on multiple retries.
counters.<TYPE>.failed-no-retry | Number of requests dropped due to persistent failures (including inability to send to the DLQ).
counters.<TYPE>.output-errors | Number of errors when sending requests to the target Solr.
counters.<TYPE>.backoff | Number of times the consumer had to back off from processing due to errors.
counters.<TYPE>.invalid-collection | Number of requests sent to an invalid (e.g., nonexistent) collection.

Timers

Metric name | Description
timers.<TYPE>.outputLatency | Dropwizard Timer (meter + histogram) for the latency between the request creation timestamp and the output timestamp. This assumes that the clocks are synchronized between the Producer and Consumer.
timers.<TYPE>.outputTime | Dropwizard Timer for the time to send the processed request to the target Solr.
The Consumer application also exposes a /threads API endpoint that returns a plain-text thread dump of the JVM running the Consumer application.
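As a sketch of how these counters might be consumed in monitoring, the snippet below computes the fraction of input messages that ended up in the Dead-Letter Queue. The sample payload is a hypothetical excerpt; its nesting is an assumption based on the key names documented above, so verify it against your Consumer's actual /metrics output.

```python
import json

# Hypothetical excerpt of the Consumer's /metrics JSON payload.
sample = json.loads("""
{
  "counters": {
    "UPDATE": {"input": 120, "add": 300, "dbi": 4, "handled": 118, "failed-dlq": 2},
    "ADMIN":  {"input": 3, "handled": 3}
  }
}
""")

def dlq_ratio(metrics, msg_type):
    """Fraction of <TYPE> input messages that were sent to the DLQ."""
    c = metrics.get("counters", {}).get(msg_type, {})
    inputs = c.get("input", 0)
    return c.get("failed-dlq", 0) / inputs if inputs else 0.0

print(dlq_ratio(sample, "UPDATE"))  # 2 DLQ'd out of 120 input messages
```

Alerting on a rising DLQ ratio (or on the backoff counter) is a practical way to catch the divergence scenarios described under Limitations before they grow.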

Limitations

The CrossDC feature has some known limitations:
  • Only updates are mirrored, not existing indexes. You should create a copy of each existing collection, with its documents and configset, on the target Solr before you turn on CrossDC.
  • Data loss can lead to divergence. If any of the components in your CrossDC configuration experience an event that causes data loss, the source collections and target collections can potentially diverge. Diverged indexes are not automatically detected or re-synchronized.
  • Document size is limited. The CrossDC Producer module tries to estimate the size of each message and avoid sending messages that are too large to Kafka. It does not split messages that are too large; instead, it rejects them. If this happens after the update has already been processed locally, then the contents of the mirrored collections can diverge. Kafka’s maximum message size is 1MB by default, configured with message.max.bytes.
  • Retries can lead to divergence. The CrossDC Producer module first applies updates locally, and attempts mirroring only if they succeed. If sending a mirrored request fails, the request is retried; if it still fails, it is logged and the message is discarded (or sent to a dead-letter queue). Since the update was already applied locally, this can cause divergence of the local and mirrored collections.
  • Commands can be re-ordered when collapsed. The CrossDC Producer module can optionally preserve exact ordering of updates and deletes sent in a single request, but this negatively affects performance. If you do not need strict ordering of multiple commands in a single request, then you should use collapseUpdates=partial or collapseUpdates=all.
  • Delete-by-Query expansion can lead to divergence. The CrossDC Producer can either mirror Delete-By-Query requests as-is or expand them into individual Delete-By-Id requests (except for *:* which is always sent as-is). In extreme cases this expansion may produce a request that is too large to be mirrored. Delete-By-Query expansion helps to ensure the strict ordering of deletes and updates in the target Solr collection but it may also lead to divergence of the local and mirrored collections if the expansion fails or the resulting request is too large to mirror.
  • Collection creation and deletion requires an existing configset. The CrossDC Producer module can optionally mirror collection creation and deletion requests. However, the target Solr instance must already have the corresponding configset available in ZooKeeper. If it doesn’t, this causes an error when the target collection is created or deleted. The Consumer application may also experience significant slow-downs when it receives update requests to non-existent target collections. These slow-downs affect processing requests for other collections, too.
  • ConfigSet creation and deletion behavior depends on the handler. If you are using MirroringConfigSetsHandler, then new configsets created on the source Solr are mirrored automatically to the target Solr. If you are not using MirroringConfigSetsHandler, then new configsets are not mirrored; you must use ConfigSync or create them manually on the target Solr to avoid an error when the target collection is created or deleted. These errors also impact the performance of the Consumer application.
  • In some cases, admin request whitelisting is needed. If you are using MirroringCollectionsHandler, then all collection admin requests are mirrored. This may not always be desirable if the target Solr cluster is expected to differ or is managed externally (such as by an autoscaling operator). In this case, you should use FusionCollectionsHandler instead, and configure the collectionActionsWhitelist property to restrict the mirrored collection admin requests to only those that are needed.
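As a sketch of the last point, assuming FusionCollectionsHandler is declared like a standard Solr request handler, a whitelist restricting mirrored admin actions might look like the following. The declaration location, the class's package, and the accepted action names are all assumptions here; only the handler and property names come from this document, so consult your Fusion release documentation for the exact syntax.

```xml
<!-- Hypothetical sketch: mirror only collection CREATE and DELETE actions.
     Class package and action-name syntax are placeholders. -->
<requestHandler name="/admin/collections" class="FusionCollectionsHandler">
  <str name="collectionActionsWhitelist">CREATE,DELETE</str>
</requestHandler>
```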