Troubleshoot failed datasource jobs

Table of Contents

Overview
Identify the cause
Adjust indexing settings
- Adjust retention parameters
- Adjust fetch settings

When indexing large files, or large quantities of files, you may encounter issues such as datasource jobs failing or documents not making it into Fusion.

Overview

When data flows into Fusion, it passes through a Kafka topic first. When a connector creates a large number of documents, or pulls data into the Kafka topic faster than it can be indexed, the topic fills up and the datasource job fails. For example, if your connector is ingesting a large CSV file where every row is imported as a separate Solr document, the indexing process can time out before the file is fully ingested.

Identify the cause

If you experience failed datasource jobs or notice your connector isn’t retrieving all the documents it should, check the logs for the Kafka pod. Look for a message containing the phrases "resetting offset" and "is out of range", which together indicate that data has been dropped. For example:

2024-05-28T11:49:40.812Z - INFO  [pool-140-thread-3:org.apache.kafka.clients.consumer.internals.Fetcher@1413] - [Consumer clientId=example_Products-irdcsn, groupId=index-pipeline--example_Products--fusion.connectors.datasource-products_S3_Load] Fetch position FetchPosition{offset=6963199, offsetEpoch=Optional[0], currentLeader=LeaderAndEpoch{leader=Optional[fusion5-kafka-0.fusion5-kafka-headless.fusion5.svc.cluster.local:9092 (id: 0 rack: null)], epoch=0}} is out of range for partition fusion.connectors.datasource-products_S3_Load-2, resetting offset
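
If you have saved the Kafka pod logs to a file, a small script can pick out these messages for you. The following is a minimal sketch, not an official Fusion tool: the names OffsetReset and find_offset_resets are hypothetical, and the regular expression is modeled only on the sample message above, so other Kafka versions may need a different pattern.

```python
import re
from typing import Iterable, Iterator, NamedTuple


class OffsetReset(NamedTuple):
    """One dropped-data event parsed from a Kafka consumer log line."""
    offset: int
    partition: str


# Matches the two phrases named above and captures the fetch offset and the
# partition name. Assumption: the message layout follows the sample log line;
# adjust the pattern if your logs are formatted differently.
_RESET_PATTERN = re.compile(
    r"offset=(?P<offset>\d+).*?"
    r"is out of range for partition (?P<partition>\S+?),? resetting offset"
)


def find_offset_resets(lines: Iterable[str]) -> Iterator[OffsetReset]:
    """Yield an OffsetReset for every log line that reports an out-of-range offset."""
    for line in lines:
        match = _RESET_PATTERN.search(line)
        if match:
            yield OffsetReset(int(match["offset"]), match["partition"])
```

To use it, pass any iterable of log lines, such as an open file handle for the saved pod log. Each result names the affected partition, which includes the datasource name and so shows which datasource lost data.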

Adjust indexing settings

If you determine that your datasource job is failing due to an issue in Kafka, there are a few options to try.

Adjust retention parameters

One solution is to increase the Kafka data retention parameters to allow for larger documents. You can configure these settings in your values.yaml file in the Helm chart.

The default value for kafka.logRetentionBytes is 1073741824 bytes (1 GB).

Try increasing this value to 2147483648 bytes (2 GB) or 3221225472 bytes (3 GB), or larger depending on the size of your documents.

In Fusion 5.9.5, the default value is increased to 5 GB.

You can also set this to -1 to remove the size limit. If you do this, be sure to set an appropriate limit for kafka.logRetentionHours instead.

The default value for kafka.logRetentionHours is 168 (7 days).

If you increase kafka.logRetentionBytes by a significant amount (for example, 20 GB), you might need to decrease kafka.logRetentionHours to prevent running out of disk space. However, because older log entries are deleted when either limit is reached, you should set it high enough to ensure the data remains available until it’s no longer needed.

After updating the values, go to Indexing > Datasources in Fusion and create a new datasource to trigger a new Kafka topic that incorporates these settings.
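
The byte values above are multiples of 1073741824 (1024 cubed), so they are easy to get wrong by hand. The following minimal sketch computes them and prints a values.yaml override. The helper names are hypothetical, and nesting both keys under a top-level kafka block is an assumption inferred from the dotted setting names; check your chart's values.yaml for the actual layout.

```python
GIB = 1024 ** 3  # 1073741824 bytes, the documented 1 GB default


def retention_bytes(gigabytes: int) -> int:
    """Convert a size in GB (as used in this article) to bytes."""
    if gigabytes <= 0:
        raise ValueError("use a positive size; -1 (no limit) is set directly in values.yaml")
    return gigabytes * GIB


def values_override(retention_gb: int, retention_hours: int = 168) -> str:
    """Render a values.yaml fragment for the two retention settings.

    Assumption: both keys sit under a top-level `kafka:` block.
    """
    return (
        "kafka:\n"
        f"  logRetentionBytes: {retention_bytes(retention_gb)}\n"
        f"  logRetentionHours: {retention_hours}\n"
    )


if __name__ == "__main__":
    # Example: raise the size limit to 2 GB and keep the 7-day default.
    print(values_override(2))
```

Run as a script, this prints a fragment with logRetentionBytes set to 2147483648 and logRetentionHours left at 168.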

Adjust fetch settings

Another option is to decrease the values for number of fetch threads and request page size in your datasource settings.

In Fusion, go to Indexing > Datasources and click your datasource.
Click the Advanced slider to show more settings.
Reduce the number of Fetch Threads.
Reduce the Request Page Size.

These settings might not be available in every connector.