Troubleshoot failed datasource jobs
When indexing large files, or large quantities of files, you may encounter issues such as failed datasource jobs or documents that never make it into Fusion.
Overview
When data flows into Fusion, it passes through a Kafka topic first. When a connector creates a large number of documents, or pulls data into the Kafka topic faster than it can be indexed, the topic fills up and the datasource job fails. For example, if your connector is importing a large CSV file where every row becomes a separate Solr document, the indexing process can time out before the file is fully ingested.
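To confirm that a connector is outpacing the indexing pipeline, you can check the consumer lag on the datasource's topic. The following is a sketch using the kafka-consumer-groups.sh tool that ships with Kafka; the namespace, pod name, and consumer group are taken from the sample log in the next section and will differ in your cluster:

kubectl exec -n fusion5 fusion5-kafka-0 -- \
  kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
  --describe --group index-pipeline--example_Products--fusion.connectors.datasource-products_S3_Load

A large and growing LAG value for the topic's partitions means documents are arriving faster than the indexing pipeline can drain them.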
Identify the cause
If you experience failed datasource jobs or notice that your connector isn't fetching all the documents it should, check the logs for the Kafka pod.
Look for a message containing the phrases resetting offset and is out of range, which indicates that data has been dropped. For example:
2024-05-28T11:49:40.812Z - INFO [pool-140-thread-3:org.apache.kafka.clients.consumer.internals.Fetcher@1413] - [Consumer clientId=example_Products-irdcsn, groupId=index-pipeline--example_Products--fusion.connectors.datasource-products_S3_Load] Fetch position FetchPosition{offset=6963199, offsetEpoch=Optional[0], currentLeader=LeaderAndEpoch{leader=Optional[fusion5-kafka-0.fusion5-kafka-headless.fusion5.svc.cluster.local:9092 (id: 0 rack: null)], epoch=0}} is out of range for partition fusion.connectors.datasource-products_S3_Load-2, resetting offset
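To search for these messages, you can grep the Kafka pod's logs directly. The pod name and namespace below are taken from the sample log above and will vary by deployment:

kubectl logs fusion5-kafka-0 -n fusion5 | grep -E "resetting offset|is out of range"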
Adjust indexing settings
If you determine that your datasource job is failing due to an issue in Kafka, there are a few options to try.
Adjust retention parameters
One solution is to increase the Kafka data retention parameters to allow for larger documents.
You can configure these settings in your values.yaml file in the Helm chart, as shown in the sample snippet after this list.
- The default value for kafka.logRetentionBytes is 1073741824 bytes (1 GB). Try increasing this value to 2147483648 bytes (2 GB) or 3221225472 bytes (3 GB), or larger depending on the size of your documents. In Fusion 5.9.5, the default value is increased to 5 GB. You can also set this to -1 to remove the size limit. If you do this, be sure to set an appropriate limit for logRetentionHours instead.
- The default value for kafka.logRetentionHours is 168 (7 days). If you increase kafka.logRetentionBytes by a significant amount (for example, 20 GB), you might need to decrease this setting to prevent running out of disk space. However, because older log entries are deleted when either limit is reached, set it high enough that the data remains available until it is no longer needed.
- In Fusion, go to Indexing > Datasources and create a new datasource to trigger a new Kafka topic that incorporates these settings.
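For reference, here is a minimal sketch of how these settings might look in values.yaml. The kafka.logRetentionBytes and kafka.logRetentionHours keys are the ones described above; their exact location in your values file depends on your chart version:

kafka:
  logRetentionBytes: 2147483648   # 2 GB; the pre-5.9.5 default is 1073741824 (1 GB)
  logRetentionHours: 168          # 7 days; lower this if you raise the byte limit significantly

After editing the file, apply the change with helm upgrade. The release name, namespace, and chart reference below are placeholders for your own:

helm upgrade <release-name> lucidworks/fusion --namespace <namespace> --values values.yaml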
Adjust fetch settings
Another option is to decrease the number of fetch threads and the request page size in your datasource settings.
- In Fusion, go to Indexing > Datasources and click your datasource.
- Click the Advanced slider to show more settings.
- Reduce the number of Fetch Threads.
- Reduce the Request Page Size. This setting might not be available in every connector.
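You can also read and update most datasource configurations through Fusion's Connector Datasources API instead of the UI. This sketch assumes the standard API path on the API gateway port and uses the datasource ID from the sample log (products_S3_Load); the exact property names for fetch threads and page size vary by connector, so inspect the returned JSON before changing anything:

curl -u USERNAME:PASSWORD https://FUSION_HOST:6764/api/connectors/datasources/products_S3_Load

After editing the returned JSON, send it back with a PUT request to the same endpoint to apply the new values.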