This migration guide provides connector-specific tips to consider when migrating from the FTP V1 connector to the FTP Pro connector. Use this migration guide alongside the general migration guide. The general guide contains migration instructions that apply to all connectors, such as creating an isolated collection for testing and comparing search results for the V1 and Pro connectors.

Prerequisites

Before you migrate your datasource from the FTP V1 connector to the FTP Pro connector, verify that you have the following:
  • Access credentials for your FTP server
  • Fusion 5.9.0 or later
If you are using a Fusion API in any step of the migration, note that the formatting of many property names has changed in the Pro connector. Key changes include:
  • max_bytes is now maxSizeBytes
  • max_docs is now maxItems
  • url has been split into server and paths properties
Consult the reference pages for the FTP V1 and FTP Pro connectors for all property names.
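If you script the migration, the renames listed above can be applied mechanically. The following is a minimal sketch, assuming the V1 properties are available as a flat Python dictionary. The helper name and the dictionary shape are hypothetical and are not part of any Fusion API. Only the two one-to-one renames named in this guide are included, and the url property passes through unchanged because it must be split rather than renamed.

```python
# Hypothetical helper: rename V1 property keys to their Pro equivalents.
# Only the one-to-one renames named in this guide are listed here.
V1_TO_PRO_NAMES = {
    "max_bytes": "maxSizeBytes",
    "max_docs": "maxItems",
}


def rename_v1_properties(v1_props):
    """Return a copy of v1_props with known keys renamed; other keys pass through."""
    return {V1_TO_PRO_NAMES.get(key, key): value for key, value in v1_props.items()}


# Example input (illustrative values only).
renamed = rename_v1_properties(
    {"max_bytes": 10485760, "max_docs": 500, "url": "ftp://ftp.example.com:21/"}
)
print(renamed)
```

Check the connector reference pages for the remaining property names before relying on a mapping like this one.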

Make a plan for removed fields in the Pro connector

The FTP Pro connector does not reproduce the field mapping settings from the V1 connector. In the V1 connector, these settings provided mapping of fields before the documents were sent to an index pipeline. These fields display in the Field Mapping section when the advanced settings are displayed. These fields and settings must be manually recreated as stages in your Fusion index pipeline. The following table displays the relevant fields in the V1 connector.
| V1 field block | Migration action required |
| --- | --- |
| Initial Field Mapping | Move all rules to a Field Mapping stage in your index pipeline. |
| Field Retention | Use a Restrict Fields or Field Mapping stage to delete fields. |
| Field Value Updates | Use a Set Property stage in the pipeline. |
| Field Translations | Use a Field Mapping stage to copy/move fields. |
| Unmapped Fields | Re-implement using a Field Mapping “Default” behavior. |
The Crawl Bounds field has been removed in the FTP Pro connector. This field limited crawls to a specific directory sub-tree, hostname, or domain. To replicate this behavior in the Pro connector, use the Inclusive regexes and Exclusive regexes fields to control which paths are crawled.

Two other V1 fields are absent from the Pro connector and require no action during migration. The Solr Commit on Finish field has been removed, and the Validate access field is built into the FTP Pro connector as a core feature.
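Before entering patterns in the Inclusive regexes and Exclusive regexes fields, it can help to try them against a few sample paths. The sketch below uses Python's re module for that purpose only. The patterns, the sample paths, and the rule "kept if it matches an inclusive pattern and no exclusive pattern" are assumptions for illustration. The connector's actual matching semantics (for example, whether it matches the path or the full URL, and whether a partial match counts) should be confirmed in the FTP Pro reference.

```python
import re

# Assumed patterns: keep everything under /documents/public/,
# but skip a hypothetical archive sub-folder.
inclusive = [re.compile(r"^/documents/public/.*")]
exclusive = [re.compile(r"^/documents/public/archive/.*")]


def is_crawled(path):
    """Keep a path if it matches an inclusive pattern and no exclusive pattern."""
    included = any(pattern.match(path) for pattern in inclusive)
    excluded = any(pattern.match(path) for pattern in exclusive)
    return included and not excluded


for sample in ["/documents/public/report.pdf",
               "/documents/public/archive/old.pdf",
               "/private/notes.txt"]:
    print(sample, is_crawled(sample))
```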

New fields

There are several new fields available in the FTP Pro connector to improve security, reliability, and performance.

Enhanced authentication options

The Pro connector introduces new authentication methods, particularly for SFTP connections. If you are using an SFTP connection, edit these settings to make sure you are connecting securely to your SFTP server.
  • SSH Public Key Authentication (object)
  • Trusted SSH Host Key Fingerprint (string): SSH host key fingerprint used to verify server identity for SFTP connections. Supports MD5 and SHA256 formats. If empty, verification is disabled, which is not recommended for production environments. To get the MD5 fingerprint from the command line, run: ssh-keyscan hostname | ssh-keygen -E md5 -lf -
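To see how the two fingerprint formats relate to the same host key, the following sketch computes both from the base64 key blob that appears in a public key line. It follows the OpenSSH display conventions (colon-separated hex for MD5, unpadded base64 for SHA256). The function name and the placeholder blob are hypothetical, and whether the connector expects the "MD5:" or "SHA256:" prefix is an assumption to verify against the FTP Pro reference.

```python
import base64
import hashlib


def fingerprints(key_blob_b64):
    """Return (md5, sha256) fingerprints in OpenSSH display style for a base64 key blob."""
    key_blob = base64.b64decode(key_blob_b64)
    md5_hex = hashlib.md5(key_blob).hexdigest()
    # MD5 is shown as 16 colon-separated hex pairs.
    md5 = "MD5:" + ":".join(md5_hex[i:i + 2] for i in range(0, len(md5_hex), 2))
    # SHA256 is shown as base64 with the trailing "=" padding removed.
    sha256_b64 = base64.b64encode(hashlib.sha256(key_blob).digest()).decode("ascii")
    sha256 = "SHA256:" + sha256_b64.rstrip("=")
    return md5, sha256


# Placeholder bytes, not a real host key; substitute the base64 field from your server's key.
placeholder_blob = base64.b64encode(b"not-a-real-host-key").decode("ascii")
md5_fp, sha256_fp = fingerprints(placeholder_blob)
print(md5_fp)
print(sha256_fp)
```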

Additional document filtering

The Pro connector adds new filtering capabilities:
  • Excluded file extensions (array): A set of file extensions to skip during the fetch. This complements the existing Included file extensions field to provide both allow-list and deny-list capabilities.
  • Minimum File Size (integer, default: 0): Specifies the minimum file size, in bytes, for content download. This field complements the existing Maximum File Size field. When set to a positive value, files below the minimum file size are indexed with metadata only. The default (0) means no minimum size.
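The combined effect of these two fields can be sketched as a small decision function. This is an illustration of the behavior described above, not the connector's implementation. The function name, the return labels, and the assumption that extensions are compared without a leading dot and case-insensitively are all hypothetical.

```python
import os


def classify_file(name, size_bytes, excluded_extensions, min_size=0):
    """Return 'skip', 'metadata_only', or 'download' for one remote file."""
    # Assumption: extensions are compared without the leading dot, ignoring case.
    extension = os.path.splitext(name)[1].lstrip(".").lower()
    if extension in excluded_extensions:
        return "skip"
    # A minimum of 0 means no minimum size, as in the field description.
    if min_size > 0 and size_bytes < min_size:
        return "metadata_only"
    return "download"


print(classify_file("cache.tmp", 4096, {"tmp"}))
print(classify_file("note.pdf", 10, {"tmp"}, min_size=1024))
print(classify_file("report.pdf", 2048, {"tmp"}, min_size=1024))
```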

Connection and retry settings

The FTP Pro connector automatically detects and retries the connection when transient errors happen (for example, 4xx connection errors or timeouts) and does not attempt a retry in the case of permanent errors (for example, 5xx connection errors or permission failures). New connection management fields provide better control over timeouts and error handling.
  • Connection Properties (object)
  • Retry Properties (object)
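The distinction between transient and permanent errors mirrors the FTP reply code classes, where 4xx replies are transient failures and 5xx replies are permanent ones. The sketch below illustrates the general retry pattern with Python's standard ftplib exceptions, which raise error_temp for 4xx replies and error_perm for 5xx replies. It is not the connector's retry logic. The function name, attempt count, and delay are hypothetical.

```python
import ftplib
import socket
import time


def call_with_retry(operation, max_attempts=3, delay_seconds=0.0):
    """Retry on transient FTP errors (4xx) and timeouts; fail fast on permanent errors (5xx)."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except (ftplib.error_temp, TimeoutError, socket.timeout):
            # Transient failure: give up only after the last allowed attempt.
            if attempt == max_attempts:
                raise
            time.sleep(delay_seconds * attempt)
        # ftplib.error_perm (5xx) is deliberately not caught, so it propagates at once.


# Simulated operation that fails twice with a transient reply, then succeeds.
attempts = []


def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise ftplib.error_temp("421 Service not available")
    return "ok"


print(call_with_retry(flaky))
```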

Changes to connection settings

The FTP Pro connector has restructured the server configuration settings. In the V1 connector, you provided a single Start link URL (for example, ftp://ftp.example.com:21/example_folder/), and the FTP V1 connector used the URL scheme to detect the connection type. The Pro connector separates the connection settings into discrete fields, and now you explicitly select the connection type.
  • Server Properties (object, required)
  • Paths (array): List of directory or file paths to retrieve. If empty, the crawl starts from the root directory (/). In the V1 connector, you could specify only one starting path in the URL. The Pro connector allows multiple starting paths.
Migration example

V1 configuration:
  url: ftp://ftp.example.com:21/documents/public/

Pro configuration:
  server.protocols: FTP
  server.host: ftp.example.com
  server.port: 21
  paths: ["/documents/public/"]
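If you have many V1 datasources, the split shown in the migration example can be derived from the V1 URL with a standard URL parser. The following is a minimal sketch. The function name is hypothetical, the nested dictionary shape is inferred from the dotted property names in the example, and mapping the URL scheme to an uppercase protocol value is an assumption that is only confirmed here for FTP.

```python
from urllib.parse import urlsplit


def split_v1_url(url):
    """Split a V1 start link URL into Pro-style server and paths settings."""
    parts = urlsplit(url)
    return {
        "server": {
            # Assumption: the protocol value is the uppercased URL scheme.
            "protocols": parts.scheme.upper(),
            "host": parts.hostname,
            # port is None when the URL does not specify one.
            "port": parts.port,
        },
        # An empty path falls back to the root directory.
        "paths": [parts.path or "/"],
    }


pro_settings = split_v1_url("ftp://ftp.example.com:21/documents/public/")
print(pro_settings)
```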

Fetch settings

Some fetch settings that were previously hardcoded into the V1 connector are now exposed and configurable in the Pro connector. These settings control fetch behavior and performance at the system level. To display them, click the Advanced toggle, select the checkbox next to Core Properties, and then select the checkbox next to Fetch Settings. The Fetch Settings section also contains the Async Parsing field, which enables asynchronous parsing for the FTP Pro connector.
The default number of fetch threads has changed from 1 in the V1 connector to 5 in the Pro connector. This means the Pro connector creates five concurrent connections to your FTP server by default. Review your FTP server’s connection limits and adjust the Fetch Threads setting if necessary to avoid connection errors or rate limiting.

Changed default values

Some fields have different default values in the V1 and Pro connectors. Evaluate the new default values, determine whether they make sense for your use case, and change them as needed. The following table lists the fields whose defaults changed, with their values in both connectors.
| Field name | V1 default | Pro default | Notes |
| --- | --- | --- | --- |
| Maximum File Size | max_bytes: 10485760 (10 MB) | maxSizeBytes: 0 (unlimited) | The Pro connector attempts to download files of any size by default. Set an appropriate limit to prevent memory issues. |
| Fetch Threads | max_threads: 1 | numFetchThreads: 5 | The Pro connector opens five times as many concurrent connections by default. Verify that your FTP server can handle this load, or lower the value in the Pro connector. |
| Maximum Level Depth | crawl_depth: -1 (unlimited) | maxDepth: 0 (unlimited) | The value that means "no limit" has changed from -1 to 0. |
The change from a 10MB default file size limit to unlimited is a critical change. Without setting an explicit limit in the Pro connector, you may encounter memory issues when crawling servers that contain large files. Set the Maximum File Size field to an appropriate value for your use case.
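One way to keep V1 behavior during a first test crawl is to start from the V1 defaults and override them only where you have chosen a new value. The sketch below assumes the Pro properties can be treated as a flat Python dictionary, which is a simplification. The helper name is hypothetical, and the actual location of these properties in the Pro configuration should be taken from the connector reference.

```python
# V1 default values from the table above, expressed with the Pro property names.
V1_DEFAULTS_AS_PRO = {
    "maxSizeBytes": 10485760,  # 10 MB, the V1 max_bytes default
    "numFetchThreads": 1,      # the V1 max_threads default
}


def apply_v1_defaults(pro_props):
    """Fill in V1-equivalent defaults; values set explicitly in pro_props win."""
    merged = dict(V1_DEFAULTS_AS_PRO)
    merged.update(pro_props)
    return merged


settings = apply_v1_defaults({"numFetchThreads": 3})
print(settings)
```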