Connectors are offered on two platforms: V1 and V2. All V1 connectors are deprecated and will be removed in a future Fusion release. Migrate to V2 to ensure your application remains compatible with future Fusion upgrades. Address the following considerations before starting a V1-to-V2 connector migration:
  • Connector behavior differences. V1 and V2 connectors are not behaviorally equivalent, and similar settings can produce different crawl or ingestion results.
  • Pipeline and mapping compatibility. V2 does not include the V1 field mapping stage, so any non-default V1 mappings must be recreated outside the connector. System fields, which are prefixed with _lw_, change between V1 and V2, so any pipeline logic that references these fields must be updated.
  • Validation and comparison strategy. Define success criteria before you migrate.
  • Migration setup and testing workflow. Use a sidecar or isolated target to support side by side comparison. Isolate connector output with a minimal pipeline first, then test with the full ingestion pipeline. Identify and fix gaps in non production environments before cutover.
  • Scale and tooling readiness. Migrations involving a large number of V1 connectors require automation or assisted analysis. Ensure connector specific migration guidance is ready for the connectors in scope.
In this guide, you will work through these considerations while applying steps and recommendations tailored to your specific connector type. For connector specific guidance, refer to the documentation for that connector. To locate your connector, see Find your connector.

Connector behavior differences

V2 connectors can behave differently than V1 connectors. V2 connectors also receive more frequent updates because V1 connectors are deprecated and planned for removal. Plan for both major and minor differences during migration. To identify differences, compare configuration options side by side. Use the connector configuration specification documentation, or export both configurations in JSON format from the API and compare them.
curl -u USERNAME:PASSWORD https://FUSION_HOST:FUSION_PORT/api/connectors/plugins
The order of fields in the JSON file may not match between V1 and V2 connectors. Use a tool like jq to sort the fields before analyzing:
jq -S . CONFIG_FILE.json
Review the configurations and record any differences you find. Expect changes in field names, default values, and value formats. Defaults matter. If you relied on an implicit default in V1, you might need to set an explicit value in V2 to preserve behavior.
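The export-and-compare workflow above can be sketched end to end. This is a minimal illustration using stand-in files; the field names come from the comparison table in this guide, and the file names are hypothetical placeholders for your real exported configurations.

```shell
# Create two small sample exports (stand-ins for real V1/V2 connector configs)
cat > v1-config.json <<'EOF'
{"obeyRobots": true, "fetchDelayMSPerHost": true, "jsPageLoadTimeout": 20000}
EOF
cat > v2-config.json <<'EOF'
{"jsPageLoadTimeout": 15000, "obeyRobots": false, "fetchDelayMSPerHost": false}
EOF

# Sort keys in both files so the diff reflects real differences, not ordering
jq -S . v1-config.json > v1-sorted.json
jq -S . v2-config.json > v2-sorted.json

# diff exits non-zero when files differ; capture output instead of aborting
diff v1-sorted.json v2-sorted.json || true
```

Record each line the diff reports in your mapping document, then decide per field whether the V2 value needs to change.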
Create a mapping document that tracks the default differences between V1 and V2. For each difference, analyze whether it impacts your application. Decide which defaults need to be changed to the V1 default and which can be left in the V2 default.
| Field | V1 default | V2 default | Summary |
| --- | --- | --- | --- |
| obeyRobots | true | false | V1 respects robots.txt by default, while V2 ignores it. V2 is more aggressive by default and does not respect robots.txt exclusions unless explicitly configured. |
| fetchDelayMSPerHost | true | false | V1 applies the fetch delay per host by default, while V2 applies it globally. This affects how crawl throttling is distributed across hosts during the crawl. |
| jsPageLoadTimeout | 20000 ms (20 sec) | 15000 ms (15 sec) | V2's page load timeout is 25% shorter, so crawls proceed faster, but V2 may be less tolerant of slow-loading pages that need more time to fully render. |
| jsScriptTimeout | 20000 ms (20 sec) | 30000 ms (30 sec) | V2 allows 50% more time for script execution, making it better suited for complex JavaScript applications and single-page applications (SPAs). The V2 documentation also includes timeout recommendations for different site types. |
| requestCounterMinWaitMs | 5000 ms (5 sec) | 500 ms (0.5 sec) | V2 waits one tenth as long as V1 for AJAX requests to start. Crawls are faster, but late-starting AJAX calls may be missed. |
| requestCounterMaxWaitMs | 20000 ms (20 sec) | 3000 ms (3 sec) | V2 waits roughly one seventh as long as V1 for AJAX completion. This significantly speeds up crawls but may cut off slow AJAX operations that have not completed within the shorter window. |
This table highlights key differences between the connectors, but it is not comprehensive.
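When you decide a V2 default should revert to the V1 behavior, you can set the values explicitly in the exported configuration before re-importing it. The sketch below assumes the boolean settings live under a `properties` key in the exported datasource JSON; verify the actual structure of your own export before applying this pattern.

```shell
# Start from a minimal stand-in for an exported V2 datasource config
cat > v2-datasource.json <<'EOF'
{"id": "web-v2", "properties": {"obeyRobots": false, "fetchDelayMSPerHost": false}}
EOF

# Explicitly set the V1 defaults so the V2 crawl preserves V1 behavior
jq '.properties.obeyRobots = true | .properties.fetchDelayMSPerHost = true' \
  v2-datasource.json > v2-datasource-v1-defaults.json

cat v2-datasource-v1-defaults.json
```

Setting values explicitly, rather than relying on defaults, also documents your intent for anyone reviewing the configuration later.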

Pipeline and mapping compatibility

Identify what you must change to preserve behavior in V2. Focus on differences that affect how documents are enriched, routed, and indexed. Avoid adopting entirely new options you discover unless they serve as a replacement for a removed option.

For example, V2 connectors do not provide field mapping options in the data source configuration, so you must move field mappings to your index pipelines.

V2 connectors may also add the _lw_ prefix to certain fields that were unprefixed in V1 connectors. The _lw_ prefix identifies system fields managed by Fusion. If your search application or index pipeline depends on the unprefixed field names, use the following workaround:
  1. Open your index pipeline.
  2. Add a new Field Mapping stage.
  3. In the Field Translations section, add a new field translation.
  4. Set the source field to the system field. For example, _lw_SOURCE_FIELD.
  5. Set the target field to the unprefixed name. For example, SOURCE_FIELD.
  6. Select the Copy operation.
This copies the value from _lw_SOURCE_FIELD to SOURCE_FIELD while preserving the original system field.
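The steps above might produce a stage configuration roughly like the fragment below. This is a hypothetical illustration: the exact stage `type` string and property names can vary by Fusion version, so verify the shape against a stage exported from your own instance. The jq call at the end simply confirms the fragment is well-formed JSON.

```shell
# Hypothetical Field Mapping stage fragment -- verify names against your Fusion version
cat > field-mapping-stage.json <<'EOF'
{
  "type": "field-mapping",
  "label": "Restore unprefixed system fields",
  "mappings": [
    {"source": "_lw_SOURCE_FIELD", "target": "SOURCE_FIELD", "operation": "copy"}
  ]
}
EOF

# Check that the fragment parses as valid JSON
jq . field-mapping-stage.json
```

Using the copy operation (rather than move) keeps the original system field available to any other stage or integration that expects the prefixed name.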

Validation and comparison strategy

Define success criteria before you start the migration. Use criteria that are measurable and that reflect how your system is used. Suggested success criteria include the following:
  1. The V2 configuration matches V1 on schedule, coverage, and operational performance.
  2. The V2 configuration adopts improvements that matter for your use case, such as asynchronous parsing.
  3. Data source and pipeline configurations produce the same document fields, values, and value formats.
  4. Document counts match between V1 and V2, or you can explain and validate any count differences and confirm they are not caused by V2 configuration errors.
  5. Frontend applications and queries work without changes.

Testing order and environment discipline

These steps describe the high-level workflow for setting up the migration, sequencing tests, and managing environment risk. Use them as an overview before you begin detailed migration work.
Step 1: Set up an isolated target for comparison

Create a sidecar collection or equivalent isolated target so V2 ingestion does not overwrite V1 data, allowing side-by-side comparison of documents, fields, and formats throughout testing.
Step 2: Create a new index profile for the sidecar collection

Create a new index profile that connects your datasource configuration to the new sidecar collection. The index profile defines which index pipeline and parser to use for processing documents from the V2 data source.
Step 3: Run the V2 data source through a minimal pipeline first

Configure the V2 data source to use a minimal pipeline that performs only basic processing and indexing.
Step 4: Validate beyond document counts

Do not treat matching document counts as the primary success signal. Use counts as a secondary check, and investigate differences that may be caused by historical documents that were never deleted or by changes in connector behavior rather than by configuration errors.
Step 5: Compare V1 and V2 outputs in the isolated target

Compare document content from V1 and V2 runs using JSON level inspection. Use a JSON diff approach that normalizes output, such as sorting keys, to identify meaningful differences in fields, values, and formats rather than differences caused by ordering.
Step 6: Run the V2 data source through the full index pipeline

After validating connector output, rerun the V2 data source using the existing index pipeline. Use this run to uncover coupling issues, branching logic failures, and assumptions that only appear under full pipeline processing.
Step 7: Validate query behavior and response format

Run representative queries against the V1 and V2 collections and compare responses. Confirm that returned fields, value formats, and response structure match expectations so applications and downstream integrations behave consistently after migration.
Step 8: Plan and execute cutover with cleanup

When the V2 output is validated, cut over ingestion to the main collection. After cutover, identify any remaining V1 documents and remove them using an agreed retention rule, such as deleting documents older than a specific timestamp, to prevent stale data from skewing validation and search behavior.
Next, follow the remaining sections of this guide for detailed instructions you can apply while you work through these steps.

Set up an isolated target for comparison

Use a sidecar collection to index documents from the V2 connector data source, even in development environments. This prevents the V2 run from overwriting documents produced by the V1 connector and makes it easier to compare outputs during validation.

Consider creating two sidecar collections: one for the V2 data source and a second for the V1 data source. This lets you run the V2 data source through a minimal pipeline and compare V1 and V2 outputs without overwriting the original V1 documents.

To create a sidecar collection, follow these instructions:
Create a sidecar collection
  1. Navigate to Collections > Collections Manager.
  2. Click New.
  3. Name and save the collection.

Create a new index profile for the sidecar collection

After creating the sidecar collection, you must create an index profile that connects your datasource to the collection. Create a separate index profile for the V2 datasource so you can point it to the sidecar collection without affecting the V1 datasource. To create an index profile, follow these instructions:
Create a new index profile
  1. Navigate to Indexing > Index Profiles.
  2. Click New.
  3. Select the sidecar collection as the target collection.
  4. Name and save the index profile.

Run the V2 data source through a minimal pipeline first

Use a minimal pipeline early in the migration to identify issues with the V2 configuration before you introduce the complexity of your full ingestion pipeline. This step helps you focus on connector output differences first.

If you created a sidecar collection for the V1 connector, update the V1 data source to index into that sidecar collection for this step. If you did not create a V1 sidecar collection, clear the V1 data source output so you can reindex those documents as new documents.

Update the V1 data source to use a minimal pipeline, then run an indexing job. Next, update the V2 data source to use the same minimal pipeline and index into its sidecar collection.

Validate beyond document counts

Document counts are a useful initial check. You can compare counts by navigating to Indexing > Collections and reviewing the document counts for your V1 and V2 data sources.
Document counts by collection
  1. Navigate to Indexing > Collections.
  2. Locate the collection containing the V1 datasource documents.
  3. Compare the count to the collection containing the V2 datasource documents.
If you see a significant difference, review your V2 connector configuration to identify the root cause. Common issues include the following:
  1. The V1 connector datasource may not be removing documents from the collection as expected. If possible, clear the collection and reindex the documents to get an accurate count. Alternatively, create a sidecar collection, and index documents to get an accurate count.
  2. The V2 connector datasource may have different URL filtering, depth limits, or inclusion/exclusion patterns. Review settings such as maximum crawl depth, URL pattern filters, and content type restrictions to ensure the V2 configuration matches the intended scope of the V1 connector.
  3. The source content may have changed between when the V1 and V2 crawls ran. If the crawls occurred at different times, additions, deletions, or modifications to the source data can result in count differences. To eliminate this variable, run both crawls as close together as possible or snapshot the source content before testing.
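Beyond eyeballing the UI, you can compare counts from saved query responses. The sketch below uses stand-in response files; in practice, save the JSON output of a `q=*:*` query against each collection (for example, via the query API shown later in this guide) and point jq at those files.

```shell
# Stand-ins for saved query responses from the V1 and V2 collections
cat > v1-response.json <<'EOF'
{"response": {"numFound": 1250, "docs": []}}
EOF
cat > v2-response.json <<'EOF'
{"response": {"numFound": 1190, "docs": []}}
EOF

# Extract and compare the document counts
V1_COUNT=$(jq '.response.numFound' v1-response.json)
V2_COUNT=$(jq '.response.numFound' v2-response.json)
echo "V1=$V1_COUNT V2=$V2_COUNT diff=$((V1_COUNT - V2_COUNT))"
```

Treat any nonzero difference as a prompt for investigation, not automatically as a failure: explainable differences (crawl timing, stale V1 documents) are acceptable, while unexplained ones point at the V2 configuration.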

Compare V1 and V2 outputs in the isolated target

Compare the V1 and V2 outputs and identify key differences. Export document details in JSON format so you can inspect fields, values, and value formats. For example:
curl 'https://FUSION_HOST:FUSION_PORT/api/apps/APP_NAME/query-pipelines/QUERY_PIPELINE/collections/COLLECTION_NAME/select?debug=timing&debug=query&debug=results&echoParams=all&fl=score,*&json.nl=arrarr&q=*:*&rows=10&start=0&wt=json' \
-u USERNAME:PASSWORD
Lucidworks recommends using a JSON comparison tool, such as JSON Diff, to help identify differences. Depending on the tool, you might need to sort JSON fields so the comparison maps fields consistently.
In a typical comparison, you might find that the indexed_tdt field, which records when the document was indexed into the search system, is absent from the V2 output, while five new fields are added: _lw_container_id_s, _lw_source_id_s, _lw_parser_max_parsing_depth_s, _lw_block_start_time_l, and name_t. Importantly, the body text is not changed.

Some differences may be acceptable. For example, your application may not use the indexed_tdt field, so its absence will not cause issues. Use your judgment to decide whether to continue to the next step or adjust the V2 data source and rerun this step.

Before you move to the next step, confirm the following at a minimum. These validations form a progression: counts, field structure and types, content quality, and functional behavior.
  1. Document counts are within an acceptable range of each other, accounting for known differences in crawl scope or timing.
  2. Critical fields required by your application are present in the V2 output with correct data types, such as strings as strings, integers as integers, and multi-value fields as multi-value fields.
  3. Body content and other key text fields contain the expected information without corruption or significant loss.
  4. Basic facets and search filtering behavior matches between V1 and V2 collections.
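The normalized-diff approach recommended above can be scripted with jq. This sketch uses two stand-in document exports and assumes you have decided which fields (such as indexed_tdt and the _lw_ system fields) are expected to differ and can be excluded from the comparison.

```shell
# Two stand-in document exports: same content, different key order,
# plus fields that only exist on one side
cat > v1-doc.json <<'EOF'
{"id": "doc1", "title_t": "Home", "indexed_tdt": "2024-01-01T00:00:00Z", "body_t": "Welcome"}
EOF
cat > v2-doc.json <<'EOF'
{"body_t": "Welcome", "id": "doc1", "_lw_source_id_s": "web-v2", "title_t": "Home"}
EOF

# Normalize: drop fields that are expected to differ, then sort keys (-S)
jq -S 'del(.indexed_tdt, ._lw_source_id_s)' v1-doc.json > v1-norm.json
jq -S 'del(.indexed_tdt, ._lw_source_id_s)' v2-doc.json > v2-norm.json

# A clean diff means the remaining fields, values, and formats match
diff v1-norm.json v2-norm.json && echo "documents match after normalization"
```

Run this over a representative sample of documents rather than a single pair, since different document types can exercise different connector code paths.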

Run the V2 data source through the full index pipeline

After you have a V2 configuration that produces results comparable to V1 with the minimal pipeline, clear the sidecar collections and update both data sources to use the full index pipeline.

The full index pipeline can significantly transform document data. As a result, differences in the output can reveal additional migration work. For example, some V1 connectors include field mapping in the data source configuration. V2 connectors might not include this capability, so you must implement equivalent behavior with index pipeline stages.

Repeat the same workflow you used with the minimal pipeline. Index documents from the V1 and V2 data sources into their respective collections, then inspect the resulting documents for differences. Record differences and likely causes.

Some differences are caused by the V2 configuration. Update the V2 configuration to address those inconsistencies, then rerun ingestion. Other differences are caused by the index pipeline. Duplicate the index pipeline so you can test changes safely. Use the following instructions:
Copying an index pipeline through the Fusion UI requires extensive manual work and increases the risk of mistakes. Lucidworks suggests using the “For developers” method to duplicate index pipelines.
Create a new index pipeline
  1. Navigate to Indexing > Index Pipelines.
  2. Click Add and provide a name for the new index pipeline.
  3. Rebuild the original pipeline one stage at a time, matching each stage configuration precisely.
    Save your work regularly to prevent data loss.
Update the duplicated pipeline to account for the differences. Repeat this process until the V1 and V2 outputs match across a representative set of documents.
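The "For developers" approach mentioned above amounts to exporting the pipeline JSON, changing its id, and creating the copy via the API. The sketch below works on a stand-in export; the exact index-pipelines endpoint path can vary by Fusion version, so the API calls are shown only as comments.

```shell
# Stand-in for an index pipeline exported via the Fusion API
# (e.g. a GET against the index-pipelines endpoint -- path varies by version)
cat > my-pipeline.json <<'EOF'
{"id": "my-pipeline", "stages": [{"type": "field-mapping", "skip": false}]}
EOF

# Change only the id so the copy can exist alongside the original
jq '.id = "my-pipeline-v2-test"' my-pipeline.json > my-pipeline-copy.json

cat my-pipeline-copy.json
# To create the copy, POST my-pipeline-copy.json back to the index-pipelines API.
```

This preserves every stage configuration exactly, avoiding the transcription errors that manual stage-by-stage rebuilding invites.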

Validate query behavior and response format

Validate query responses to avoid disrupting frontend applications and downstream integrations. Confirm that returned fields, value formats, and response structure match expectations so applications and integrations behave consistently after migration. To do this, create a new query profile so you can point it to the sidecar collection that contains documents from your V2 data source. Next, compare query results side by side in the query workbench. Alternate between the two query profiles and confirm that the response structure and returned fields look the same. Document order can differ because the V2 documents in the sidecar collection might not have the same scores. If possible, validate the response in your frontend as well.
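One way to check that the same fields come back from both query profiles is to diff the field sets of captured responses. The sketch below uses stand-in response files and deliberately ignores ordering and score differences, which are expected to vary between the collections.

```shell
# Stand-ins for query responses captured from the V1 and V2 query profiles
cat > v1-query.json <<'EOF'
{"response": {"docs": [{"id": "doc1", "title_t": "Home", "score": 1.2}]}}
EOF
cat > v2-query.json <<'EOF'
{"response": {"docs": [{"score": 0.9, "id": "doc1", "title_t": "Home"}]}}
EOF

# Collect the unique set of field names returned by each profile
jq '[.response.docs[] | keys] | flatten | unique' v1-query.json > v1-fields.json
jq '[.response.docs[] | keys] | flatten | unique' v2-query.json > v2-fields.json

# A clean diff means both profiles return the same fields
diff v1-fields.json v2-fields.json && echo "same fields returned"
```

Field-set parity is necessary but not sufficient: still spot-check value formats and the overall response structure in the Query Workbench and, where possible, in your frontend.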

Plan and execute cutover with cleanup

Complete this step in your development environment as the final proof point before you migrate to production. At this stage, you have performed robust validation and you are confirming that your configurations, adjustments, and findings hold when you run the V2 connector as the primary ingestion path.

Update your development environment to fully replace the V1 connector with the V2 connector. Index into the main collection, then remove the V1 connector configuration so you are testing the same operating model you intend to use in production.

After cutover, identify any remaining V1 documents in the main collection and plan their removal. Determine the appropriate timeline for cleanup based on your retention requirements and validation needs. Use an agreed retention rule, such as removing documents older than a specific timestamp, to prevent stale V1 data from affecting validation and search behavior.
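A timestamp-based retention rule can be expressed as a Solr delete-by-query. The sketch below only builds the request payload; the field name timestamp_tdt, the cutover timestamp, and the update endpoint path are all assumptions to adapt to your schema and Fusion version, which is why the actual request is left as a comment.

```shell
# Hypothetical cutover timestamp; replace with your agreed retention boundary
CUTOVER_TS="2024-06-01T00:00:00Z"

# Build a Solr delete-by-query payload for documents indexed before cutover
# (timestamp_tdt is an assumed field name -- use the date field your pipeline sets)
PAYLOAD=$(jq -n --arg ts "$CUTOVER_TS" \
  '{delete: {query: ("timestamp_tdt:[* TO " + $ts + "]")}}')
echo "$PAYLOAD"

# Then send it to the collection's update handler (requires a live cluster):
# curl -u USERNAME:PASSWORD -X POST -H 'Content-Type: application/json' \
#   "https://FUSION_HOST:FUSION_PORT/api/solr/COLLECTION_NAME/update?commit=true" \
#   -d "$PAYLOAD"
```

Before running any delete-by-query against the main collection, test it as a plain query first so you can review exactly which documents would be removed.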

Scale and tooling readiness

Migrations involving a large number of V1 connectors often require automation or assisted analysis. Before you scale migration work, build a clear understanding of your V1 connector inventory and how each connector is implemented. Some environments have only a few connectors to migrate, while others have many. After you complete the first batch of migrations, create a post-migration report. Use it to document the patterns you observed, the issues you encountered, and the mitigations that worked. This report becomes the basis for repeating the migration process reliably. With a solid understanding of your V1 implementations, you can often automate large parts of V2 configuration generation and run bulk testing. If you choose to streamline the workflow and skip parts of this guide, such as testing with a minimal index pipeline, validate that decision for each batch and confirm that results still meet your success criteria.

Moving to production

Move to production only after you have completed the migration and validation work in development and you are confident the V2 connector behavior, pipeline behavior, and query responses meet your success criteria. At that point, plan how you will apply the same updates in production. If you use the Lucidworks ConfigSync feature, you can promote validated configuration changes between environments. For more information, see Config Sync. Otherwise, you may need to apply changes directly in production by repeating the same migration workflow used in development. In either approach, follow the same validation sequence, plan the production cutover timing, and decide when and how to remove legacy V1 connector documents.