Time-Based Partitioning

A Fusion collection can be configured to map to multiple Solr collections, known as partitions in this context, where each partition contains data from a specific time range. An example is time-based partitioning for logs:

[Figure: time-based partitioning]

Once a collection is configured for time-based partitioning, Fusion automatically ages out old partitions and creates new ones, using the configured partition sizes, expiration intervals, and so on. No manual maintenance is needed.

This feature is not enabled by default. Enable it per collection, either in the Fusion UI (new collections only) or with the Collection Features API (existing collections only).

Note
Fusion cannot retroactively partition data that has already been indexed. It can only perform time-based partitioning on incoming data.

Enabling time-based partitioning

  • In the UI, you can only enable time-based partitioning for new collections.

  • In the API, you can only enable time-based partitioning for existing collections.

Enablement using the Fusion UI

  1. While creating a collection, click Advanced.

  2. Scroll down to "Time Series Partitioning".

  3. Click Enable.

  4. Save the collection.

Currently, you cannot use the UI to enable time-based partitioning for an existing collection.

Enablement using the API

Use the Collection Features API to enable time-based partitioning for an existing collection.

Enable time-based partitioning using the default configuration:

curl -X PUT -H 'Content-type: application/json' -d '{"enabled": true}' http://localhost:8765/api/v1/collections/<collection>/features/partitionByTime

No response is returned.
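
To avoid retyping the endpoint for each collection, the URL can be assembled from its parts. The following is a minimal sketch: the function name partition_feature_url and the FUSION_API variable are hypothetical conveniences, not part of Fusion. Only the path layout is taken from the command above.

```shell
#!/bin/sh
# Base URL of the Fusion API; the default matches the host and port used in this guide.
FUSION_API="${FUSION_API:-http://localhost:8765/api/v1}"

# Hypothetical helper: print the partitionByTime feature endpoint for a collection.
partition_feature_url() {
  printf '%s/collections/%s/features/partitionByTime\n' "$FUSION_API" "$1"
}

# Print (rather than run) the enable request so it can be reviewed first.
echo "curl -X PUT -H 'Content-type: application/json' -d '{\"enabled\": true}' $(partition_feature_url TimeSeries1)"
```

Printing the command instead of executing it keeps the sketch safe to run on a machine without a Fusion instance.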

Send a GET request to the same endpoint to verify that time-based partitioning is enabled:

curl -X GET http://localhost:8765/api/v1/collections/<collection>/features/partitionByTime

Response:

{
  "name" : "partitionByTime",
  "collectionId" : "<collection>",
  "params" : { },
  "enabled" : true
}
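
In a script, the "enabled" flag can be checked automatically. The sketch below is one possible approach, not an official tool: feature_enabled is a hypothetical helper that pattern-matches the response text, and the sample response is copied from this guide instead of being fetched from a live server.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the JSON read from stdin contains "enabled": true.
# This is a plain pattern match, not a JSON parser; it is only adequate for the
# small response shown in this guide.
feature_enabled() {
  grep -Eq '"enabled"[[:space:]]*:[[:space:]]*true'
}

# Sample response copied from this guide, used in place of a live request.
response='{
  "name" : "partitionByTime",
  "collectionId" : "<collection>",
  "params" : { },
  "enabled" : true
}'

if printf '%s\n' "$response" | feature_enabled; then
  echo "partitionByTime is enabled"
else
  echo "partitionByTime is not enabled"
fi
```

Against a running instance, the output of the GET request could be piped into feature_enabled in the same way (for example, with curl -s).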

To change the configuration, see the options and examples below.

Configuration options

When time-based partitioning is enabled for a collection, you can configure these options using the UI or the Collection Features API. None are required.

UI Label (API Name): Description

Timestamp Field Name (timestampFieldName): The name of the field from which to read timestamps. The default is "timestamp".

Partition Time Period (timePeriod): The time range for each partition. The default is one day.

Max Active Partitions (maxActivePartitions): The number of partitions to keep active.

Delete Expired Partitions (deleteExpired): "True" to automatically delete partitions that fall outside of the maxActivePartitions window, at intervals of scheduleIntervalMinutes. The default is "false".

Preemptive Create Enabled (preemptiveCreateEnabled): "True" (the default) to create partitions in advance.

Schedule Interval (scheduleIntervalMinutes): The interval, in minutes, at which to perform background maintenance, including preemptively creating partitions (preemptiveCreateEnabled) and deleting expired partitions (deleteExpired). The default is five minutes.

Partition Num Shards (numShards): The number of shards per partition. The default is the value configured for the main Fusion collection.

Partition Replication Factor (replicationFactor): The number of copies to keep, per partition. The default is the value configured for the main Fusion collection.

Partition Config Name (configName): The name of the Solr configuration set to be applied to new partitions. The default is the configuration used by the primary collection.
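
As a rough worked example of how these options interact: if expired partitions are deleted, the span of data kept is approximately maxActivePartitions multiplied by timePeriod. The sketch below uses the values from the example configuration in this guide (four partitions of five minutes each). The variable names are illustrative only, and the cleanup-delay estimate is an assumption based on the description of scheduleIntervalMinutes.

```shell
#!/bin/sh
# Values mirror the example configuration in this guide; variable names are illustrative.
period_minutes=5          # timePeriod of "5MINUTES"
max_active_partitions=4   # maxActivePartitions
schedule_interval=1       # scheduleIntervalMinutes

# Approximate span of time covered by the active partitions.
retention_minutes=$((period_minutes * max_active_partitions))

echo "active window: about ${retention_minutes} minutes"
# Assumption: an expired partition may linger until the next maintenance run.
echo "cleanup delay: up to ${schedule_interval} minute(s) after expiry"
```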

Examples

Create a new collection called "TimeSeries1":

curl -X PUT -H 'Content-type: application/json' -d '{
  "solrParams": {
    "numShards": 1,
    "replicationFactor": 1
  }
}' http://localhost:8765/api/v1/collections/TimeSeries1

Enable and configure time-based partitioning for the "TimeSeries1" collection:

curl -X PUT -H 'Content-type: application/json' -d '{
  "enabled": true,
  "timestampFieldName": "ts",
  "timePeriod": "5MINUTES",
  "scheduleIntervalMinutes": 1,
  "preemptiveCreateEnabled": false,
  "maxActivePartitions": 4,
  "deleteExpired": true
}' http://localhost:8765/api/v1/collections/TimeSeries1/features/partitionByTime

Verify that time-based partitioning is enabled:

curl -X GET http://localhost:8765/api/v1/collections/TimeSeries1/features/partitionByTime

Import some sample data into this collection:

curl -X POST -H "Content-type:application/vnd.lucidworks-document" -d '[
  {
    "id": "1",
    "fields": [
      {
        "name": "ts",
        "value": "2016-02-24T00:00:01Z"
      },
      {
        "name": "partition_s",
        "value": "eventsim_2016_02_24_00_00"
      }
    ]
  },
  {
    "id": "2",
    "fields": [
      {
        "name": "ts",
        "value": "2016-02-24T00:05:01Z"
      },
      {
        "name": "partition_s",
        "value": "eventsim_2016_02_24_00_05"
      }
    ]
  },
  {
    "id": "3",
    "fields": [
      {
        "name": "ts",
        "value": "2016-02-24T00:10:01Z"
      },
      {
        "name": "partition_s",
        "value": "eventsim_2016_02_24_00_10"
      }
    ]
  },
  {
    "id": "4",
    "fields": [
      {
        "name": "ts",
        "value": "2016-02-24T00:15:01Z"
      },
      {
        "name": "partition_s",
        "value": "eventsim_2016_02_24_00_15"
      }
    ]
  },
  {
    "id": "5",
    "fields": [
      {
        "name": "ts",
        "value": "2016-02-24T00:20:01Z"
      },
      {
        "name": "partition_s",
        "value": "eventsim_2016_02_24_00_20"
      }
    ]
  }
]' http://localhost:8765/api/v1/index-pipelines/TimeSeries1-default/collections/TimeSeries1/index

Query the collection through its default query pipeline to see which partitions are active:

curl -X GET "http://localhost:8765/api/v1/query-pipelines/TimeSeries1-default/collections/TimeSeries1/select?q=*:*"

The response includes a list of active Solr collections that correspond to this Fusion collection:

<str name="collection">TimeSeries1_2016_02_24_00_05,TimeSeries1_2016_02_24_00_10,TimeSeries1_2016_02_24_00_15,TimeSeries1_2016_02_24_00_20</str>
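
The list above has four entries even though five documents were indexed, which is consistent with maxActivePartitions being set to 4. The sketch below reproduces that list from the sample timestamps. It is an illustration under stated assumptions, not Fusion's actual logic: partition_name is a hypothetical helper, the naming pattern is inferred from the response above, and the period is assumed to be a number of minutes that divides evenly into an hour.

```shell
#!/bin/sh
# Hypothetical helper: derive a partition name from a collection name, an
# ISO-8601 UTC timestamp, and a period in minutes, flooring the minute to the
# start of its period. Assumes the <collection>_YYYY_MM_DD_HH_mm pattern seen
# in the response above.
partition_name() {
  collection=$1
  ts=$2
  period=$3
  date_part=${ts%%T*}      # e.g. 2016-02-24
  time_part=${ts#*T}       # e.g. 00:05:01Z
  hour=${time_part%%:*}    # e.g. 00
  rest=${time_part#*:}     # e.g. 05:01Z
  minute=${rest%%:*}       # e.g. 05
  minute=${minute#0}       # strip a leading zero so 08 and 09 are not read as octal
  bucket=$(( minute / period * period ))
  date_us=$(printf '%s' "$date_part" | sed 's/-/_/g')
  printf '%s_%s_%s_%02d\n' "$collection" "$date_us" "$hour" "$bucket"
}

# Timestamps taken from the sample documents in this guide.
names=$(
  for ts in 2016-02-24T00:00:01Z 2016-02-24T00:05:01Z 2016-02-24T00:10:01Z \
            2016-02-24T00:15:01Z 2016-02-24T00:20:01Z; do
    partition_name TimeSeries1 "$ts" 5
  done
)

# With maxActivePartitions set to 4, keep only the four newest partitions.
# The zero-padded names sort chronologically.
active=$(printf '%s\n' "$names" | sort | tail -n 4 | paste -sd, -)
echo "$active"
```

Under these assumptions the script prints the same comma-separated list as the response, with the oldest partition (TimeSeries1_2016_02_24_00_00) dropped.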