Time-Based Partitioning
A Fusion collection can be configured to map to multiple Solr collections, known as partitions in this context, where each partition contains data from a specific time range. A typical use case is log data, where each partition holds the logs for one time period.
Once a collection is configured for time-based partitioning, Fusion automatically ages out old partitions and creates new ones according to the configured partition sizes, expiration intervals, and related options. No manual maintenance is needed.
This feature is not enabled by default. Enable it for each collection using the Collection Features API.
Note: Fusion cannot retroactively partition data that has already been indexed. It can only perform time-based partitioning on incoming data.
Enabling time-based partitioning
- In the UI, you can only enable time-based partitioning for new collections.
- In the API, you can only enable time-based partitioning for existing collections.
Enablement using the Fusion UI
- Open the Collections Manager.
- Click New.
  Note: In the UI, you can only enable time-based partitioning for new collections. To enable it for an existing collection, use the API.
- Click Advanced.
- Scroll down to "Time Series Partitioning".
- Click Enable.
  When you enable this option, Fusion displays the time series partitioning configuration options.
- Save the collection.
Enablement using the API
Use the Collection Features API to enable time-based partitioning for an existing collection.
curl -X PUT -H 'Content-type: application/json' -d '{"enabled": true}' http://localhost:8764/api/collections/<collection>/features/partitionByTime
No response is returned.
Submit a GET request to the same endpoint to verify that time-based partitioning is enabled:
curl -X GET http://localhost:8764/api/collections/<collection>/features/partitionByTime
Response:
{
  "name" : "partitionByTime",
  "collectionId" : "<collection>",
  "params" : { },
  "enabled" : true
}
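In a script, the verification step can be automated by checking the response body for the `enabled` flag. The following is a minimal sketch under stated assumptions: the helper names `feature_url` and `is_enabled` are hypothetical, the default host and port are the ones used in the examples on this page, and the check is a simple pattern match on the response layout shown above rather than a real JSON parser.

```shell
# Hypothetical helper: build the partitionByTime feature URL for a collection.
# $1 = collection name, $2 = optional base URL (defaults to the host used on this page).
feature_url() {
  base=${2:-http://localhost:8764}
  printf '%s/api/collections/%s/features/partitionByTime\n' "$base" "$1"
}

# Hypothetical helper: read a feature response on stdin and succeed
# only if it contains "enabled" : true (whitespace around the colon is optional).
is_enabled() {
  grep -Eq '"enabled"[[:space:]]*:[[:space:]]*true'
}

# Usage against a running Fusion instance (not executed here):
#   curl -s "$(feature_url MyCollection)" | is_enabled && echo "partitioning is enabled"
```

A pattern match keeps the sketch free of extra dependencies; a script that needs to read other fields would be better served by a proper JSON tool.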
To change the configuration, see the options and examples below.
Configuration options
When time-based partitioning is enabled for a collection, you can configure these options using the UI or the Collections API. None are required.
UI Label, API Name | Description |
---|---|
Timestamp Field Name | The name of the field from which to read timestamps. The default is "timestamp". |
Partition Time Period | The time range for each partition. The default is one day. |
Max Active Partitions | The number of partitions to keep active. |
Delete Expired Partitions | "True" to automatically delete partitions that fall outside of the |
Preemptive Create Enabled | "True" (the default) to create partitions in advance. |
Schedule Interval | The interval, in minutes, at which to perform background maintenance, including preemptively creating partitions ( |
Partition Num Shards | The number of shards per partition. The default is the value configured for the main Fusion collection. |
Partition Replication Factor | The number of copies to keep, per partition. The default is the value configured for the main Fusion collection. |
Partition Config Name | The name of the Solr configuration set to be applied to new partitions. The default is the configuration used by the primary collection. |
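When setting these options through the API, it can help to build the request body from a few parameters. The sketch below is illustrative only: `partition_payload` is a hypothetical helper, and the JSON key names are the ones that appear in the example request on this page. Keys for shard count, replication factor, and configuration name do not appear in that example, so they are left out here rather than guessed.

```shell
# Hypothetical helper: print a partitionByTime feature payload.
# $1 = timestamp field name, $2 = time period (for example 5MINUTES),
# $3 = maximum number of active partitions.
# Key names are copied from the example request on this page.
partition_payload() {
  printf '{"enabled": true, "timestampFieldName": "%s", "timePeriod": "%s", "maxActivePartitions": %d, "deleteExpired": true}\n' \
    "$1" "$2" "$3"
}

# Usage against a running Fusion instance (not executed here);
# "-d @-" tells curl to read the request body from standard input:
#   partition_payload ts 5MINUTES 4 | curl -X PUT -H 'Content-type: application/json' \
#     -d @- "http://localhost:8764/api/collections/<collection>/features/partitionByTime"
```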
Examples
Define the collection "TimeSeries1" with one shard and a replication factor of 1:
curl -X PUT -H 'Content-type: application/json' -d '{
  "solrParams": {
    "numShards": 1,
    "replicationFactor": 1
  }
}' http://localhost:8764/api/collections/TimeSeries1
Enable time-based partitioning for "TimeSeries1", using the "ts" field for timestamps, five-minute partitions, a maximum of four active partitions, and automatic deletion of expired partitions:
curl -X PUT -H 'Content-type: application/json' -d '{
  "enabled": true,
  "timestampFieldName": "ts",
  "timePeriod": "5MINUTES",
  "scheduleIntervalMinutes": 1,
  "preemptiveCreateEnabled": false,
  "maxActivePartitions": 4,
  "deleteExpired": true
}' http://localhost:8764/api/collections/TimeSeries1/features/partitionByTime
Verify the feature configuration:
curl -X GET http://localhost:8764/api/collections/TimeSeries1/features/partitionByTime
Index five documents whose "ts" timestamps fall into five consecutive five-minute periods:
curl -X POST -H "Content-type:application/vnd.lucidworks-document" -d '[
  {
    "id": "1",
    "fields": [
      {"name": "ts", "value": "2016-02-24T00:00:01Z"},
      {"name": "partition_s", "value": "eventsim_2016_02_24_00_00"}
    ]
  },
  {
    "id": "2",
    "fields": [
      {"name": "ts", "value": "2016-02-24T00:05:01Z"},
      {"name": "partition_s", "value": "eventsim_2016_02_24_00_05"}
    ]
  },
  {
    "id": "3",
    "fields": [
      {"name": "ts", "value": "2016-02-24T00:10:01Z"},
      {"name": "partition_s", "value": "eventsim_2016_02_24_00_10"}
    ]
  },
  {
    "id": "4",
    "fields": [
      {"name": "ts", "value": "2016-02-24T00:15:01Z"},
      {"name": "partition_s", "value": "eventsim_2016_02_24_00_15"}
    ]
  },
  {
    "id": "5",
    "fields": [
      {"name": "ts", "value": "2016-02-24T00:20:01Z"},
      {"name": "partition_s", "value": "eventsim_2016_02_24_00_20"}
    ]
  }
]' http://localhost:8764/api/index-pipelines/TimeSeries1-default/collections/TimeSeries1/index
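For testing with more timestamps, a request body like the one above can be generated instead of written by hand. This is a rough sketch, not part of Fusion: `make_doc` and `make_docs` are hypothetical helpers, the field name "ts" matches the timestamp field configured in this example, and the "partition_s" field from the request above is omitted for brevity.

```shell
# Hypothetical helper: print one document with an id and a "ts" timestamp field,
# in the same shape as the documents in the request above.
make_doc() {
  printf '{"id": "%s", "fields": [{"name": "ts", "value": "%s"}]}' "$1" "$2"
}

# Hypothetical helper: print a JSON array with one document per timestamp
# argument, numbering the ids from 1.
make_docs() {
  i=0
  printf '['
  for ts in "$@"; do
    if [ "$i" -gt 0 ]; then printf ','; fi
    i=$((i + 1))
    make_doc "$i" "$ts"
  done
  printf ']\n'
}

# Usage against a running Fusion instance (not executed here):
#   make_docs 2016-02-24T00:00:01Z 2016-02-24T00:05:01Z | curl -X POST \
#     -H "Content-type:application/vnd.lucidworks-document" -d @- \
#     http://localhost:8764/api/index-pipelines/TimeSeries1-default/collections/TimeSeries1/index
```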
Query the collection:
curl -X GET "http://localhost:8764/api/query-pipelines/TimeSeries1-default/collections/TimeSeries1/select?q=*:*"
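The response includes a list of the active Solr collections that correspond to this Fusion collection. Only four partitions are listed, which is consistent with the "maxActivePartitions" value of 4 configured above; the partition for the 00:00 period is no longer active: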
<str name="collection">TimeSeries1_2016_02_24_00_05,TimeSeries1_2016_02_24_00_10,TimeSeries1_2016_02_24_00_15,TimeSeries1_2016_02_24_00_20</str>
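The partition names in this response appear to follow the pattern of collection name, date, hour, and the start minute of the period. The sketch below reproduces that pattern for the example data. It is an inference from the output above, not Fusion's actual implementation: `partition_name` is a hypothetical helper, it assumes UTC timestamps in the format used in this example, and it only handles periods given in minutes that divide an hour evenly.

```shell
# Hypothetical helper: derive a partition name.
# $1 = collection name, $2 = timestamp such as 2016-02-24T00:07:30Z,
# $3 = partition period in minutes (assumed to divide 60 evenly).
partition_name() {
  date_part=${2%%T*}        # e.g. 2016-02-24
  time_part=${2#*T}         # e.g. 00:07:30Z
  hour=${time_part%%:*}     # e.g. 00
  rest=${time_part#*:}      # e.g. 07:30Z
  minute=${rest%%:*}        # e.g. 07
  minute=${minute#0}        # drop one leading zero so 08 and 09 are not read as octal
  floored=$(( $minute / $3 * $3 ))   # round down to the start of the period
  printf '%s_%s_%s_%02d\n' "$1" "$(printf '%s' "$date_part" | tr '-' '_')" "$hour" "$floored"
}

# Name the partitions for the five example timestamps, then keep the newest
# four, assuming that the most recent partitions are the ones kept active.
for ts in 2016-02-24T00:00:01Z 2016-02-24T00:05:01Z 2016-02-24T00:10:01Z \
          2016-02-24T00:15:01Z 2016-02-24T00:20:01Z; do
  partition_name TimeSeries1 "$ts" 5
done | tail -n 4
```

Under these assumptions, the loop prints the same four names that appear in the response above, from TimeSeries1_2016_02_24_00_05 through TimeSeries1_2016_02_24_00_20.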