Regex Field Filter Index Stage
The Regex Field Filter Index Stage (called the Regular Expression Filter stage in versions earlier than 3.0) removes one or more fields from a PipelineDocument according to a set of filters. Each filter specifies a field name and a regular expression. If the field's value matches the regular expression, the field is deleted from the document. The patterns follow Java regular expression syntax.
Example Stage Specification
Create a regex field filter that deletes the notes_t field from a document when its value is a Social Security Number:
{
  "type" : "regex-field-filter",
  "id" : "ssnFilter",
  "skip" : false,
  "filters" : [ {
    "sourceField" : "notes_t",
    "pattern" : "^\\d{3}-\\d{2}-\\d{4}$"
  } ]
}
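To make the effect of this filter concrete, the following is a minimal Java sketch of the behavior described above. It is not Fusion's implementation: the class name RegexFieldFilterSketch, the method applyFilter, and the use of a plain Map as a stand-in for a PipelineDocument are all hypothetical. It also assumes the stage tests a value with find() semantics, which the documentation does not state.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical stand-in for the stage's behavior; not Fusion's implementation.
class RegexFieldFilterSketch {

    // Removes sourceField from doc when its value matches the pattern.
    // Returns true if the field was removed.
    static boolean applyFilter(Map<String, String> doc, String sourceField, String pattern) {
        String value = doc.get(sourceField);
        if (value == null) {
            return false; // field absent, nothing to filter
        }
        // Assumption: find() semantics. The ^ and $ anchors in the example
        // pattern restrict the match to the whole value.
        if (Pattern.compile(pattern).matcher(value).find()) {
            doc.remove(sourceField);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // In Java source, as in JSON, each regex backslash is written twice.
        String ssn = "^\\d{3}-\\d{2}-\\d{4}$";

        Map<String, String> doc = new HashMap<>();
        doc.put("id", "doc1");
        doc.put("notes_t", "123-45-6789");
        System.out.println(applyFilter(doc, "notes_t", ssn)); // prints true
        System.out.println(doc.containsKey("notes_t"));       // prints false

        Map<String, String> other = new HashMap<>();
        other.put("notes_t", "SSN on file: 123-45-6789");
        System.out.println(applyFilter(other, "notes_t", ssn)); // prints false
    }
}
```

Because the example pattern is anchored with ^ and $, it matches only when the entire field value is a Social Security Number. A value that merely contains one inside longer text, as in the second document, is left in place.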
Configuration
When entering configuration values in the UI, use unescaped characters, such as \t for the tab character. When entering configuration values in the API, use escaped characters, such as \\t for the tab character.
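The difference between the two forms can be illustrated with a short Java sketch. The extra backslash in the API form comes from JSON string escaping: the JSON text \\t decodes to the two-character regular expression \t, which is what the UI accepts directly. The class EscapingSketch and the helper unescapeJsonBackslashes are hypothetical and handle only the backslash escape, not the full JSON grammar.

```java
import java.util.regex.Pattern;

// Hypothetical illustration of UI versus API escaping; not part of Fusion.
class EscapingSketch {

    // Undoes only JSON's backslash escaping (two backslashes become one).
    // A real JSON parser handles many more escape sequences.
    static String unescapeJsonBackslashes(String jsonStringBody) {
        return jsonStringBody.replace("\\\\", "\\");
    }

    // True when the regex matches the whole input.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        String uiValue = "\\t";    // the two characters \t, as typed in the UI
        String apiValue = "\\\\t"; // the three characters \\t, as written in a JSON request body

        // Decoding the API form yields the UI form.
        System.out.println(unescapeJsonBackslashes(apiValue).equals(uiValue)); // prints true

        // The resulting regex \t matches a literal tab character.
        System.out.println(matchesWhole(uiValue, "\t")); // prints true
    }
}
```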