      Field Processors

      Field response processors include ones to perform these operations:

      • Capitalise the display value of the given field names.

      • Add highlighting to fields.

      • Tag a document with classifications based on field values.

      • Format a date object, replacing the original date value with another.

      • Parse dates out of field values.

      • Set the value of a field that is missing a value, based on the value of a different field.

      • Parse the String value of the specified fields into an Object representation.

      • Create a multivalued field from a single field value by using a separator.

      • Extract the hostname from URLs and place it in a field named 'site'.

      • Process fully qualified URLs in field values and markup, and add anchor tags for active links in the display values.

      • Statically add metadata to documents that match a given regular expression.

      • Replace field values (actual, display, or both) that are HTML or URL encoded with decoded values.

      • Replace field values (actual, display, or both) that match a given regular expression with a different value.

      • Make Twitter users and hashtags clickable in the display value.

      • Duplicate a field, creating two separate instances.

      • Create a new field by joining multiple existing fields using a pattern expression.

      • Localise the values of a field using a specified bundle.

      twigkit.search.processors.response.CapitaliseFieldValuesProcessor

      Capitalise the display value of the given field names.

      name: twigkit.search.processors.response.CapitaliseFieldValuesProcessor
      fields: firstName,lastName

      fields (java.lang.String)
      Comma-separated list of fields that should be affected by this processor.

      twigkit.search.processors.response.HighlightFieldValuesProcessor

      Add Highlighting to Fields. For a more detailed overview, see the highlighting page.

      twigkit.search.processors.response.FieldEntityExtractor

      Tag a document with classifications based on field values. Using the specified fields, look for patterns provided in a properties file and add classifications to a given field if the value matches.

      name: twigkit.search.processors.response.FieldEntityExtractor
      fields: issues
      classificationField: categorisedIssues
      bundle: my-issues

      With:

      #my-issues_en.properties in /resources/conf
      Foo
      Bar
      Bam

      This can be used to pull keywords out of a given field into a new field for display. In the example above, if a document had a field named 'issues' with the value 'The problem is Foo something else', the FieldEntityExtractor would match on Foo and add 'Foo' to a new categorisedIssues field. Use this to create new metadata fields, or to filter fields that contain sensitive data so that only the values you want to show are extracted.

      bundle (java.lang.String)
      A properties file containing the entities to extract, each with an optional replacement value to store when the entity is found (for example, IBM = International Business Machines stores the latter as the match for IBM). Entities are expressed as regular expressions (regex), so in addition to simple matching on whether a value occurs within a field, entities can be recognised by a specific pattern in the text of the chosen fields.

      fields (java.lang.String)
      Comma-separated list of fields that should be used to search for matches in order to classify the document.

      classificationField (java.lang.String)
      The field in which to store the classification values. Because matching is pattern-based, a document can be tagged with multiple classifications if several matches are found.
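
The matching described above can be illustrated with plain Java regular expressions. This is a minimal sketch, not the processor's actual implementation: the class name `EntityExtractionSketch`, the `classify` method, and the rule that an empty replacement stores the pattern text itself are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

final class EntityExtractionSketch {

    // Returns one classification per entity pattern found in the field value.
    // Keys are regular expressions. An empty or null replacement means
    // "store the pattern text itself" (an assumption for this sketch).
    static List<String> classify(String fieldValue, Map<String, String> entities) {
        List<String> classifications = new ArrayList<>();
        for (Map.Entry<String, String> entity : entities.entrySet()) {
            if (Pattern.compile(entity.getKey()).matcher(fieldValue).find()) {
                String replacement = entity.getValue();
                classifications.add(replacement == null || replacement.isEmpty()
                        ? entity.getKey()
                        : replacement);
            }
        }
        return classifications;
    }

    public static void main(String[] args) {
        // Hypothetical bundle content, mirroring the examples in the text.
        Map<String, String> entities = new LinkedHashMap<>();
        entities.put("Foo", "");
        entities.put("Bar", "");
        entities.put("IBM", "International Business Machines");

        // prints [Foo]
        System.out.println(classify("The problem is Foo something else", entities));
        // prints [Foo, International Business Machines]
        System.out.println(classify("Foo was raised by IBM", entities));
    }
}
```

The second call shows how a single value can receive several classifications when more than one pattern matches.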

      twigkit.search.processors.response.DatePartExtractor

      Format a date object, replacing the original date value with another. Use on fields that are already date objects.

      name: twigkit.search.processors.response.DatePartExtractor
      fields: issues
      pattern: dd MMM yyyy

      fields (java.lang.String)
      Comma-separated list of fields that should be affected by this processor.

      pattern (java.lang.String)
      The date pattern used to format the Date object, in Java’s SimpleDateFormat syntax. For example, pattern="EEE, MMM d, ''yy" results in Wed, Jul 4, '01.
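
To see what a given pattern produces before putting it in the configuration, it can be tried directly against `java.text.SimpleDateFormat`. The sketch below is illustrative only: the class name `DatePatternSketch` and the fixed English locale and UTC time zone are assumptions chosen to make the output reproducible, and the processor's own locale and time zone handling may differ.

```java
import java.text.SimpleDateFormat;
import java.time.Instant;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

final class DatePatternSketch {

    // Formats a Date with a SimpleDateFormat pattern, using a fixed
    // locale and time zone so the result does not depend on the machine.
    static String format(Date date, String pattern) {
        SimpleDateFormat formatter = new SimpleDateFormat(pattern, Locale.ENGLISH);
        formatter.setTimeZone(TimeZone.getTimeZone("UTC"));
        return formatter.format(date);
    }

    public static void main(String[] args) {
        Date date = Date.from(Instant.parse("2001-07-04T12:08:56Z"));

        System.out.println(format(date, "dd MMM yyyy"));      // prints 04 Jul 2001
        System.out.println(format(date, "EEE, MMM d, ''yy")); // prints Wed, Jul 4, '01
    }
}
```

Note that a literal single quote in a pattern is written as two single quotes, which is why the second pattern contains `''yy`.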

      twigkit.search.processors.response.FieldDateParser

      Parse dates out of field values, converting String data into Date objects.

      name: twigkit.search.processors.response.FieldDateParser
      fields: issues
      pattern: dd MMM yyyy

      fields (java.lang.String)
      Comma-separated list of fields that should be affected by this processor.

      pattern (java.lang.String)
      The pattern to use when parsing the String value to a Date.
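
The parsing direction can be checked the same way. This sketch assumes the pattern follows `SimpleDateFormat` syntax, as it does for DatePartExtractor, which the text above does not state explicitly. The class name `DateParsingSketch`, the English locale, the UTC time zone, and strict (non-lenient) parsing are likewise assumptions for illustration.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

final class DateParsingSketch {

    // Parses a String into a Date; the pattern must describe the String exactly.
    static Date parse(String value, String pattern) {
        SimpleDateFormat parser = new SimpleDateFormat(pattern, Locale.ENGLISH);
        parser.setTimeZone(TimeZone.getTimeZone("UTC"));
        parser.setLenient(false); // reject values such as "32 Jul 2001"
        try {
            return parser.parse(value);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Value does not match pattern: " + value, e);
        }
    }

    public static void main(String[] args) {
        Date parsed = parse("04 Jul 2001", "dd MMM yyyy");
        System.out.println(parsed.toInstant()); // prints 2001-07-04T00:00:00Z
    }
}
```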

      twigkit.search.processors.response.FallbackFieldValue

      Set the value of a field that is missing a value, based on the value of a different field.

      name: twigkit.search.processors.response.FallbackFieldValue
      field: phoneNumber
      fallback: mainPhoneNumber
      pattern:
      values: display
      decode: false

      field (java.lang.String)
      Field that should be affected by this processor.

      fallback (java.lang.String)
      Field to use for the fallback values.

      pattern (java.lang.String)
      Regex pattern to use to extract a value from the fallback field.

      values (java.lang.String)
      Which forms of the value to check for emptiness: 'display', 'actual', or 'either'.

      decode (java.lang.Boolean)
      Whether to perform URL decoding on the fallback values.

      twigkit.search.processors.response.FieldValueParser

      Parse the String value of the specified fields into an Object representation.

      name: twigkit.search.processors.response.FieldValueParser
      fields: amount

      fields (java.lang.String)
      Comma-separated list of fields that should be affected by this processor.

      twigkit.search.processors.response.FieldToMultiValueFieldProcessor

      Create a multivalued field from a single field value by using a separator.

      name: twigkit.search.processors.response.FieldToMultiValueFieldProcessor
      field: cities
      separator: ,

      field (java.lang.String)
      The name of the field containing the value to be split.

      separator (java.lang.String)
      The separator to use when splitting the value.
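
The effect of splitting on a separator can be sketched as follows. The class name `SplitFieldSketch` is hypothetical, and two behaviours here are assumptions rather than documented facts: the separator is treated as literal text (not as a regular expression), and each resulting value is trimmed.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class SplitFieldSketch {

    // Splits one value into several, treating the separator as literal text.
    static List<String> split(String value, String separator) {
        return Arrays.stream(value.split(Pattern.quote(separator)))
                .map(String::trim)               // assumption: surrounding spaces are dropped
                .filter(part -> !part.isEmpty()) // assumption: empty parts are ignored
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // prints [London, Paris, New York]
        System.out.println(split("London, Paris,New York", ","));
    }
}
```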

      twigkit.search.processors.response.HostNameExtractor

      Extract the hostname from URLs and place it in a field named 'site'.

      name: twigkit.search.processors.response.HostNameExtractor

      fields (java.lang.String)
      Comma-separated list of fields that should be affected by this processor.
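
For reference, the hostname of a URL is the part that `java.net.URI#getHost` returns. The sketch below only demonstrates which portion of a URL would end up in the 'site' field. The class name `HostNameSketch` and the choice to return null for unparseable values are assumptions, and the processor may extract the host differently.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class HostNameSketch {

    // Returns the host part of a URL, or null if the value is not a valid URI.
    static String hostOf(String url) {
        try {
            return new URI(url).getHost();
        } catch (URISyntaxException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // prints www.example.com
        System.out.println(hostOf("https://www.example.com/products/index.html?id=42"));
    }
}
```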

      twigkit.search.processors.response.LinkMarkupProcessor

      Process fully qualified URLs in the field’s 'actual' value, and mark them up with anchor tags so that they become active links in the display value.

      name: twigkit.search.processors.response.LinkMarkupProcessor
      fields: url

      fields (java.lang.String)
      Comma-separated list of fields that should be affected by this processor.
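
The kind of transformation involved can be sketched with a regular expression replacement. This is an illustration under assumptions, not the processor's code: the class name `LinkMarkupSketch`, the simplified URL pattern (http and https only), and the exact anchor markup are all hypothetical.

```java
import java.util.regex.Pattern;

final class LinkMarkupSketch {

    // Simplified pattern: an http(s) URL runs until whitespace, quotes or angle brackets.
    private static final Pattern URL = Pattern.compile("(https?://[^\\s<>\"]+)");

    // Builds a display value in which every URL is wrapped in an anchor tag.
    static String markUp(String actualValue) {
        return URL.matcher(actualValue).replaceAll("<a href=\"$1\">$1</a>");
    }

    public static void main(String[] args) {
        // prints See <a href="http://example.com/docs">http://example.com/docs</a> for details
        System.out.println(markUp("See http://example.com/docs for details"));
    }
}
```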

      twigkit.search.processors.response.RegExFieldValueTaggerProcessor

      Statically add metadata to documents that match a given regular expression (in one or more fields).

      name: twigkit.search.processors.response.RegExFieldValueTaggerProcessor
      fields: path
      classificationField: type
      pattern: [\w]+

      The example above looks at a field named path containing, for example, "foo/bar/bam", and uses the regex pattern to break its words out into a multivalued field named type with the values foo, bar and bam.

      If you want to change data rather than just use it, use a ReplaceFieldValue processor.

      fields (java.lang.String)
      Comma-separated list of fields that should be affected by this processor.

      classificationField (java.lang.String)
      The field that should contain the metadata classificationValue if the pattern matches.

      pattern (java.lang.String)
      The pattern to match to the values in the fields defined with the fields parameter.
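
To preview which values a pattern would produce, the pattern can be run through `java.util.regex` on a sample value. The class and method names below are hypothetical, and the sketch assumes each successive match becomes one value of the classification field, as the path example above suggests.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexTaggerSketch {

    // Collects every successive match of the pattern in the given value.
    static List<String> matches(String value, String pattern) {
        List<String> found = new ArrayList<>();
        Matcher matcher = Pattern.compile(pattern).matcher(value);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // prints [foo, bar, bam]
        System.out.println(matches("foo/bar/bam", "[\\w]+"));
    }
}
```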

      twigkit.search.processors.response.DecodeFieldValueProcessor

      Replace field values (actual, display, or both) that are HTML or URL encoded with decoded values. A use case might be to replace the display value of a URL field. Example usage:

      name: twigkit.search.processors.response.DecodeFieldValueProcessor
      fields: url_display
      encoding: url
      values: display

      fields (java.lang.String)
      Comma-separated list of fields that should be affected by this processor.

      values (java.lang.String)
      Whether to replace 'actual', 'display' or 'both' values.
      Default: 'both'

      encoding (java.lang.String)
      The encoding of the value to be decoded; 'url' or 'html'.
      Default: 'url'
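
As an illustration of what URL decoding does to a value, the standard `java.net.URLDecoder` can be used (the overload taking a `Charset` requires Java 10 or later). The class name `UrlDecodeSketch` and the choice of UTF-8 are assumptions. HTML decoding is not shown because the Java standard library has no built-in equivalent.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

final class UrlDecodeSketch {

    // Decodes percent-encoded characters (and '+' as a space) using UTF-8.
    static String decode(String encoded) {
        return URLDecoder.decode(encoded, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // prints https://example.com/my report.pdf
        System.out.println(decode("https%3A%2F%2Fexample.com%2Fmy%20report.pdf"));
    }
}
```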

      twigkit.search.processors.response.ReplaceFieldValue

      Replace field values (actual, display, or both) that match a given regular expression with a different value. The replacement value can contain back-references to matches. A common approach is to duplicate the field with a CopyFieldProcessor first, then apply the changes to the copy. Example:

      name: twigkit.search.processors.response.ReplaceFieldValue
      fields: folder
      replace: ^(.*/).*$
      with: $1

      The example above strips the file name from the end of a folder path, leaving just the path, by using a regular expression capture group and a back-reference.

      fields (java.lang.String)
      Comma-separated list of fields that should be affected by this processor.

      replace (java.lang.String)
      The pattern to replace (can contain regular expressions).

      with (java.lang.String)
      The replacement (can contain backreferences to the regular expression pattern).

      values (java.lang.String)
      Whether to replace 'actual', 'display' or 'both' values. Default: 'both'

      ignoreCase (java.lang.Boolean)
      Whether to ignore case during pattern matching. Default: false
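
The replace and with values can be tried out with plain Java regular expressions before they go into the configuration. This sketch assumes the processor uses Java regex syntax, including `$1` style back-references, as the example above suggests. The class name `ReplaceValueSketch` and the method signature are hypothetical.

```java
import java.util.regex.Pattern;

final class ReplaceValueSketch {

    // Replaces every match of the regex, optionally ignoring case.
    static String replace(String value, String regex, String replacement, boolean ignoreCase) {
        int flags = ignoreCase ? Pattern.CASE_INSENSITIVE : 0;
        return Pattern.compile(regex, flags).matcher(value).replaceAll(replacement);
    }

    public static void main(String[] args) {
        // The documented example: keep only the folder part of a path.
        // prints /docs/reports/
        System.out.println(replace("/docs/reports/q1.pdf", "^(.*/).*$", "$1", false));

        // Case handling: the extension is only removed when case is ignored.
        System.out.println(replace("Report.PDF", "\\.pdf$", "", true));  // prints Report
        System.out.println(replace("Report.PDF", "\\.pdf$", "", false)); // prints Report.PDF
    }
}
```

The greedy `(.*/)` group captures everything up to and including the last slash, which is why `$1` yields the folder path.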

      twigkit.search.processors.response.TweetMarkupProcessor

      Make Twitter users and hashtags clickable in the display value.

      name: twigkit.search.processors.response.TweetMarkupProcessor
      fields: twitter_user

      fields (java.lang.String)
      Comma-separated list of fields that should be affected by this processor.

      twigkit.search.processors.response.CopyFieldProcessor

      Duplicate a field, creating two separate instances.

      name: twigkit.search.processors.response.CopyFieldProcessor
      from: url
      to: my_url

      from (java.lang.String)
      Name of field to copy (clone).

      to (java.lang.String)
      Name to assign to the new field.

      twigkit.search.processors.response.ConcatenateFields

      Create a new field by joining multiple existing fields using a pattern expression.

      expression (java.lang.String)
      Concatenated field pattern defined using double curly braces (see below).

      target (java.lang.String)
      Name of the new field to create in the response.

      name: twigkit.search.processors.response.ConcatenateFields
      expression: {{MemberStreet1}} {{MemberStreet2}} {{MemberCity}} {{MemberState}} {{MemberZipCode}} {{MemberCountry}}
      target: compositeAddress

      Or create a new image field that uses a field value as part of an expression:

      name: twigkit.search.processors.response.ConcatenateFields
      expression: http://your/custom/path/{{MemberCity}}.jpg
      target: image_url

      Then use a <media:image> tag to output the image field in the result output:

      <media:image field-name="image_url" width="156" height="156" ... >
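
The double curly brace expressions above behave like simple templates. The following sketch shows one way such placeholders could be expanded. It is not the processor's implementation: the class name `ConcatenateSketch`, the sample field values, and the rule that a missing field becomes an empty string are assumptions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ConcatenateSketch {

    // Matches placeholders of the form {{FieldName}}.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{\\{(\\w+)\\}\\}");

    // Replaces each placeholder with the value of the named field.
    static String expand(String expression, Map<String, String> fields) {
        Matcher matcher = PLACEHOLDER.matcher(expression);
        StringBuffer result = new StringBuffer();
        while (matcher.find()) {
            // Assumption: a field that is absent contributes an empty string.
            String value = fields.getOrDefault(matcher.group(1), "");
            matcher.appendReplacement(result, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(result);
        return result.toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new HashMap<>();
        fields.put("MemberCity", "Boston"); // hypothetical document values
        fields.put("MemberState", "MA");

        // prints Boston MA
        System.out.println(expand("{{MemberCity}} {{MemberState}}", fields));
        // prints http://your/custom/path/Boston.jpg
        System.out.println(expand("http://your/custom/path/{{MemberCity}}.jpg", fields));
    }
}
```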

      twigkit.search.processors.response.LocaliseFieldValueProcessor

      Localise the values of a field using a specified bundle. For example:

      name: twigkit.search.processors.response.LocaliseFieldValueProcessor
      bundle: languages
      locale: en
      fields: language

      As an example, add a file named languages_en.properties to your class path (for example, to src/main/resources) containing these key-value pairs (truncated):

      aa = Afar
      ab = Abkhaz
      am = Amharic
      ar = Arabic
      az = Azerbaijani
      ba = Bashkir
      be = Belarusian
      bg = Bulgarian
      bm = Bamanankan
      bn = Bengali
      bo = Tibetan
      br = Breton
      bs = Bosnian
      ca = Catalan
      co = Corsican
      cr = Cree
      cs = Czech
      cy = Welsh
      etc.

      fields (java.lang.String)
      Comma-separated list of fields that should be affected by this processor.

      values (java.lang.String)
      Whether to replace 'actual', 'display' or 'both' values. Default: 'both'

      bundle (java.lang.String)
      The bundle to use.

      locale (java.lang.String)
      The locale to use.
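
The lookup itself is an ordinary Java resource bundle lookup. The sketch below builds a bundle from an in-memory string so that it runs without any files; in an application, `ResourceBundle.getBundle("languages", Locale.ENGLISH)` would instead resolve languages_en.properties from the class path. The class name `LocaliseSketch` and the decision to leave unknown values unchanged are assumptions, not documented processor behaviour.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

final class LocaliseSketch {

    // Builds a bundle from properties-file text (stands in for a file on the class path).
    static ResourceBundle bundleFrom(String properties) {
        try {
            return new PropertyResourceBundle(new StringReader(properties));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns the localised value, or the original value if the bundle has no entry.
    static String localise(ResourceBundle bundle, String value) {
        return bundle.containsKey(value) ? bundle.getString(value) : value;
    }

    public static void main(String[] args) {
        ResourceBundle bundle = bundleFrom("ar = Arabic\ncy = Welsh\n");

        System.out.println(localise(bundle, "cy")); // prints Welsh
        System.out.println(localise(bundle, "xx")); // prints xx
    }
}
```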