Fusion Server

Version 4.1

      Include Documents Index Stage

      This stage passes a document to the next stage in the pipeline if it matches one or more of the specified rules (Boolean OR). If a field has multiple values, at least one value must match the specified pattern. All non-matching documents are dropped. Rules are defined as regular expressions that are matched against field values.
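      The matching behavior described above can be sketched in a few lines of Python. This is only an illustration of the semantics, not Fusion's implementation: the function name include_doc is hypothetical, and the sketch assumes the pattern must match the whole field value, which the documentation does not state explicitly.

```python
import re

def include_doc(doc, match_rules):
    """Return True if the document matches at least one rule (Boolean OR)."""
    for rule in match_rules:
        pattern = re.compile(rule["pattern"])
        values = doc.get(rule["field"], [])
        # Treat a single-valued field as a one-element list.
        if not isinstance(values, list):
            values = [values]
        # Assumption: the pattern must match the entire value.
        if any(pattern.fullmatch(str(v)) for v in values):
            return True
    return False

rules = [{"field": "document_type", "pattern": "(xls|xlsx|xlst|doc|docx)"}]
docs = [
    {"document_type": "txt"},            # no value matches: dropped
    {"document_type": "xls"},            # matches: kept
    {"document_type": ["txt", "docx"]},  # one of several values matches: kept
]
kept = [d for d in docs if include_doc(d, rules)]
```

      Under these assumptions, only the second and third documents remain in kept.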

      Examples

      Create a "simple-include" pipeline with a stage that includes only certain document types:
      curl -u user:pass -X POST -H "Content-type: application/json" 'http://localhost:8764/api/index-pipelines' -d '
      {
        "id" : "simple-include",
        "stages" : [ {
          "type" : "include-doc",
          "matchRules" : [ {
              "field" : "document_type",
              "pattern" : "(xls|xlsx|xlst|doc|docx)"
          }]
        }]
      }'

      Response:

      {
        "id" : "simple-include",
        "stages" : [ {
          "type" : "include-doc",
          "id" : "f701f96b-780e-4355-9dd3-6e53a89afe3e",
          "matchRules" : [ {
            "field" : "document_type",
            "pattern" : "(xls|xlsx|xlst|doc|docx)"
          } ],
          "type" : "include-doc",
          "skip" : false,
          "label" : "include-doc"
        } ],
        "properties" : { }
      }

      Send a text document through the "simple-include" pipeline:
      curl -u user:pass 'http://localhost:8764/api/index-pipelines/simple-include/collections/logs/index?simulate=true&echo=true' -H 'Content-type: application/json' -d '
      {
        "document_type": "txt"
      }'

      The empty response indicates the document was dropped:

      [ ]

      Send an XLS document through the pipeline:
      curl -u user:pass 'http://localhost:8764/api/index-pipelines/simple-include/collections/logs/index?simulate=true&echo=true' -H 'Content-type: application/json' -d '
      {
        "document_type": "xls"
      }'

      The response is document metadata, indicating the document passed the stage:

      [ {
        "id" : "9e7d1c2e-343a-49de-bc6a-1d1fc25fa93f",
        "fields" : [ {
          "name" : "document_type",
          "value" : "xls",
          "metadata" : { },
          "annotations" : [ ]
        } ]
      } ]
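      A script that runs these simulations can tell the two outcomes apart by checking whether the response is empty. The following is a minimal sketch, assuming the response body is a JSON array as the empty [ ] response suggests; the helper name passed_stage and the sample response strings are hypothetical.

```python
import json

def passed_stage(response_text):
    """Return True if the simulated response contains at least one document."""
    docs = json.loads(response_text)
    return isinstance(docs, list) and len(docs) > 0

# An empty array means the document was dropped by the stage.
dropped_result = passed_stage("[ ]")

# A non-empty array means the document passed the stage.
kept_result = passed_stage(
    '[ {"id": "example", "fields": [ {"name": "document_type", "value": "xls"} ] } ]'
)
```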

      Configuration

      When entering configuration values in the UI, use unescaped characters, such as \t for the tab character. When entering configuration values in the API, use escaped characters, such as \\t for the tab character.
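      The difference between the two forms comes from JSON string escaping: a backslash inside a JSON string must itself be escaped. A small Python sketch, assuming the API payload is built with a standard JSON encoder (the field name "body" is hypothetical):

```python
import json

# What you would type in the UI: two characters, a backslash followed by "t".
ui_value = r"\t"

stage = {
    "type": "include-doc",
    "matchRules": [{"field": "body", "pattern": ui_value}],
}

# Serializing to JSON escapes the backslash, so the payload text
# sent to the API contains \\t.
payload = json.dumps(stage)

# Decoding the payload recovers the original two-character value.
round_trip = json.loads(payload)["matchRules"][0]["pattern"]
```

      Building the payload with a JSON encoder, rather than by hand, applies this escaping automatically.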

      When using Fusion's REST API, the ID for this stage is "include-doc"; this is the value given as "type" in the examples above.
