Reporting API

The Reporting API provides a set of reports on key metrics for user searches over a primary Fusion collection. The reports are run against the auxiliary searchLogs and signals collections. Reports include:

  • 'topClicked' - documents which have received the most clicks.

  • 'topQueries' - queries ordered by frequency.

  • 'lessThanN' - queries which return fewer than N documents.

  • 'topN' - query terms ordered by frequency.

  • 'histo' - a histogram over all queries binned by length of query execution time.

  • 'dateHisto' - shows relative rate of queries over time.

The reports available for a particular collection depend on whether the auxiliary searchLogs and signals collections have been created. Collections created with the Fusion UI have both searchLogs and signals by default. The 'topClicked' report requires both the searchLogs and signals auxiliary collections. All other reports require only the searchLogs auxiliary collection.

Report Configuration Information

POST requests should always send a JSON object containing the report configuration. If no special configuration is required, send the empty object "{}".

The report configuration object contains the following attributes, depending on the report:

  • "n" value is a positive integer, used for both "topN" and "lessThanN" reports. E.g.: "n" : 1

  • "num" value is a positive integer, used for the "topClicked" report. E.g.: "num" : 5

  • "field" : required for report "topN", specifies the field where the search terms are stored. Default field is "q_txt". E.g.: "field" : "q_txt". See Search Query Reporting for details.

  • "rangeStart", "rangeEnd", "interval" : attributes used to restrict histogram report range and set bin size accordingly. E.g.: "rangeStart": 0, "rangeEnd": 1000, "interval" : 1000

  • "dateRangeStart", "dateRangeEnd", "timeInterval" : attributes required for date range histogram report. E.g.: "dateRangeStart": "NOW/DAY-1DAY", "dateRangeEnd": "NOW/DAY+1DAY", "timeInterval": "+1DAY"

Report Request Syntax

Allowed requests are of the following form:

  • GET /api/apollo/reports/<collection_name>/ - lists the reports available for the Fusion primary collection specified by path parameter <collection_name>.

  • POST /api/apollo/reports/<collection_name>/<report_name>/ - runs a report for collection specified by path parameter <collection_name>. Report configuration is sent in the body of the POST request as JSON.

  • GET /api/apollo/reports/<collection_name>/<report_name>/<item_ID> - gets detailed information for a specific item returned in a report. Used to drill down on specific queries or query terms.
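
The three request forms above can be wrapped in small shell helpers. The following is a minimal sketch, not part of Fusion: the function names and the FUSION_BASE and FUSION_AUTH variables are hypothetical, and the default host and credentials are the placeholder values used in this document's curl examples.

```shell
# Hypothetical helpers; names and variables are illustrative, not part of Fusion.
FUSION_BASE="${FUSION_BASE:-http://localhost:8764}"
FUSION_AUTH="${FUSION_AUTH:-user:pass}"

# Build a reports URL from a collection name, an optional report name,
# and an optional item ID (the "token" value from a report result).
report_url() {
  url="$FUSION_BASE/api/apollo/reports/$1"
  if [ -n "${2:-}" ]; then url="$url/$2"; fi
  if [ -n "${3:-}" ]; then url="$url/$3"; fi
  printf '%s\n' "$url"
}

# GET: list the reports for a collection, or drill down on one item
# when a report name and item ID are also given.
get_report() {
  curl -u "$FUSION_AUTH" "$(report_url "$@")"
}

# POST: run a report. The third argument is the JSON configuration;
# the empty object {} is sent when it is omitted.
run_report() {
  config="${3:-}"
  if [ -z "$config" ]; then config='{}'; fi
  curl -u "$FUSION_AUTH" -X POST -H 'Content-type: application/json' \
    -d "$config" "$(report_url "$1" "$2")"
}
```

With these definitions, run_report bb_catalog lessThanN '{"n":1}' would send the same request as the "lessThanN" example in the Examples section, and get_report bb_catalog would list the available reports.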

Format of Report Results

All reports return a JSON list of objects or the empty list.

Contents of the object vary according to the report. The following attribute names are used across all reports:

  • "key" : matching query terms

  • "count" : raw count for this object

  • "percentage" : normalized count expressed as the proportion of the total items in the current report represented by this item, a real number between 0.0 and 1.0.

  • "token" or "item" : ID of token or item.

Examples

See reports available for collection "bb_catalog" which has searchLogs and signals enabled:

> curl -u user:pass http://localhost:8764/api/apollo/reports/bb_catalog

[ "histo", "topClicked", "topQueries", "lessThanN", "dateHisto", "topN" ]

See reports available for system collection "system_blobs" which doesn’t have any auxiliary collections:

> curl -u user:pass http://localhost:8764/api/apollo/reports/system_blobs

[ ]

Identify queries over collection "bb_catalog" for which no matching documents are found, i.e., queries which return fewer than 1 result:

> curl -u user:pass -X POST -H 'Content-type: application/json' -d @- \
> http://localhost:8764/api/apollo/reports/bb_catalog/lessThanN \
> <<EOF
> {"n":1}
> EOF

[ {
  "key" : "ipad",
  "count" : 3,
  "percentage" : 0.375,
  "token" : "eyJmaWx0ZXJzIjpbIm51bWRvY3NfbDpbKiBUTyAxXSIsInFfczppcGFkIl19"
}, {
  "key" : "id:2125233",
  "count" : 2,
  "percentage" : 0.25,
  "token" : "eyJmaWx0ZXJzIjpbIm51bWRvY3NfbDpbKiBUTyAxXSIsInFfczppZFxcOjIxMjUyMzMiXX0="
}, {
  "key" : "ipod",
  "count" : 1,
  "percentage" : 0.125,
  "token" : "eyJmaWx0ZXJzIjpbIm51bWRvY3NfbDpbKiBUTyAxXSIsInFfczppcG9kIl19"
}, {
  "key" : "typewriter",
  "count" : 1,
  "percentage" : 0.125,
  "token" : "eyJmaWx0ZXJzIjpbIm51bWRvY3NfbDpbKiBUTyAxXSIsInFfczp0eXBld3JpdGVyIl19"
}, {
  "key" : "unicorn",
  "count" : 1,
  "percentage" : 0.125,
  "token" : "eyJmaWx0ZXJzIjpbIm51bWRvY3NfbDpbKiBUTyAxXSIsInFfczp1bmljb3JuIl19"
} ]
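
As a worked check of the "percentage" attribute, the counts in this output sum to 8, and each percentage equals the item's count divided by that total (for example, 3 / 8 = 0.375). The sketch below recomputes the values with awk. The function name is hypothetical, and the total equalling the sum of the listed counts is an assumption that holds for this sample output only.

```shell
# Print each count followed by its share of the total of all counts given.
# Assumes at least one non-zero count, otherwise awk divides by zero.
shares() {
  printf '%s\n' "$@" | awk '
    { c[NR] = $1; total += $1 }
    END { for (i = 1; i <= NR; i++) printf "%d %.3f\n", c[i], c[i] / total }'
}

# Counts copied from the lessThanN output above.
shares 3 2 1 1 1
```

The first line printed is "3 0.375", matching the "ipad" entry; the remaining lines give 0.250 and 0.125, matching the other entries.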

Drill down on the "lessThanN" report to examine the logged queries for "key" : "ipad" by token ID:

> curl -u user:pass \
> http://localhost:8764/api/apollo/reports/bb_catalog/lessThanN/eyJmaWx0ZXJzIjpbIm51bWRvY3NfbDpbKiBUTyAxXSIsInFfczppcGFkIl19

{
  "numFound" : 3,
  "start" : 0,
  "maxScore" : 0.0,
  "docs" : [ {
    "id" : "cdcdd42c-66f2-499e-a940-33d980596d36",
    "collection_s" : "bb_catalog",
    "q_txt" : [ "ipad" ],
    "q_s" : "ipad",
    "qtime_l" : 1,
    "totaltime_l" : 2,
    "numdocs_l" : 0,
    "timestamp_dt" : "2015-08-31T20:40:48.096Z",
    "httpmethod_s" : "POST",
    "req_q_ss" : [ "ipad" ],
    "req_debug_ss" : [ "true" ],
    "req_json.nl_ss" : [ "arrarr" ],
    "req_echoParams_ss" : [ "all" ],
    "req_lw.pipelineId_ss" : [ "bb_catalog-default" ],
    "req_fl_ss" : [ "*,score" ],
    "req_start_ss" : [ "0" ],
    "req_isFusionQuery_ss" : [ "true" ],
    "req_sort_ss" : [ "score desc" ],
    "req_rows_ss" : [ "10" ],
    "req_wt_ss" : [ "json" ],
    "_version_" : 1511054270105911296
  }, {
    "id" : "d4e22f4e-ae27-4662-82ed-17c68111f0d5",
    "collection_s" : "bb_catalog",
    "q_txt" : [ "ipad" ],
    "q_s" : "ipad",
    "qtime_l" : 3,
    "totaltime_l" : 4,
    "numdocs_l" : 0,
    "timestamp_dt" : "2015-09-01T13:45:28.008Z",
    "httpmethod_s" : "POST",
    "req_debug_ss" : [ "true" ],
    "req_json.nl_ss" : [ "arrarr" ],
    "req_echoParams_ss" : [ "all" ],
    "req_lw.pipelineId_ss" : [ "bb_catalog-default" ],
    "req_fl_ss" : [ "*,score" ],
    "req_start_ss" : [ "0" ],
    "req_isFusionQuery_ss" : [ "true" ],
    "req_rows_ss" : [ "10" ],
    "req_bq_ss" : [ "id:1945531^4.090439397841692", "id:2339322^1.5108471289277077", "id:1945595^1.0636971555650234", "id:1945674^0.40656840801239014", "id:2842056^0.33429211378097534", "id:2408224^0.43880610167980194", "id:2339386^0.39254774153232574", "id:2319133^0.32736557722091675", "id:9924603^0.19560790061950684", "id:1432551^0.18906432390213013" ],
    "req_q_ss" : [ "ipad" ],
    "req_defType_ss" : [ "edismax" ],
    "req_wt_ss" : [ "json" ],
    "req_facet_ss" : [ "true" ],
    "_version_" : 1511118736467165184
  }, {
    "id" : "a249d93c-9232-4ea7-a99a-fcf01b6c2c2f",
    "collection_s" : "bb_catalog",
    "q_txt" : [ "ipad" ],
    "q_s" : "ipad",
    "qtime_l" : 0,
    "totaltime_l" : 2,
    "numdocs_l" : 0,
    "timestamp_dt" : "2015-09-01T13:46:41.309Z",
    "httpmethod_s" : "POST",
    "req_q_ss" : [ "ipad" ],
    "req_debug_ss" : [ "true" ],
    "req_json.nl_ss" : [ "arrarr" ],
    "req_echoParams_ss" : [ "all" ],
    "req_lw.pipelineId_ss" : [ "bb_catalog-default" ],
    "req_fl_ss" : [ "*,score" ],
    "req_start_ss" : [ "0" ],
    "req_isFusionQuery_ss" : [ "true" ],
    "req_rows_ss" : [ "10" ],
    "req_wt_ss" : [ "json" ],
    "req_facet_ss" : [ "true" ],
    "req_bq_ss" : [ "id:1945531^4.090439397841692", "id:2339322^1.5108471289277077", "id:1945595^1.0636971555650234", "id:1945674^0.40656840801239014", "id:2842056^0.33429211378097534", "id:2408224^0.43880610167980194", "id:2339386^0.39254774153232574", "id:2319133^0.32736557722091675", "id:9924603^0.19560790061950684", "id:1432551^0.18906432390213013" ],
    "_version_" : 1511118813327785984
  } ]
}
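
The token values in these examples appear to be Base64-encoded JSON. This is an observation from the sample data rather than documented behavior, so treat the format as an assumption. The sketch below decodes the "ipad" token with the standard base64 utility; the function name is hypothetical, and -d is the GNU coreutils spelling of the decode flag (some systems use -D or --decode instead).

```shell
# Decode a report token, assuming it is standard Base64 text.
decode_token() {
  printf '%s' "$1" | base64 -d
}

# Token copied from the "ipad" entry of the lessThanN report.
decode_token 'eyJmaWx0ZXJzIjpbIm51bWRvY3NfbDpbKiBUTyAxXSIsInFfczppcGFkIl19'
echo
```

For this token the decoded text is {"filters":["numdocs_l:[* TO 1]","q_s:ipad"]}, which suggests that the drill-down request filters the search log on the result count and the query string.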

Get all of the top queries, regardless of date, by passing in an empty configuration (no date range):

> curl -u user:pass -X POST -H 'Content-type: application/json' -d '{}' \
> http://localhost:8764/api/apollo/reports/bb_catalog/topQueries

[ {
  "key" : "ipad",
  "count" : 42,
  "percentage" : 0.7118644,
  "token" : "eyJmaWx0ZXJzIjpbInFfczppcGFkIl19"
}, {
  "key" : "*:*",
  "count" : 10,
  "percentage" : 0.16949153,
  "token" : "eyJmaWx0ZXJzIjpbInFfczpcXCpcXDpcXCoiXX0="
}, {
  "key" : "id:2125233",
  "count" : 2,
  "percentage" : 0.033898305,
  "token" : "eyJmaWx0ZXJzIjpbInFfczppZFxcOjIxMjUyMzMiXX0="
}, {
  "key" : "typewriter",
  "count" : 2,
  "percentage" : 0.033898305,
  "token" : "eyJmaWx0ZXJzIjpbInFfczp0eXBld3JpdGVyIl19"
}, {
  "key" : "unicorn",
  "count" : 2,
  "percentage" : 0.033898305,
  "token" : "eyJmaWx0ZXJzIjpbInFfczp1bmljb3JuIl19"
}, {
  "key" : "ipod",
  "count" : 1,
  "percentage" : 0.016949153,
  "token" : "eyJmaWx0ZXJzIjpbInFfczppcG9kIl19"
} ]

Drill down on the "topQueries" report for the item with "key" : "ipod", "token" : "eyJmaWx0ZXJzIjpbInFfczppcG9kIl19":

> curl -u user:pass \
> http://localhost:8764/api/apollo/reports/bb_catalog/topQueries/eyJmaWx0ZXJzIjpbInFfczppcG9kIl19

{
  "numFound" : 1,
  "start" : 0,
  "maxScore" : 0.0,
  "docs" : [ {
    "id" : "4a6f7f5e-3d13-4f20-b59e-6188ce4c5783",
    "collection_s" : "bb_catalog",
    "q_txt" : [ "ipod" ],
    "q_s" : "ipod",
    "qtime_l" : 1,
    "totaltime_l" : 2,
    "numdocs_l" : 0,
    "timestamp_dt" : "2016-04-05T17:51:56.197Z",
    "httpmethod_s" : "POST",
    "req_debug_ss" : [ "true" ],
    "req_json.nl_ss" : [ "arrarr" ],
    "req_echoParams_ss" : [ "all" ],
    "req_lw.pipelineId_ss" : [ "default" ],
    "req_fl_ss" : [ "*,score" ],
    "req_start_ss" : [ "0" ],
    "req_isFusionQuery_ss" : [ "true" ],
    "req_rows_ss" : [ "10" ],
    "req_q_ss" : [ "ipod" ],
    "req_defType_ss" : [ "edismax" ],
    "req_qf_ss" : [ "doc_id_s" ],
    "req_wt_ss" : [ "json" ],
    "req_facet_ss" : [ "true" ],
    "_version_" : 1530793784714985472
  } ]
}

Run the "topN" report over collection "bb_catalog", returning the top-ranking query term from search field "q_txt":

> curl -u user:pass -X POST -H 'Content-type: application/json' -d @- \
> http://localhost:8764/api/apollo/reports/bb_catalog/topN \
> <<EOF
> { "num" : 1, "field" : "q_txt" }
> EOF

[ {
  "key" : "ipad",
  "count" : 42,
  "percentage" : 0.7118644,
  "token" : "eyJmaWx0ZXJzIjpbInFfdHh0OmlwYWQiXX0="
} ]

Run "topClicked" report, return 5 most-clicked documents:

> curl -u user:pass -X POST -H 'Content-type: application/json' -d @- \
> http://localhost:8764/api/apollo/reports/bb_catalog/topClicked \
> <<EOF
> {"num":5}
> EOF

[ {
  "key" : "2842056",
  "count" : 42636,
  "percentage" : 0.0107869385,
  "token" : "eyJmaWx0ZXJzIjpbInR5cGVfczpjbGljayIsImRvY19pZF9zOjI4NDIwNTYiXX0="
}, {
  "key" : "1945531",
  "count" : 23510,
  "percentage" : 0.0059480467,
  "token" : "eyJmaWx0ZXJzIjpbInR5cGVfczpjbGljayIsImRvY19pZF9zOjE5NDU1MzEiXX0="
}, {
  "key" : "2842092",
  "count" : 22683,
  "percentage" : 0.0057388153,
  "token" : "eyJmaWx0ZXJzIjpbInR5cGVfczpjbGljayIsImRvY19pZF9zOjI4NDIwOTIiXX0="
}, {
  "key" : "9225377",
  "count" : 21603,
  "percentage" : 0.0054655746,
  "token" : "eyJmaWx0ZXJzIjpbInR5cGVfczpjbGljayIsImRvY19pZF9zOjkyMjUzNzciXX0="
}, {
  "key" : "9755322",
  "count" : 20993,
  "percentage" : 0.005311244,
  "token" : "eyJmaWx0ZXJzIjpbInR5cGVfczpjbGljayIsImRvY19pZF9zOjk3NTUzMjIiXX0="
} ]

Get a histogram of the number of documents returned for queries over range 0 to 2000, interval 500 (4 bins):

> curl -u user:pass -X POST -H 'Content-type: application/json' -d @- \
> http://localhost:8764/api/apollo/reports/bb_catalog/histo \
> <<EOF
> {"field": "numdocs_l", "rangeStart": 0, "rangeEnd": 2000, "interval": 500}
> EOF

[ {
  "key" : "0",
  "count" : 10,
  "percentage" : 0.16949153,
  "token" : "eyJmaWx0ZXJzIjpbIm51bWRvY3NfbDpbMCBUTyA1MDB9Il19"
}, {
  "key" : "500",
  "count" : 0,
  "percentage" : 0.0,
  "token" : "eyJmaWx0ZXJzIjpbIm51bWRvY3NfbDpbNTAwIFRPIDEwMDB9Il19"
}, {
  "key" : "1000",
  "count" : 39,
  "percentage" : 0.66101694,
  "token" : "eyJmaWx0ZXJzIjpbIm51bWRvY3NfbDpbMTAwMCBUTyAxNTAwfSJdfQ=="
}, {
  "key" : "1500",
  "count" : 0,
  "percentage" : 0.0,
  "token" : "eyJmaWx0ZXJzIjpbIm51bWRvY3NfbDpbMTUwMCBUTyAyMDAwfSJdfQ=="
} ]
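
The bin keys in this output follow from the configuration: starting at "rangeStart", one bin begins every "interval" units until "rangeEnd" is reached, giving (2000 - 0) / 500 = 4 bins. A minimal sketch of that arithmetic, with a hypothetical function name and integer arguments assumed:

```shell
# Print the lower bound (the "key") of each histogram bin.
# Arguments: rangeStart rangeEnd interval (all integers, interval > 0).
histo_bins() {
  start=$1
  end=$2
  interval=$3
  # Guard against a zero or negative interval, which would loop forever.
  [ "$interval" -gt 0 ] || return 1
  while [ "$start" -lt "$end" ]; do
    printf '%s\n' "$start"
    start=$((start + interval))
  done
}

histo_bins 0 2000 500
```

This prints 0, 500, 1000, and 1500, one per line, which are the four keys in the report above.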

Get a date histogram for the last two days, with an interval of 1 day:

> curl -u user:pass -X POST -H 'Content-type: application/json' -d @- \
> http://localhost:8764/api/apollo/reports/bb_catalog/dateHisto \
> <<EOF
> {"dateRangeStart": "NOW/DAY-1DAY", "dateRangeEnd": "NOW/DAY+1DAY", "timeInterval": "+1DAY"}
> EOF

[ {
  "key" : "2016-04-04T00:00:00Z",
  "count" : 0,
  "percentage" : 0.0,
  "token" : "eyJmaWx0ZXJzIjpbInRpbWVzdGFtcF9kdDpbTk9XL0RBWS0xREFZIFRPIE5PVy9EQVkrMURBWV0iLCJ0aW1lc3RhbXBfZHQ6WzIwMTZcXC0wNFxcLTA0VDAwXFw6MDBcXDowMFogVE8gMjAxNi0wNC0wNFQwMDowMDowMForMURBWX0iXX0="
}, {
  "key" : "2016-04-05T00:00:00Z",
  "count" : 7,
  "percentage" : 1.0,
  "token" : "eyJmaWx0ZXJzIjpbInRpbWVzdGFtcF9kdDpbTk9XL0RBWS0xREFZIFRPIE5PVy9EQVkrMURBWV0iLCJ0aW1lc3RhbXBfZHQ6WzIwMTZcXC0wNFxcLTA1VDAwXFw6MDBcXDowMFogVE8gMjAxNi0wNC0wNVQwMDowMDowMForMURBWX0iXX0="
} ]