JSON Parsing Index Stage

A JSON Parsing Index stage (previously called the JSON Parser stage) parses JSON content from a document field into one or more new documents.

This stage uses Solr’s JsonRecordReader to split JSON into sub-documents. For details on how JsonRecordReader is used in Solr, see the Lucidworks blog post "Indexing Custom JSON Data".

Example Specification, Data, Results

Stage Specification

{
  "type": "json-parsing",
  "skip": false,
  "id": "json-parsing",
  "sourceField": "data",
  "splitPath": "/exams",
  "mappingRules": [
      {"path": "/first", "field": "first"},
      {"path": "/last", "field": "last"},
      {"path": "/grade", "field": "grade"},
      {"path": "/exams/subject", "field": "subject"},
      {"path": "/exams/test", "field": "test"},
      {"path": "/exams/marks", "field": "marks"}
  ]
}

Data

{
  "first": "John",
  "last": "Doe",
  "grade": 8,
  "exams": [
      {
        "subject": "Maths",
        "test": "term1",
        "marks": 90
      },
      {
        "subject": "Biology",
        "test": "term1",
        "marks": 86
      }
  ]
}

Results

Parsing this data, using the splitPath "/exams" and the six mapping rules above, produces two documents, one for each object in the list of exams.

The first document has the following field-value pairs:

* first : John
* last : Doe
* grade : 8
* test : term1
* subject: Maths
* marks : 90

The second document has the following field-value pairs:

* first : John
* last : Doe
* grade : 8
* test : term1
* subject: Biology
* marks : 86
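
The split-and-map behavior described above can be approximated with a short Python sketch. This is a simplified illustration only, not the JsonRecordReader implementation: the helper names `resolve` and `split_and_map` are hypothetical, and the sketch assumes plain `/key` paths with no wildcards or nested arrays. It emits one document per object found at the split path and copies fields from outside the split path into every document.

```python
import json


def resolve(node, parts):
    """Follow a list of keys down from node; return None if a key is missing."""
    for key in parts:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def split_and_map(record, split_path, mapping_rules):
    """Hypothetical helper: emit one flat document per object under split_path."""
    split_parts = [p for p in split_path.split("/") if p]
    children = resolve(record, split_parts)
    if children is None:
        return []
    if not isinstance(children, list):
        children = [children]  # treat a single object as a one-element list

    docs = []
    for child in children:
        doc = {}
        for rule in mapping_rules:
            parts = [p for p in rule["path"].split("/") if p]
            if parts[:len(split_parts)] == split_parts:
                # Path lies under the split path: read from the current child.
                value = resolve(child, parts[len(split_parts):])
            else:
                # Path lies outside the split path: value is shared by all docs.
                value = resolve(record, parts)
            if value is not None:
                doc[rule["field"]] = value
        docs.append(doc)
    return docs


# A record shaped like the example data, with the same six mapping rules.
record = json.loads("""
{
  "first": "John", "last": "Doe", "grade": 8,
  "exams": [
    {"subject": "Maths", "test": "term1", "marks": 90},
    {"subject": "Biology", "test": "term1", "marks": 86}
  ]
}
""")

rules = [
    {"path": "/first", "field": "first"},
    {"path": "/last", "field": "last"},
    {"path": "/grade", "field": "grade"},
    {"path": "/exams/subject", "field": "subject"},
    {"path": "/exams/test", "field": "test"},
    {"path": "/exams/marks", "field": "marks"},
]

docs = split_and_map(record, "/exams", rules)
for doc in docs:
    print(doc)
```

In this sketch, the fields first, last, and grade are repeated in both output documents because their paths sit outside "/exams", while subject, test, and marks come from the individual exam object.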

Configuration