Indexing Custom JSON Data

by Noble Paul
August 12, 2014

Solr already supports update requests in JSON format, but only in Solr's own JSON format, not your own custom JSON. Now (with SOLR-6304, version 4.10 onwards), Solr accepts any JSON document, and the document can be indexed in the required format.

Transforming and Indexing custom JSON data

The objective of this feature is to help users index any JSON into a valid Solr document according to the user's preference. It lets the user split a single JSON file into one or more Solr documents. The final indexed document can be controlled using the mapping passed along with the request. One or more valid JSON documents can be sent to the /update/json/docs path with the configuration params.

Mapping params

split : This parameter is required if you wish to transform the input JSON. It is the path at which the JSON must be split. If the entire JSON makes a single Solr document, the path must be “/”.
f : This is a multivalued mapping parameter. At least one field mapping must be provided. The format of the parameter is {target-field-name}:{json-path}. The json-path is required. The target-field-name is the name of the field in the input Solr document; it is optional and, if omitted, is derived automatically from the input JSON.
echo : This is for debugging. Set it to true if you want the docs to be returned as a response. Nothing will be indexed.
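A dry run with echo=true is a convenient way to check a mapping before indexing anything. The following is a minimal sketch, not part of Solr itself: the helper names echo_url and preview_docs are hypothetical, and the base URL is assumed from the examples in this post.

```shell
# Assumed base URL, taken from the examples in this post.
SOLR_URL='http://localhost:8983/solr/collection1'

# Hypothetical helper: build the /update/json/docs URL from a split path
# followed by any number of f mappings, with echo=true appended.
echo_url() {
  split=$1
  shift
  url="${SOLR_URL}/update/json/docs?split=${split}"
  for mapping in "$@"; do
    url="${url}&f=${mapping}"
  done
  printf '%s&echo=true\n' "$url"
}

# Hypothetical helper: POST a JSON file to that URL. With echo=true the
# mapped docs should come back in the response and nothing is indexed.
preview_docs() {
  json_file=$1
  shift
  curl "$(echo_url "$@")" -H 'Content-type:application/json' --data-binary "@${json_file}"
}
```

With a running Solr instance, something like preview_docs student.json /exams /first /exams/subject (where student.json is a placeholder file name) would show the documents the mapping produces.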

Example 1:

curl 'http://localhost:8983/solr/collection1/update/json/docs'\
'?split=/exams'\
'&f=first:/first'\
'&f=last:/last'\
'&f=grade:/grade'\
'&f=subject:/exams/subject'\
'&f=test:/exams/test'\
'&f=marks:/exams/marks'\
 -H 'Content-type:application/json' -d '
{
  "first": "John",
  "last": "Doe",
  "grade": 8,
  "exams": [
    {
      "subject": "Maths",
      "test": "term1",
      "marks": 90
    },
    {
      "subject": "Biology",
      "test": "term1",
      "marks": 86
    }
  ]
}'

This indexes the following two docs:

{
  "first": "John",
  "last": "Doe",
  "marks": 90,
  "test": "term1",
  "subject": "Maths",
  "grade": 8
}
{
  "first": "John",
  "last": "Doe",
  "marks": 86,
  "test": "term1",
  "subject": "Biology",
  "grade": 8
}

As the final field names are the same as the input document fields, the request can be simplified by leaving out the target field names.

Example 2:

curl 'http://localhost:8983/solr/collection1/update/json/docs'\
'?split=/exams'\
'&f=/first'\
'&f=/last'\
'&f=/grade'\
'&f=/exams/subject'\
'&f=/exams/test'\
'&f=/exams/marks'\
 -H 'Content-type:application/json' -d '
{
  "first": "John",
  "last": "Doe",
  "grade": 8,
  "exams": [
    {
      "subject": "Maths",
      "test": "term1",
      "marks": 90
    },
    {
      "subject": "Biology",
      "test": "term1",
      "marks": 86
    }
  ]
}'

Wildcards

Instead of specifying all the field names explicitly, it is possible to specify a wildcard “*” or a double wildcard “**” to map fields automatically. The constraint is that wildcards can be used only at the end of the json-path. The split path cannot use wildcards. The following are example wildcard path mappings:

f=/docs/* : maps all the fields under /docs, using the names given in the JSON
f=/docs/** : maps all the fields under /docs and its children, using the names given in the JSON
f=searchField:/docs/* : maps all fields under /docs to a single field called ‘searchField’
f=searchField:/docs/** : maps all fields under /docs and its children to searchField
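The two wildcard constraints stated above can be checked on the client before a request is sent. This is only a sketch of the rules as described in this post, not Solr's own validation; the function names valid_mapping and valid_split are hypothetical.

```shell
# Hypothetical check: succeeds if an f mapping uses a wildcard only as the
# final segment of its json-path (or uses no wildcard at all).
valid_mapping() {
  path=${1#*:}            # drop the optional "target-field-name:" prefix
  stem=$path
  case $path in
    */'**') stem=${path%/'**'} ;;   # strip a trailing /**
    */'*')  stem=${path%/'*'} ;;    # strip a trailing /*
  esac
  case $stem in
    *'*'*) return 1 ;;    # a wildcard remains somewhere before the end
    *)     return 0 ;;
  esac
}

# Hypothetical check: succeeds if a split path contains no wildcard.
valid_split() {
  case $1 in
    *'*'*) return 1 ;;
    *)     return 0 ;;
  esac
}
```

For instance, valid_mapping 'searchField:/docs/**' succeeds, while valid_mapping '/docs/*/name' and valid_split '/exams/*' fail.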

With wildcards we can simplify our previous example as follows:

Example 3:

curl 'http://localhost:8983/solr/collection1/update/json/docs'\
'?split=/exams'\
'&f=/**'\
 -H 'Content-type:application/json' -d '
{
  "first": "John",
  "last": "Doe",
  "grade": 8,
  "exams": [
    {
      "subject": "Maths",
      "test": "term1",
      "marks": 90
    },
    {
      "subject": "Biology",
      "test": "term1",
      "marks": 86
    }
  ]
}'

It is also possible to send all the values to a single field and do a full-text search on that field. This is a good option for blindly indexing and querying JSON documents without worrying about fields and schema.

Example 4:

curl 'http://localhost:8983/solr/collection1/update/json/docs'\
'?split=/'\
'&f=txt:/**'\
 -H 'Content-type:application/json' -d '
{
  "first": "John",
  "last": "Doe",
  "grade": 8,
  "exams": [
    {
      "subject": "Maths",
      "test": "term1",
      "marks": 90
    },
    {
      "subject": "Biology",
      "test": "term1",
      "marks": 86
    }
  ]
}'
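Once a document has been indexed this way, it can be found through any of its values by querying the catch-all field. The sketch below only builds the query URL. It assumes that a searchable text field named txt exists in the schema, that the indexed documents have been committed, and that the standard /select handler is available; search_url is a hypothetical helper name.

```shell
# Assumed base URL, taken from the examples in this post.
SOLR_URL='http://localhost:8983/solr/collection1'

# Hypothetical helper: build a /select URL that searches one field for one term.
# The term is inserted as-is, so terms with spaces or symbols need URL encoding.
search_url() {
  field=$1
  term=$2
  printf '%s/select?q=%s:%s&wt=json\n' "$SOLR_URL" "$field" "$term"
}

# Print the URL that looks for "Biology" anywhere in the flattened document.
search_url txt Biology
```

Against a running instance, curl "$(search_url txt Biology)" would send the query.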
