
I have a deeply nested document (pseudo-structure shown below):

[{
    "id": "1",
    "company_id": "1",
    "company_name": "company_1",
    "departments": [{
        "dep1": [{
            "id": 40,
            "name": "xyz"
        },
        {
            "id": 41,
            "name": "xyr"
        }],
        "dep2": [{
        }]
    }],
    "employeePrograms": [{
    }]
}]

How can I index this type of document in Apache Solr? The documentation only covers immediate child documents.

Altenrion

1 Answer


Unfortunately, I don't have much experience with this technology, but I want to help. Here is some official documentation that might be useful: official doc, more specific

If you run into a specific problem, such as an error message, describe it and I will do my best to help.

Upd1: Solr can only maintain a 'flat' representation of the data. What you were trying to do is not really possible. There are a number of workarounds, such as using dynamic fields or using a Solr join to link multiple data sets.
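As a rough illustration of the dynamic-fields workaround, the sketch below flattens a nested structure into a single flat document before it is sent to Solr. The `flatten` helper and the underscore-joined key naming are assumptions made for this example, not anything Solr prescribes; the resulting keys would still need to match a `dynamicField` pattern defined in your schema.

```python
def flatten(value, prefix=""):
    """Flatten nested dicts/lists into one flat dict keyed by the path to each leaf."""
    flat = {}
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}_{key}" if prefix else key
            flat.update(flatten(child, path))
    elif isinstance(value, list):
        # List positions become part of the key (hypothetical naming scheme).
        for index, child in enumerate(value):
            flat.update(flatten(child, f"{prefix}_{index}"))
    else:
        flat[prefix] = value
    return flat


company = {
    "id": "1",
    "company_name": "company_1",
    "departments": [
        {"dep1": [{"id": 40, "name": "xyz"}, {"id": 41, "name": "xyr"}]}
    ],
}

flat = flatten(company)
for key, value in flat.items():
    print(key, "=", value)
```

The trade-off is that the relationship between siblings is only encoded in the key names, so queries such as "any department named xyz" need wildcard field patterns or a different layout.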

As for deep nesting, I found an example of a workaround. Suppose you had something like this:

 "docs": [
      {
        "name": "Product Name",
        "categories": [
          {
            "name": "Category 1",
            "priority": 8
          },
          {
            "name": "Category 2",
            "priority": 6
          }
          ...
        ]
      },

You would have to modify it like this to make it less deeply nested:

 "docs": [
    {
      "name": "Product Name",
      "categories": [
      {
        "priority_category": "8_Category 1"
      },
      {
        "priority_category": "6_Category 2"
      }
      ...
      ]
    },
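The transformation above could be done on the client before indexing. This is a minimal sketch, assuming each category carries `name` and `priority` keys as in the first example; the `merge_priority` helper is a hypothetical name introduced here.

```python
def merge_priority(doc):
    """Return a copy of doc whose categories hold one combined priority_category string."""
    merged = dict(doc)
    merged["categories"] = [
        {"priority_category": f"{cat['priority']}_{cat['name']}"}
        for cat in doc.get("categories", [])
    ]
    return merged


product = {
    "name": "Product Name",
    "categories": [
        {"name": "Category 1", "priority": 8},
        {"name": "Category 2", "priority": 6},
    ],
}

flattened = merge_priority(product)
print(flattened)
```

Because the priority is now a string prefix, it sorts lexicographically rather than numerically, so multi-digit priorities would need zero-padding.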

If you have already done something similar, check whether any errors are reported anywhere.

Altenrion
  • Thanks @Altenrion, I really appreciate your help. The documentation only covers one level of child documents, but I have multiple levels of child documents. Please refer to the pseudo-structure in my question. – user3584146 Feb 22 '16 at 04:48
  • I indexed the nested docs successfully in Solr: [{ "id": "1", "company_id": "1", "company_name": "company_1", "content_type": "parentDocument", "_childDocuments_": [{ "content_type": "dep", "_childDocuments_": [{ "id": 40, "dep_name": "xyz" }, { "id": 41, "dep_name": "xyr" }, { "id": 22, "emp_program": "zzz" }] }] }] – user3584146 Feb 22 '16 at 11:24
  • When I want to get the docs back (parent and child documents as a single document), I can fetch the parent and its child docs using the query q={!parent which=content_type:parentDocument}&fl=[child parentFilter=content_type:parentDocument]. But the nested structure of the document is lost: I get only a flat list of child documents (even the grandchildren of a parent are returned as its children). How can I get or rebuild the nested structure of the document in the results? – user3584146 Feb 22 '16 at 11:34
  • I have updated the answer instead of writing a new one. Let me know whether it is helpful. – Altenrion Feb 22 '16 at 20:20
  • Unfortunately, I cannot flatten the document. Maybe Solr is not suitable for my use case (deeply nested JSON documents). I found that Elasticsearch inherently supports deeply nested structures and fits my use case. Anyway, thanks for the help. – user3584146 Feb 23 '16 at 08:05
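For the flat child list described in the comments, one possible client-side workaround is to store a pointer to the parent on every child at index time and rebuild the tree after the query. This is only a sketch under stated assumptions: the `parent_id` field, the `children` key, and both helper functions are hypothetical names introduced here (not Solr built-ins), and every document, including intermediate ones, is assumed to have a unique `id`.

```python
def tag_parents(doc, parent_id=None):
    """Return a copy of doc in which every nested child records its parent's id."""
    tagged = {k: v for k, v in doc.items() if k != "_childDocuments_"}
    if parent_id is not None:
        tagged["parent_id"] = parent_id  # hypothetical field, must exist in the schema
    children = doc.get("_childDocuments_", [])
    if children:
        tagged["_childDocuments_"] = [tag_parents(c, doc["id"]) for c in children]
    return tagged


def rebuild_tree(parent, flat_children):
    """Re-nest a flat list of descendants under parent using their parent_id values."""
    nodes = {parent["id"]: dict(parent)}
    for child in flat_children:
        nodes[child["id"]] = dict(child)
    for child in flat_children:
        owner = nodes.get(child.get("parent_id"))
        if owner is not None:
            owner.setdefault("children", []).append(nodes[child["id"]])
    return nodes[parent["id"]]


company = {
    "id": "1",
    "content_type": "parentDocument",
    "_childDocuments_": [
        {
            "id": "dep1",
            "content_type": "dep",
            "_childDocuments_": [
                {"id": "40", "dep_name": "xyz"},
                {"id": "41", "dep_name": "xyr"},
            ],
        }
    ],
}

# Document to index: same nesting, plus a parent_id on every child.
tagged = tag_parents(company)

# Stand-in for the flat child list a query returns (not real Solr output).
flat_children = [
    {"id": "dep1", "content_type": "dep", "parent_id": "1"},
    {"id": "40", "dep_name": "xyz", "parent_id": "dep1"},
    {"id": "41", "dep_name": "xyr", "parent_id": "dep1"},
]
tree = rebuild_tree({"id": "1", "content_type": "parentDocument"}, flat_children)
print(tree)
```

The rebuild step does not depend on the order in which children are returned, only on each child carrying the id of its direct parent.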