0

I am trying to extract a large amount of Data from a json document. There are about 1500 nodes per json document. When I attempt to load the body node I get the 128KB limit error. I have found a way to load the node but I have to go all the way down to the array list. JsonExtractor("body.nprobe.items[*]"); The issue I am having is that I cannot access any other part of the json document, I need to get the meta data, like: Id, SerialNumber etc. Should the json file be changed in some way? The data I need is 3 levels down. The json has be obfuscated and shortened, the real file is about 33K lines of formatted json about 1500 items with n-20 fields in each.

 {
    "headers": {
        "pin": "12345",
        "Type": "URJ201W-GNZGK",
        "RTicks": "3345",
        "SD": "211",
        "Jov": "juju",
        "Market": "Dal",
        "Drst": "derre",
        "Model": "qw22",
        "DNum": "de34",
        "API": "34f",
        "Id": "821402444150002501"
    },
    "Id": "db5aacae3778",
    "ModelType": "URJ",
    "body": {
        "uHeader": {
            "ID": "821402444150002501",
            "SerialNo": "ee028861",
            "ServerName": "BRXTTY123"
        },
        "header": {
            "form": 4,
            "app": 0,
            "Flg": 1,
            "Version": 11056,
            "uploadID": 1,
            "uDate": "2016-04-14T18:29"
        },
        "nprobe": {
            "items": [{
                "purchaseDate": "2016-04-14T18:21:09",
                "storeLoc": {
                    "latitude": 135.052335,
                    "longitude": 77.167005
                },
                "sr": {
                    "ticks": 3822,
                    "SkuId": "24",
                    "Data": {
                        "X": "0.00068",
                        "Y": "0.07246",

                    }
                }
            },
            {
                "purchaseDate": "2016-04-14T18:21:09",
                "storeLoc": {
                    "latitude": 135.052335,
                    "longitude": 77.167005
                },
                "sr": {
                    "ticks": 3823,
                    "SkuId": "25",
                    "Data": {
                        "X": "0",
                        "Y": "2",

                    }
                }
            }]
        }
    },
    "Messages": []
}

Thanks.

Bharat
  • 2,441
  • 3
  • 24
  • 36

1 Answers1

0

You'll have to use CROSS APPLY: https://msdn.microsoft.com/en-us/library/azure/mt621307.aspx and EXPLODE: https://msdn.microsoft.com/en-us/library/azure/mt621306.aspx

See a worked out solution here:

https://github.com/algattik/USQLHackathon/blob/master/VS-Solution/USQLApplication/ciam-to-sqldw.usql https://github.com/algattik/USQLHackathon/blob/master/Samples/Customer/customers%202016-08-10.json

-- IMPROVED ANSWER: --

As this solution will not work for you since your inner JSON is too large to fit in a string, you can parse the input twice:

DECLARE @input string = @"/so.json";

REFERENCE ASSEMBLY [Newtonsoft.Json];
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats];

@meta = 
EXTRACT Id string
  FROM @input 
USING new Microsoft.Analytics.Samples.Formats.Json.JsonExtractor();

@items = 
EXTRACT purchaseDate string
  FROM @input 
USING new Microsoft.Analytics.Samples.Formats.Json.JsonExtractor("body.nprobe.items[*]");

@itemsFull =
    SELECT Id,
           purchaseDate
    FROM @meta
         CROSS JOIN @items;

OUTPUT @itemsFull
TO "/items_full.csv"
USING Outputters.Csv();
  • I have tried the Cross Apply Explode method, the issue is that the child-node I need to access is located at body->nprobe->items->sr. I cannot get the body node because it overflows the 128kb limit. So far the closest I can get is to Map directly to the `code`body.nprobe.items[*]`code`, this however doesn't allow me to access the other data in the json fragment. Thank you. – Carolus Holman Dec 02 '16 at 12:24