OPENJSON does not select all documents into the SQL table

Question

I have been trying to export the contents of a JSON file to an SQL Server table. However, despite the presence of multiple rows in the JSON, the output SQL table consists of only the first row from the JSON. The code I am using is as follows:

DROP TABLE IF EXISTS testingtable;

DECLARE @json VARCHAR(MAX) = '{ "_id" : "01001", "city" : "AGAWAM", "loc" : [ -72.622739, 42.070206 ], "pop" : 15338, "state" : "MA" }, 
                              { "_id" : "01002", "city" : "CUSHMAN", "loc" : [ -72.51564999999999, 42.377017 ], "pop" : 36963, "state" : "MA" }';
SELECT * INTO testingtable FROM OPENJSON(@json) WITH (_id int, city varchar(20), loc float(50), pop int, state varchar(5)
)

SELECT * FROM testingtable

And the output obtained is as follows: Click to view

That's a *single* object Why would `OPENJSON` return multipler rows? — Panagiotis Kanavos, Mar 13 '19 at 12:41
Single object or invalid JSON. You can't have two root objects in a JSON document — Panagiotis Kanavos, Mar 13 '19 at 12:42
I'd be glad if it could be explained in a little more detail as to how there is only one JSON object present, as I am new to JSON. Or maybe explain how I can declare multiple JSON objects in the above script. — Darsh Khetan, Mar 13 '19 at 12:48
That appears to be an array missing its delimiters. Wrap that string in `[` and `]` and you'll get two objects. — Jeroen Mostert, Mar 13 '19 at 12:49
@DarshKhetan it's *invalid* JSON and OPENJSON just read the first valid part, discarding the rest. You can't have multiple root objects. You **can't** have `{..},{..}` in a JSON document. Multiple items means an array, ie `[{..},{..}]` — Panagiotis Kanavos, Mar 13 '19 at 12:49
The solution stated by @Jeroen Mostert works perfectly! Thank you! — Darsh Khetan, Mar 13 '19 at 12:51

score 0 · Answer 1 · answered Mar 13 '19 at 12:53

The string isn't valid JSON. You can't have two root objects in a JSON document. Properly formatted, the JSON string looks like this

DECLARE @json VARCHAR(MAX) = '{ "_id" : "01001", "city" : "AGAWAM", "loc" : [ -72.622739, 42.070206 ], "pop" : 15338, "state" : "MA" },
                              { "_id" : "01002", "city" : "CUSHMAN", "loc" : [ -72.51564999999999, 42.377017 ], "pop" : 36963, "state" : "MA" }';

It should be

DECLARE @json VARCHAR(MAX) = '[{ "_id" : "01001", "city" : "AGAWAM", "loc" : [ -72.622739, 42.070206 ], "pop" : 15338, "state" : "MA" },
                               { "_id" : "01002", "city" : "CUSHMAN", "loc" : [ -72.51564999999999, 42.377017 ], "pop" : 36963, "state" : "MA" }
                              ]';

Looks like OPENJSON parsed the first object and stopped as soon as it encounterd the invalid text.

The quick & dirty way to fix this would be to add the missing square brackets :

SELECT * FROM OPENJSON('[' + @json + ']') WITH (_id int, city varchar(20), loc float(50), pop int, state varchar(5))

I suspect that string came from a log or event file that stores individual records in separate lines. That's not valid, nor is there any kind of standard or specification for this (name squatters notwithstanding) but a lot of high-traffic applications use this, eg in log files or event streaming.

The reason they do this is that there's no need to construct or read the entire array to get a record. It's easy to just append a new line for each record. Reading huge files and processing them in parallel is also easier - just read the text line by line and feed it to workers. Or split the file in N parts to the nearest newline and feed individual parts to different machines. That's how Map-Reduce works.

That's why adding the square brackets is a dirty solution - you have to read the entire text from a multi-MB or GB-sized file before you can parse it. That's not what OPENJSON was built to do.

The proper solution would be to read the file line-by-line using another tool, parse the records and insert the values into the target tables.

score 0 · Answer 2 · answered Mar 13 '19 at 13:33

If you know the JSON docs will not contain any internal newline characters, you can split the string with string_split. OPENJSON doesn't care about leading whitespace or trailing ,. That way you avoid adding the [ ] characters, and don't have to parse it as one big document.

EG:

DROP TABLE IF EXISTS testingtable;

DECLARE @jsonFragment VARCHAR(MAX) = '{ "_id" : "01001", "city" : "AGAWAM", "loc" : [ -72.622739, 42.070206 ], "pop" : 15338, "state" : "MA" }, 
                              { "_id" : "01002", "city" : "CUSHMAN", "loc" : [ -72.51564999999999, 42.377017 ], "pop" : 36963, "state" : "MA" }';

SELECT * 
INTO testingtable 
FROM  string_split(@jsonFragment,CHAR(10))  docs
cross apply 
(
  select *
  from openjson(docs.value)
  WITH (_id int, city varchar(20), loc float(50), pop int, state varchar(5))
) d

SELECT * FROM testingtable

This format is what you might call a "JSON Fragment", by analogy with XML. And this is another difference between XML and JSON in SQL Server. For XML the engine is happy to parse and store XML Fragments, but not with JSON.

score 0 · Accepted Answer · answered Mar 13 '19 at 13:40

A multiline JSON text is enclosed in a square bracket, for example;

[
    {first data set},
    {second data set}, .....
]

You can either add square brackets while passing data to this query or else you can add square brackets to your @json variable (eg. '['+ @json + ']')

DECLARE @json VARCHAR(MAX) = '{ "_id" : "01001", "city" : "AGAWAM", "loc" : [ -72.622739, 42.070206 ], "pop" : 15338, "state" : "MA" }, 
                              { "_id" : "01002", "city" : "CUSHMAN", "loc" : [ -72.51564999999999, 42.377017 ], "pop" : 36963, "state" : "MA" }';
SELECT * INTO testingtable FROM OPENJSON ('['+ @json + ']') WITH (_id int, city varchar(20), loc float(50), pop int, state varchar(5)
)

SELECT * FROM testingtable

OPENJSON does not select all documents into the SQL table

3 Answers3