0

I have been trying to export the contents of a JSON file to an SQL Server table. However, despite the presence of multiple rows in the JSON, the output SQL table consists of only the first row from the JSON. The code I am using is as follows:

DROP TABLE IF EXISTS testingtable;

DECLARE @json VARCHAR(MAX) = '{ "_id" : "01001", "city" : "AGAWAM", "loc" : [ -72.622739, 42.070206 ], "pop" : 15338, "state" : "MA" }, 
                              { "_id" : "01002", "city" : "CUSHMAN", "loc" : [ -72.51564999999999, 42.377017 ], "pop" : 36963, "state" : "MA" }';
SELECT * INTO testingtable FROM OPENJSON(@json) WITH (_id int, city varchar(20), loc float(50), pop int, state varchar(5)
)

SELECT * FROM testingtable

And the output obtained is as follows: Click to view

Panagiotis Kanavos
  • 120,703
  • 13
  • 188
  • 236

3 Answers3

0

The string isn't valid JSON. You can't have two root objects in a JSON document. Properly formatted, the JSON string looks like this

DECLARE @json VARCHAR(MAX) = '{ "_id" : "01001", "city" : "AGAWAM", "loc" : [ -72.622739, 42.070206 ], "pop" : 15338, "state" : "MA" },
                              { "_id" : "01002", "city" : "CUSHMAN", "loc" : [ -72.51564999999999, 42.377017 ], "pop" : 36963, "state" : "MA" }';

It should be

DECLARE @json VARCHAR(MAX) = '[{ "_id" : "01001", "city" : "AGAWAM", "loc" : [ -72.622739, 42.070206 ], "pop" : 15338, "state" : "MA" },
                               { "_id" : "01002", "city" : "CUSHMAN", "loc" : [ -72.51564999999999, 42.377017 ], "pop" : 36963, "state" : "MA" }
                              ]';

Looks like OPENJSON parsed the first object and stopped as soon as it encounterd the invalid text.

The quick & dirty way to fix this would be to add the missing square brackets :

SELECT * FROM OPENJSON('[' + @json + ']') WITH (_id int, city varchar(20), loc float(50), pop int, state varchar(5))

I suspect that string came from a log or event file that stores individual records in separate lines. That's not valid, nor is there any kind of standard or specification for this (name squatters notwithstanding) but a lot of high-traffic applications use this, eg in log files or event streaming.

The reason they do this is that there's no need to construct or read the entire array to get a record. It's easy to just append a new line for each record. Reading huge files and processing them in parallel is also easier - just read the text line by line and feed it to workers. Or split the file in N parts to the nearest newline and feed individual parts to different machines. That's how Map-Reduce works.

That's why adding the square brackets is a dirty solution - you have to read the entire text from a multi-MB or GB-sized file before you can parse it. That's not what OPENJSON was built to do.

The proper solution would be to read the file line-by-line using another tool, parse the records and insert the values into the target tables.

Panagiotis Kanavos
  • 120,703
  • 13
  • 188
  • 236
0

If you know the JSON docs will not contain any internal newline characters, you can split the string with string_split. OPENJSON doesn't care about leading whitespace or trailing ,. That way you avoid adding the [ ] characters, and don't have to parse it as one big document.

EG:

DROP TABLE IF EXISTS testingtable;

DECLARE @jsonFragment VARCHAR(MAX) = '{ "_id" : "01001", "city" : "AGAWAM", "loc" : [ -72.622739, 42.070206 ], "pop" : 15338, "state" : "MA" }, 
                              { "_id" : "01002", "city" : "CUSHMAN", "loc" : [ -72.51564999999999, 42.377017 ], "pop" : 36963, "state" : "MA" }';

SELECT * 
INTO testingtable 
FROM  string_split(@jsonFragment,CHAR(10))  docs
cross apply 
(
  select *
  from openjson(docs.value)
  WITH (_id int, city varchar(20), loc float(50), pop int, state varchar(5))
) d

SELECT * FROM testingtable

This format is what you might call a "JSON Fragment", by analogy with XML. And this is another difference between XML and JSON in SQL Server. For XML the engine is happy to parse and store XML Fragments, but not with JSON.

David Browne - Microsoft
  • 80,331
  • 6
  • 39
  • 67
0

A multiline JSON text is enclosed in a square bracket, for example;

[
    {first data set},
    {second data set}, .....
]

You can either add square brackets while passing data to this query or else you can add square brackets to your @json variable (eg. '['+ @json + ']')

DECLARE @json VARCHAR(MAX) = '{ "_id" : "01001", "city" : "AGAWAM", "loc" : [ -72.622739, 42.070206 ], "pop" : 15338, "state" : "MA" }, 
                              { "_id" : "01002", "city" : "CUSHMAN", "loc" : [ -72.51564999999999, 42.377017 ], "pop" : 36963, "state" : "MA" }';
SELECT * INTO testingtable FROM OPENJSON ('['+ @json + ']') WITH (_id int, city varchar(20), loc float(50), pop int, state varchar(5)
)

SELECT * FROM testingtable
Jiten Parmar
  • 26
  • 1
  • 7