0

I have MS SQL Server (2014) and what I need is to pull a bunch of JSON metadata off a web API. Unfortunately the API is limited to returning only 30 records (and I have 7400 records to pull). So I was going to just iterate over a bunch of URLs and import the data but I got stuck.

I have imported JSON data into SQL Server in the past adapting this method (Passing filename as variable in OPENROWSET(BULK filename)) and using parseJSON

This method works fine if I have the data local to my machine (e.g. on my C drive) but does not work when I use the URL - I get the following error:

(1 row(s) affected)
Msg 4861, Level 16, State 1, Line 1
Cannot bulk load because the file "https://api.stackexchange.com/2.2/search?order=desc&sort=activity&intitle=perl&site=stackoverflow" could not be opened. Operating system error code 123(The filename, directory name, or volume label syntax is incorrect.).
Msg 8115, Level 16, State 2, Line 45
Arithmetic overflow error converting expression to data type nvarchar.

How can I fix my code to work with a file on the internet?

--    SET @JSON_FILE= 'C:\Temp\SO.json'  --  If I save the data on my harddrive the following code works.
    SET @JSON_FILE= 'https://api.stackexchange.com/2.2/search?order=desc&sort=activity&intitle=perl&site=stackoverflow'

    /* http://www.sqlservercentral.com/Forums/Topic982066-338-1.aspx
    Read the JSON file into the varchar variable.  This is done via a bulk insert using the OPENROWSET() function.   Because this stored proc is to be re-used with different JSON files, ideally you want to pass the JSON file path as a variable.  However, because the OPENROWSET() function won't accept variables as a parameter, the command needs to be built as a string and then passed to the sp_executesql system stored procedure.  The results are then passed back by an output variable.
    */
    -- Setup varchar variable to be used to hold contents of JSON file.
    DECLARE @txt varchar(max) 

    -- The command line
    DECLARE @COMMAND NVARCHAR(MAX) 
    -- The definition of the parameters used within the command line
    DECLARE @PARAM_DEF NVARCHAR(500)
    -- The parameter used to pass the file name into the command
    DECLARE @FILEVAR NVARCHAR(MAX)
    -- The output variable that holds the results of the OPENROWSET()
    DECLARE @JSON_OUT VARCHAR(MAX) 

    SET @FILEVAR = @JSON_FILE
    SET @PARAM_DEF = N'@JSON_FILE NVARCHAR(MAX), @JSON_OUT VARCHAR(MAX) OUTPUT'
    SET @COMMAND = N'SELECT @JSON_OUT = BulkColumn FROM OPENROWSET(BULK ''' +  @JSON_FILE + ''', SINGLE_BLOB) ROW_SET';

    EXEC sp_executesql @COMMAND, @PARAM_DEF, @JSON_FILE = @FILEVAR, @JSON_OUT = @txt OUTPUT;
    --parseJSON function from https://www.simple-talk.com/sql/t-sql-programming/consuming-json-strings-in-sql-server/
    SELECT * FROM parseJSON(@txt)
user918967
  • 2,049
  • 4
  • 28
  • 43

2 Answers2

0

If you upgrade to SQL Server 2016, you can use openjson, but in 2014, I think you're stuck with either hoping for some magic with openrowset (Though I've yet to see an example of it working remotely) or, other options include SQL-CLR assemblies, which are summed up nicely by marc_s here.

-- Yes, the issue manifests specifically with read-only URL access.

Community
  • 1
  • 1
SLWS
  • 506
  • 4
  • 8
  • However, it is not the JSON part that fails, its the web part. The code works on a local JSON file but not on one hosted on the web... – user918967 Mar 28 '17 at 21:42
  • Those suggestions by @marc_s require VisualStudio and/or .NET - is there any other tool? – user918967 Mar 28 '17 at 22:04
0

It turns out it was infinitely easier just to use Microsoft Powershell to download all the JSONs and then just use my code to import the data in.

I set up a simple spreadsheet with

  1. Column A going from 1 to the end,
  2. Column B adding 30 to the sequence
  3. Column C to a formula that makes the URL string =CONCAT("$resource",A2,"='myURL/retrieve.php?query=&coll=cxr&m=",B2,"&n=",B2+30,"'")
  4. Column D then queries the site with the URL string =CONCAT(" Invoke-RestMethod -Method Get -Uri $resource",A2," -Outfile ",A2,".json")

Then I just copy/paste all of Column C (to seed $resourceN) then copy/paste Column D to download the JSON to be injested into SQL.

user918967
  • 2,049
  • 4
  • 28
  • 43