
I am trying to migrate data from a Postgres database to SQL Server. For this, I am exporting the Postgres data to JSON, and the idea is to insert the data into my tables (I already have the schema from the Postgres database imported into SQL Server).

I am playing around with the OPENROWSET function, but there seem to be problems with the encoding. I noticed in the documentation that there is a CODEPAGE option that should solve this, but it has no effect in the way I am specifying it.

This is my example query:

    SELECT users.*
    FROM OPENROWSET(BULK 'path_to_some.json', SINGLE_CLOB, CODEPAGE = '65001') AS j
    CROSS APPLY OPENJSON(BulkColumn)
    WITH (
        id                int,
        first_name        nvarchar(max),
        last_name         nvarchar(max),
        email             nvarchar(max),
        accepts_marketing bit,
        created_at        datetime,
        updated_at        datetime
    ) AS users
    ORDER BY id

Does anyone have an idea what I am missing here?

iliyan tanev
  • Why don't you use a Postgres driver to talk from one database to the other? (A sketch of that approach follows these comments.) Why involve JSON at all? It doesn't offer any benefit over CSV anyway. BTW `CODEPAGE` is for ASCII text; 65001 is Unicode, UTF-8 specifically – Panagiotis Kanavos May 12 '17 at 14:22
  • You could use SSIS or the `Import Data` wizard to load data from Postgres directly into SQL Server. It's faster than setting up a linked server – Panagiotis Kanavos May 12 '17 at 14:25
  • @PanagiotisKanavos I actually explored that option, but it seems too complicated to me. I want to create a one-time export from Postgres to MSSQL. I need something simple here, as the task seems like something that should be straightforward. – iliyan tanev May 12 '17 at 14:39
  • The Import Data wizard is also something I have tried. It is a very nice tool, very simple, but the problem with it is the column delimiter. I have a relatively large amount of data to migrate, and the data contains all kinds of special symbols. I have tried every special symbol I could think of, and there are always some text records containing that symbol, so the import operation cannot perform a clean import. From what I saw, it is not possible to specify a delimiter more complex than one character. – iliyan tanev May 12 '17 at 14:44
  • What column delimiter? You don't need any delimiter if you read data from Postgres and transfer it to SQL Server. As for "very simple", on the contrary, it's an entire ETL framework; the wizard generates only a direct mapping using that framework. If you have a lot of data, a direct transfer with SSIS will be a LOT faster than first exporting to JSON and then importing the text file – Panagiotis Kanavos May 12 '17 at 14:51
  • Ok, thanks @PanagiotisKanavos. I installed the Postgres native driver and got the provider in the SQL Server Import and Export Wizard :) The delimiter was from the Flat File Source, which I was using until now. Unfortunately, there are still problems. I opened a [**new question**](http://stackoverflow.com/questions/43975322/sql-server-import-and-export-wizard-and-postgresql) about this. – iliyan tanev May 15 '17 at 09:06
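For reference, here is a minimal sketch of the "talk to Postgres directly through a driver" route from the first comment, suitable for a one-time load because it avoids any intermediate file and therefore any delimiter or encoding issues. It is only an illustration under assumptions: the ODBC DSN name `PostgreSQL35W`, the credentials, and the target table `dbo.users` are made up, and it assumes enabling ad-hoc distributed queries is acceptable on the instance.

    -- One-time setup: allow ad-hoc OPENROWSET queries against external providers
    EXEC sp_configure 'show advanced options', 1;
    RECONFIGURE;
    EXEC sp_configure 'Ad Hoc Distributed Queries', 1;
    RECONFIGURE;

    -- Pull rows straight from Postgres through the ODBC driver and insert them
    -- into the table already created from the imported schema
    -- (DSN name, user, and password are illustrative assumptions).
    INSERT INTO dbo.users (id, first_name, last_name, email, accepts_marketing, created_at, updated_at)
    SELECT id, first_name, last_name, email, accepts_marketing, created_at, updated_at
    FROM OPENROWSET('MSDASQL',
                    'DSN=PostgreSQL35W;UID=postgres;PWD=secret',
                    'SELECT id, first_name, last_name, email, accepts_marketing, created_at, updated_at FROM users');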

1 Answer


I think CODEPAGE (a bulk option) does not work with SINGLE_CLOB, but only with the FORMATFILE option. From the OPENROWSET spec:

    { FORMATFILE = 'format_file_path' [ <bulk_options> ]
    | SINGLE_BLOB | SINGLE_CLOB | SINGLE_NCLOB }

Should read:

    { (FORMATFILE = 'format_file_path' optionally <bulk_options>) or SINGLE_BLOB or SINGLE_CLOB or SINGLE_NCLOB }
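In other words, per the grammar the bulk options such as CODEPAGE belong to the FORMATFILE branch, not to the SINGLE_* modes. One possible way around this (not from the spec, just a sketch under an assumption) is to re-save the exported JSON file as UTF-16 ("Unicode") and read it with SINGLE_NCLOB, which returns BulkColumn as nvarchar(max), so no CODEPAGE option is needed:

    -- Sketch of a workaround: assumes the exported JSON file was re-saved as UTF-16,
    -- which is what SINGLE_NCLOB expects; BulkColumn then comes back as nvarchar(max).
    SELECT users.*
    FROM OPENROWSET(BULK 'path_to_some.json', SINGLE_NCLOB) AS j
    CROSS APPLY OPENJSON(BulkColumn)
    WITH (
        id                int,
        first_name        nvarchar(max),
        last_name         nvarchar(max),
        email             nvarchar(max),
        accepts_marketing bit,
        created_at        datetime,
        updated_at        datetime
    ) AS users
    ORDER BY id;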

tsql