BCP -w copying 0 rows without error message

Question

I am trying to ingest the following .txt file with content

2|Jürgen
3|Jürgen
4|Jürgen

delimited by |. The 2nd column is accented letters, as I want to test sql table preserving accented letters.

The sql table to be ingested are defined as

CREATE TABLE [stage].[bcp_test](
[ID] [int] NOT NULL,
[TERM] [nvarchar](100) COLLATE SQL_Latin1_General_CP1_CS_AS NOT NULL)

If I do a simple insertion, insert into stage.bcp_test values(1,'Jürgen') I can see the accent Jürgen preserved in the table.

But when I tried to insert the 3 rows in the sample file through bcp

bcp stage.bcp_test in "test.txt" -S dbservername -c -t "|" -T -F 1 -b 20000 -m 1 -e error.txt

Jürgen becomes J├╝rgen.

Based on some suggestions, I change -c to -t. So the new bcp command becomes

bcp stage.bcp_test in "test.txt" -S dbservername -w -t "|" -T -F 1 -b 20000 -m 1 -e error.txt

However, 0 rows are copied without any error message.

Is there anything that I missed? I tried to search different places but cannot find a solution. Is it okay for me to bcp a .txt file using bcp -w?

=========================== Regarding Pa1's suggestion,what I got in my sql server is as follow:

Slightly different from his post, maybe because of sql version?

My Sql Server Version:

Microsoft SQL Server 2016 (SP2) (X64)

===========

Add on: Pa1's solution works. Depends on how the file is encoded.

https://stackoverflow.com/questions/9220450/using-bcp-to-import-data-to-sql-server-while-preserving-accents-asian-character Here links to a very old post suggesting using bcp -w to fix the accented letters issue. But there is not much follow-up suggesting whether the issues are resolved. — YanTianqi, Mar 01 '19 at 09:13
Pa1's solution works, without use -w. But I am still a little confused a bit this -c and -w thing. Welcome answers. — YanTianqi, Mar 01 '19 at 10:45

Pa1 · Answer 1 · 2019-03-01T11:27:03.200

3

-w expects tab separated data based on bcp documentation. So replace pipes with tab and try?
To ensure that text is inserted properly, try specifying the code page in your bcp command

-C { ACP | OEM | RAW | code_page }

If you are using the -C code_page option then you need to match the code_page to the encoding of the file you are trying to ingest. The complete list is available here

Here is the trial on my system:

bcp stage.bcp_test in "test.txt" -S dbservername -c -C ACP -t "|" -T -F 1 -b 20000 -m 1 -e error.txt

bcp stage.bcp_test in "test.txt" -S dbservername -c -C 65001 -t "|" -T -F 1 -b 20000 -m 1 -e error.txt

bcp stage.bcp_test in "test.txt" -S dbservername -w -T -F 1 -b 20000 -m 1 -e error.txt

edited Mar 01 '19 at 11:27

answered Mar 01 '19 at 09:20

Pa1

Hi, thanks for suggesting. using -c does work. However, the premise is to preserve the accented letters, while using -c simply ignores that requirement. So the discussion is still to use -w, the unicode bulk insertion. While keeping -w, adding a -C ACP did not solve my problems. – YanTianqi Mar 01 '19 at 10:25
Following on, oh, I see your point of using -c -C ACP. The result is better, but still slightly off, i.e. not preserving the exact accents. I cannot attach picture here.Will pose in my main question body with the results. – YanTianqi Mar 01 '19 at 10:31
Hi @YanTianqi. The sample file I used was using ANSI encoding. If the encoding is UTF-8 then you are right it won't work. For UTF-8 you need to specify `-C 65001`. Could this be the fix? – Pa1 Mar 01 '19 at 10:38
1

Hi Pa1, great, by using -c 65001, I manage to preserve the accented letters. Thanks. – YanTianqi Mar 01 '19 at 10:44

1 Answers1