
I'm trying to load a set of public flat files (using COPY INTO from Python) that are apparently saved in ANSI format. Some of the files load with no issue, but in at least one case the COPY INTO statement hangs (no error is returned, and nothing is logged, as far as I can tell). I isolated the problem to particular rows containing a non-standard character, e.g., the ¢ character in the second row below:

O}10}49771}2020}02}202002}4977110}141077}71052900}R }}N}0}0}0}0}0}0}0}0}0}0}0}0}0}0}0}0}08}CWI STATE A}CENTENNIAL RESOURCE PROD, LLC}PHANTOM (WOLFCAMP)


O}10}50367}2020}01}202001}5036710}027348}73933500}R }}N}0}0}0}0}0}0}0}0}0}0}0}0}0}0}0}0}08}A¢C 34-197}APC WATER HOLDINGS 1, LLC}QUITO, WEST (DELAWARE)

Re-saving these rows into a file with UTF-8 encoding solves the issue, but I thought I'd pass this along in case someone wants to take a look at the back-end to handle these types of characters and/or return some kind of error.
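For reference, the re-save workaround described above can be done programmatically. This is a minimal sketch, assuming the "ANSI" encoding is Windows-1252 (cp1252), which is what ANSI usually means on US-locale Windows systems; the file names are illustrative, not from the original post:

```python
import tempfile
from pathlib import Path

# Simulate a small "ANSI" (Windows-1252) flat file containing the ¢ character.
# The paths and the cp1252 assumption are illustrative only.
tmpdir = tempfile.mkdtemp()
src = Path(tmpdir, "ansi.txt")
dst = Path(tmpdir, "utf8.txt")
src.write_text("A¢C 34-197", encoding="cp1252")

# Re-save as UTF-8: read with the source encoding, write with UTF-8.
with open(src, encoding="cp1252") as fin, open(dst, "w", encoding="utf-8") as fout:
    for line in fin:
        fout.write(line)
```

In cp1252 the ¢ character is the single byte 0xA2, which is not valid on its own in UTF-8; after transcoding it becomes the two-byte sequence 0xC2 0xA2, which a UTF-8 loader can consume cleanly.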

C. Peck
tylerc

1 Answer


Why save into a file at all?

If it is possible, just handle the re-encoding internally in Python:

resultstr = bytestr.decode("cp1252")   # decode the "ANSI" bytes (likely Windows-1252)
utf8bytes = resultstr.encode("utf-8")  # re-encode as UTF-8
  • Hello, I am trying to load a source file - it is 9 GB in size (approximately 70 million rows). The COPY INTO statement is expecting a file - hence the need to adjust the encoding of the file. – tylerc May 18 '21 at 20:27
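Since re-saving a 9 GB file in an editor isn't practical, the re-encoding can be done in Python in fixed-size chunks so the whole file never sits in memory, producing a UTF-8 file that COPY INTO can then read. A sketch, again assuming the source encoding really is Windows-1252 (cp1252); the function name and chunk size are illustrative:

```python
def transcode_to_utf8(src_path, dst_path, src_enc="cp1252", chunk_chars=1 << 20):
    """Re-encode a large flat file to UTF-8 without loading it all at once.

    src_enc="cp1252" is an assumption: "ANSI" usually means Windows-1252
    on US-locale Windows systems.
    """
    # newline="" preserves the file's line endings exactly, byte-for-byte.
    with open(src_path, "r", encoding=src_enc, newline="") as fin, \
         open(dst_path, "w", encoding="utf-8", newline="") as fout:
        while True:
            chunk = fin.read(chunk_chars)  # read ~1M characters at a time
            if not chunk:
                break
            fout.write(chunk)
```

Reading in text mode means `read(n)` counts characters, not bytes, so a multi-byte character can never be split across chunk boundaries.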