I'm trying to load a set of public flat files (using COPY INTO
from Python) - that apparently are saved in ANSI
format. Some of the files load with no issue, but there is at least one case where the COPY INTO
statement hangs (no error is returned, & nothing is logged, as far as I can tell). I isolated the error to a particular row with a non-standard character e.g., the ¢
character in the 2nd row -
O}10}49771}2020}02}202002}4977110}141077}71052900}R }}N}0}0}0}0}0}0}0}0}0}0}0}0}0}0}0}0}08}CWI STATE A}CENTENNIAL RESOURCE PROD, LLC}PHANTOM (WOLFCAMP)
O}10}50367}2020}01}202001}5036710}027348}73933500}R }}N}0}0}0}0}0}0}0}0}0}0}0}0}0}0}0}0}08}A¢C 34-197}APC WATER HOLDINGS 1, LLC}QUITO, WEST (DELAWARE)
Re-saving these rows into a file with UTF-8 encoding solves the issue, but I thought I'd pass this along in case someone wants to take a look at the back-end to handle these types of characters and/or return some kind of error.