0

I have a pipe that copies into a table with transformations, such us

COPY INTO t1 (
    col1,
    col2)
FROM (
    SELECT 
        t.$1 AS col1,
        t.$2 AS col2
    FROM '@STAGE/file1/' t
    ) file_format = (format_name = file_format_name);

Being file1 a CSV file in a S3 bucket.

The thing is that maybe a bad character could come in any of the rows of the file such us a varchar in an integer defined column.

I added the option ON_ERROR = CONTINUE in the copy into statement to continue with the loading process but i need to obtain those rows that failed to load to do something with them.

I tried to use the "SELECT * FROM TABLE(VALIDATE(t1, JOB_ID => '_last'));" after the copy was made, but doesn't work. It's showing the following error message: "Invalid argument [We couldn't find a copy for this table which occurred during this session ] for table function. Table function argument is required to be a constant."

Any thoughts on who to obtain the rows that failed?

I was thinking to run a stored procedure after this copy into finishes comparing through the primary key to obtain thous rows that wasn't loaded but i would like a better solution.

Thanks

  • Hi - what do you mean when you say the validate option doesn’t work? Does it error/give an error message, does it not return any data, does it not return the data you want? – NickW May 30 '23 at 22:41
  • Hello @NickW when i execute the "SELECT * FROM TABLE(VALIDATE ..." gives me this error: "Invalid argument [We couldn't find a copy for this table which occurred during this session ] for table function. Table function argument is required to be a constant." . I'm assuming that the error occurs because i'm doing transformations in the copy and i think based on the documentation that de validate only works when u don't use transformations – Robertino Bonora May 30 '23 at 22:50
  • Yes - if you are doing transformations then it won't work, though the sample SQL you gave in your question doesn't include transformations. In your VALIDATE SQL did you use the query_id, rather than _last, just to be sure it is picking up the correct COPY statement? If this doesn't work then I'm not sure what else to try - this may be a question for Snowflake Support – NickW May 31 '23 at 08:58
  • Hello @NickW, yeah in this example i didn't use any transformations but is usual for us to use most of the time trims and replaces. I tried to use the '_last' option of the VALIDATE function, u think i should use the specific ID? – Robertino Bonora May 31 '23 at 12:42
  • @NickW I think currently isn't a native solution in snowflake to handle this use case. So i was thinking to do a "common copy into" without transformations and use the on_error = continue, after this trigger somehow a task that does the VALIDATE and register the errors in another table and in parallel trigger another task that takes the "good rows" with a merge into the final table – Robertino Bonora May 31 '23 at 12:44
  • It probably makes no difference, but I'd use the query id, rather than relying on "_last", just to be 100% sure I'm looking at the right query/copy statement – NickW May 31 '23 at 12:45
  • Raise this with Snowflake Support. They're normally pretty good at coming back with suggestions or acknowledging that it is a gap in their functionality – NickW May 31 '23 at 12:46
  • Yeah i post in community.snowflake.com pretty much the same question and someone replies this: "Validation fails if SELECT statements are used to transform data during a COPY INTO operation. This is right now a limitation and so the hard work around is to get all the columns to snowflake via COPY into
    FROM to capture all the error records." Pretty much the same thing i was thinking
    – Robertino Bonora May 31 '23 at 12:59

0 Answers0