
I have a table with about 30k rows, and each row is wrapped in {} (a JSON object).

In the end, I would like to get it like this:

[
{Objekt1},
{Objekt2}
]

This solution worked well while we didn't have that many rows, but now we are hitting the size limit.

COPY INTO   FROM (
  SELECT array_agg(*) FROM (
    SELECT OBJECT_CONSTRUCT( ......
      OBJECT_CONSTRUCT(.....) )
    FROM (
      SELECT * FROM (
        SELECT
          REPLACE(parse_json(OFFER):"spec":"im:offerID", '"')::varchar AS ID,
          ...,
          ...
      )
    )
  )
)
FILE_FORMAT = (TYPE = JSON COMPRESSION = NONE)
CREDENTIALS = (aws_key_id='' aws_secret_key='')
OVERWRITE = TRUE
SINGLE = TRUE
HEADER = FALSE
MAX_FILE_SIZE = 267772160

We deliver this to an external agency, and that format is the only one they can read.

Is there another solution, or a way to work around this problem?

Thanks

Instamo
  • Hi, please explain what you are actually doing. Are you writing the result to a file, to a table, or something else? Please provide a minimal reproducible example of your process – NickW Jan 26 '21 at 13:44
  • I have added the code we use to the question above. We are compressing the data and pushing it with a COPY command to an S3 bucket; the agency then takes the data from there and uses it – Instamo Jan 26 '21 at 14:18

1 Answer


As you've discovered, there is a hard limit of 16 MB on array_agg (and in a lot of other places in Snowflake, e.g. it's the maximum size of a VARIANT column).

If it is acceptable to create multiple files, then you can probably achieve this in a Stored Proc: find some combination of column values that guarantees the data in each partition results in an array_agg of less than 16 MB, then loop through those partitions, running a COPY INTO for each one and outputting to a different file each time (a sketch of this idea follows below).
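
For illustration, here is a minimal sketch of that partitioned approach, not your actual code: the table name my_offers, the stage @my_stage, the bucket count of 4 and the single 'ID' key are all assumptions, and a Stored Proc would wrap the COPY INTO in a loop over the bucket numbers and file names.

-- Assign every row to one of 4 roughly equal buckets (assumed bucket count);
-- choose a count that keeps each bucket's aggregated array under 16 MB.
CREATE OR REPLACE TEMPORARY TABLE offers_bucketed AS
SELECT o.*, NTILE(4) OVER (ORDER BY ID) AS bucket_no
FROM my_offers o;

-- One COPY INTO per bucket, each writing its own file
-- (a Stored Proc would substitute bucket_no and the file name on each loop iteration).
COPY INTO @my_stage/offers_1.json
FROM (
  SELECT ARRAY_AGG(OBJECT_CONSTRUCT('ID', ID /* , ... the other keys ... */))
  FROM offers_bucketed
  WHERE bucket_no = 1
)
FILE_FORMAT = (TYPE = JSON COMPRESSION = NONE)
OVERWRITE = TRUE
SINGLE = TRUE
HEADER = FALSE;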

If you have to produce a single file, then I can't think of a way of achieving this in Snowflake (though someone else may be able to). If you can process the file once it is written to S3, then it would be straightforward to copy the data to a file as JSON and then edit it to add the '[' and ']' around it.

NickW
  • Yes, sure, I saw that there is a size limit, and probably the only way is to split the data. I was hoping for some other solution, as my knowledge of this topic is not the greatest. Thanks NickW for helping me out – Instamo Jan 27 '21 at 07:20