0

I have a copy statement that unload from Snowflake to S3.

COPY INTO @MY_STAGE/path/my_filename FROM (
SELECT OBJECT_CONSTRUCT(*) from my_table) 
FILE_FORMAT =(TYPE = JSON COMPRESSION = NONE)
OVERWRITE=TRUE;

Current result in myfilename.json:

  {
   "a": 123,
   "b": "def"
  }

  {
   "a": 456,
   "b": "ghi"
  }

Using OBJECT_CONSTRUCT() will produce in ndjson format. However, I want to save the file in array of json such as:

[ 
  {
   "a": 123,
   "b": "def"
  },
  {
   "a": 456,
   "b": "ghi"
  }
]

The data needs to be stored in one single file. I know I need to do something with SELECT OBJECT_CONSTRUCT(*) from my_table, however I'm not sure how to transform this.

user6308605
  • 693
  • 8
  • 26
  • 1
    It is probably easier to just use something like Python to convert an ndjson to json, see this [example](https://stackoverflow.com/questions/67736164/convert-ndjson-to-json-in-python). – Sergiu Nov 14 '22 at 13:51

1 Answers1

0

You can use ARRAY_AGG to achieve this:

COPY INTO @MY_STAGE/path/my_filename FROM (
SELECT ARRAY_AGG(OBJECT_CONSTRUCT(*)) from my_table) 
FILE_FORMAT =(TYPE = JSON COMPRESSION = NONE)
OVERWRITE=TRUE;

This function converts your record set into an array. Since Snowflake arrays are basically JSON arrays, the array returned by ARRAY_AGG can be written directly into the JSON file.

Unfortunately, ARRAY_AGG has a limitation of being able to hold only 16 MB of data. If your dataset is larger than this, it's unfortunately not possible to write all of it into a file in your desired format.

user3738870
  • 1,415
  • 2
  • 12
  • 24