I'm pushing data from an RDS MySQL database to S3. My S3 target endpoint has these settings:

{
  "CsvRowDelimiter": "\\n",
  "CsvDelimiter": ",",
  "AddColumnName": true,
  "CompressionType": "NONE",
  "EnableStatistics": true,
  "DatePartitionEnabled": true,
  "DatePartitionSequence": "YYYYMMDD",
  "DatePartitionDelimiter": "SLASH",
  "EncryptionMode": "SSE_KMS",
  "ServerSideEncryptionKmsKeyId": "XXXXX",
  "TimestampColumnName": "TIMESTAMP",
  "IncludeOpForFullLoad": true,
  "CdcInsertsOnly": false
}

In the end, my folder structure looks like this:

-/
--/my_table_name
---LOAD00000001.csv
---/2023
----/01
-----/09
------20230109-165206632.csv

I have two questions about these settings that I couldn't find answers to:

  • Can we set up a prefix on the date folders? I want to follow the Hive naming convention (e.g. /year=2023/month=01/day=09/).
  • Is it possible to have the initial load CSV file placed inside the date folder structure instead of at the root of the table folder?
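
To make the target layout concrete, here is a minimal Python sketch of the key mapping I'm after. It is only an illustration of the desired result, not something the endpoint does: the function name `to_hive_key` is hypothetical, and it assumes the date folders are exactly YYYY/MM/DD, as in the listing above.

```python
import re

# Matches keys such as "my_table_name/2023/01/09/20230109-165206632.csv".
# Assumption: the date folders are exactly <YYYY>/<MM>/<DD> before the file name.
_DATE_KEY = re.compile(
    r"^(?P<prefix>.*?)/(?P<year>\d{4})/(?P<month>\d{2})/(?P<day>\d{2})/(?P<name>[^/]+)$"
)


def to_hive_key(key: str) -> str:
    """Rewrite a date-partitioned key into Hive-style year=/month=/day= folders.

    Keys that do not contain the YYYY/MM/DD folders (for example the
    full-load file at the table root) are returned unchanged.
    """
    m = _DATE_KEY.match(key)
    if m is None:
        return key
    return (
        f"{m['prefix']}/year={m['year']}/month={m['month']}"
        f"/day={m['day']}/{m['name']}"
    )


if __name__ == "__main__":
    print(to_hive_key("my_table_name/2023/01/09/20230109-165206632.csv"))
    print(to_hive_key("my_table_name/LOAD00000001.csv"))
```

With this mapping the CDC file would end up under my_table_name/year=2023/month=01/day=09/, while LOAD00000001.csv stays where it is, which is exactly the part the second question is about.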
alxsbn