
I'm using Terraform to configure a DMS migration task that migrates (full load + CDC) data from a MySQL instance to an S3 bucket.

The problem is that the configuration does not seem to take effect: no partition folders are created, and all the migrated files land in the same directory within the bucket.

According to the documentation, the S3 endpoint setting DatePartitionEnabled, introduced in version 3.4.2, is supported for both CDC and full load + CDC.

My Terraform configuration:

resource "aws_dms_endpoint" "example" {
  endpoint_id   = "example"
  endpoint_type = "target"
  engine_name   = "s3"

  s3_settings {
    bucket_name             = "example"
    bucket_folder           = "example-folder"
    compression_type        = "GZIP"
    data_format             = "parquet"
    parquet_version         = "parquet-2-0"
    service_access_role_arn = var.service_access_role_arn
    date_partition_enabled  = true
  }

  tags = {
    Name = "example"
  }
}

But in the S3 bucket I get no folders, only sequentially numbered files, as if this option weren't set:

LOAD00000001.parquet
LOAD00000002.parquet
...

I'm using Terraform 1.0.7, AWS provider 3.66.0, and a DMS replication instance on version 3.4.6.

Does anyone know what could be causing this?

  • I just realized, after the full load finished, that partitioning only works for ongoing replication (CDC). DMS first creates all the Parquet files for the full load in the root directory, then creates the date partitions for CDC. The problem is that I need all the data partitioned, so I had to move the full-load files manually into the first partition created by DMS. I'm going to use a Glue crawler + Athena to generate the missing partitions. – captainslocum Jan 05 '22 at 14:14
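
The Glue crawler mentioned in the comment above could be declared alongside the endpoint roughly as follows. This is only a sketch, not the commenter's actual setup: the database name, crawler name, and the `glue_crawler_role_arn` variable are hypothetical, and the S3 path simply reuses the `bucket_name` and `bucket_folder` values from the question. It assumes the IAM role already has read access to the bucket.

```hcl
# Hypothetical variable: ARN of an IAM role that Glue can assume and that
# is assumed to have read access to the target bucket.
variable "glue_crawler_role_arn" {
  type = string
}

# Hypothetical catalog database that Athena would query.
resource "aws_glue_catalog_database" "dms_example" {
  name = "dms_example"
}

# Crawler that scans the DMS output prefix and registers tables and
# partitions in the Glue Data Catalog.
resource "aws_glue_crawler" "dms_example" {
  name          = "dms-example-crawler"
  database_name = aws_glue_catalog_database.dms_example.name
  role          = var.glue_crawler_role_arn

  s3_target {
    # Built from bucket_name and bucket_folder in the endpoint above.
    path = "s3://example/example-folder"
  }
}
```

The crawler only picks up folders as partitions once the full-load files have been moved under a partition folder, as described in the comment; files left directly in the table's root directory are not assigned to any partition.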
