I know I can easily use the AWS Glue console to do this, but I am just trying to do it through the AWS CLI instead. I have a my_table_name table with an id column that is currently type string. However, I would like to change the type to bigint.
My current attempt is the code below. First, I get the tableinput from get-table and change the 3rd column (id) to bigint. Then, I update the Glue table with the modified tableinput as follows:
#!/bin/bash
# Pull the current table definition and switch the 3rd column (id) to bigint
tableinput=$( aws glue get-table \
    --database-name $databasename \
    --name $tablename \
    | json Table \
    | json -e "this.StorageDescriptor.Columns[2].Type='bigint'" )
# Write the modified table definition back to the Glue Data Catalog
aws glue update-table \
    --database-name $databasename \
    --name $tablename \
    --table-input $tableinput
For reference, echo $tableinput gets me this JSON:
{ "Name": "my_table_name", "DatabaseName": "my_database_name", "CreateTime": "my_date", "UpdateTime": "my_date", "Retention": 0, "StorageDescriptor": { "Columns": [ { "Name": "kind", "Type": "string" }, { "Name": "etag", "Type": "string" }, { "Name": "id", "Type": "bigint" }, { "Name": "snippet_channelid", "Type": "string" }, { "Name": "snippet_title", "Type": "string" }, { "Name": "snippet_assignable", "Type": "boolean" } ], "Location": "my_location", "InputFormat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat", "OutputFormat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat", "Compressed": true, "NumberOfBuckets": -1, "SerdeInfo": { "SerializationLibrary": "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe", "Parameters": { "serialization.format": "1" } }, "BucketColumns": [], "SortColumns": [], "Parameters": { "CrawlerSchemaDeserializerVersion": "1.0", "classification": "parquet", "compressionType": "snappy", "typeOfData": "file" }, "StoredAsSubDirectories": false }, "PartitionKeys": [], "TableType": "EXTERNAL_TABLE", "Parameters": { "classification": "parquet", "compressionType": "snappy", "projection.enabled": "false", "typeOfData": "file" }, "CreatedBy": "my_role", "IsRegisteredWithLakeFormation": false, "CatalogId": "my_catalog_id", "VersionId": "0" }
However, I am getting this error:
Unknown options: --name, "Name":, "my_table_name",, "DatabaseName":, "my_database_name",, "CreateTime":, "my_date",, "UpdateTime":, "my_date",, "Retention":, 0,, "StorageDescriptor":, {, "Columns":, [, {, "Name":, "kind",, "Type":, "string", },, {, "Name":, "etag",, "Type":, "string", },, {, "Name":, "id",, "Type":, "bigint", },, {, "Name":, "snippet_channelid",, "Type":, "string", },, {, "Name":, "snippet_title",, "Type":, "string", },, {, "Name":, "snippet_assignable",, "Type":, "boolean", }, ],, "Location":, "s3://my_location",, "InputFormat":, "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat",, "OutputFormat":, "org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat",, "Compressed":, true,, "NumberOfBuckets":, -1,, "SerdeInfo":, {, "SerializationLibrary":, "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe",, "Parameters":, {, "serialization.format":, "1", }, },, "BucketColumns":, [],, "SortColumns":, [],, "Parameters":, {, "CrawlerSchemaDeserializerVersion":, "1.0",, "classification":, "parquet",, "compressionType":, "snappy",, "typeOfData":, "file", },, "StoredAsSubDirectories":, false, },, "PartitionKeys":, [],, "TableType":, "EXTERNAL_TABLE",, "Parameters":, {, "classification":, "parquet",, "compressionType":, "snappy",, "projection.enabled":, "false",, "typeOfData":, "file", },, "CreatedBy":, "my_role",, "IsRegisteredWithLakeFormation":, false,, "CatalogId":, "my_catalog_id",, "VersionId":, "0", }, my_table_name
Removing the --name option from update-table instead gets me: aws.exe: error: the following arguments are required: --name
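My suspicion is that, because $tableinput is unquoted, the shell splits the JSON into separate words, which would explain why every token shows up in the Unknown options list. Below is a minimal sketch of what I think the call should look like, assuming that is the problem (unverified), either quoting the variable or handing the CLI a file via file://. The table JSON may also still need trimming to only the fields TableInput accepts, but I have not confirmed that:
# Sketch only: quote the variable so the shell passes the JSON as a single argument
aws glue update-table \
    --database-name "$databasename" \
    --name "$tablename" \
    --table-input "$tableinput"

# Alternatively (also unverified): write the JSON to a file and let the CLI load it
echo "$tableinput" > tableinput.json
aws glue update-table \
    --database-name "$databasename" \
    --name "$tablename" \
    --table-input file://tableinput.json
# Note: the JSON above may still contain fields (DatabaseName, CreateTime, CreatedBy, ...)
# that TableInput does not accept, which might need to be stripped first.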