I would like to convert a monthly feed from CSV to pipe-delimited format using an AWS Glue Crawler. Is it possible to create a classifier (using Grok or something similar) that converts a CSV file to pipe-delimited, so that a monthly scheduled crawler can then create the Glue catalog?
1 Answer
A Glue Crawler only populates the AWS Glue Data Catalog with tables, and a classifier only tells the crawler how to read the data; neither rewrites the file. So you cannot convert a file from CSV to pipe-delimited format with a crawler alone. The right steps are:
- Create two tables in the Glue Data Catalog: one for the file in CSV format and one for the pipe-delimited format. To catalog the source table, you can use a Glue Crawler.
- Create a Glue job to transfer the data between these tables.
This article does not address your exact problem, but it shows what these steps look like in practice:
https://aws.amazon.com/blogs/big-data/build-a-data-lake-foundation-with-aws-glue-and-amazon-s3/
There are also tutorials in the Glue console (at the bottom of the left menu).
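
To make the conversion step concrete, here is a minimal sketch of the transformation itself, written with Python's standard `csv` module rather than the Glue API. The function name `csv_to_pipe` is hypothetical, and it assumes the feed is small enough to process as a single string (for example in a Glue Python shell job). A Spark-based Glue job would do the equivalent by writing the data out in CSV format with `|` as the separator; check the Glue documentation for the exact options.

```python
import csv
import io


def csv_to_pipe(csv_text: str) -> str:
    """Re-encode comma-separated text as pipe-delimited text.

    Hypothetical helper for illustration; not part of the Glue API.
    """
    # Parse the input with the default comma delimiter and quoting rules.
    reader = csv.reader(io.StringIO(csv_text))

    # Write the same rows back out with "|" as the delimiter.
    # Fields that themselves contain "|" are quoted by the writer.
    out = io.StringIO()
    writer = csv.writer(out, delimiter="|", lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()


if __name__ == "__main__":
    sample = 'id,name\n1,"Smith, John"\n'
    print(csv_to_pipe(sample), end="")
```

Using a real CSV parser instead of a plain string replace matters here: commas inside quoted fields are kept as data, and only the field separators change.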

jbgorski