I have a text file that looks like this:
HDR¶20200101
BDY¶1¶Jimmy
BDY¶1¶Something
TRL¶123
I would like to parse it into a Glue DynamicFrame, filtering out the header (HDR) and trailer (TRL) rows so that only the body (BDY) rows remain. I would also like to name the columns of the body rows ID and Name. I tried the code below, but it does not work:
dyf_test = glueContext.create_dynamic_frame.from_options(
    format_options={"withHeader": False, "separator": "¶"},
    connection_type="s3",
    format="csv",
    connection_options={
        "paths": [
            "s3://Files/test.gz"
        ],
        "recurse": True,
    },
)
dyf_test = Filter.apply(
    frame=dyf_test,
    f=lambda row: (
        bool(re.match("HDR", row[0]))
        and bool(re.match("TRL", row[0]))
    )
)
Error: com.amazonaws.services.glue.util.FatalException: Unable to parse file: test.gz
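To make the expected result concrete, here is a minimal plain-Python sketch (no Glue involved) of the output I am after for the sample file above. The names SAMPLE and parse_body_rows are hypothetical and only used for illustration. It assumes every BDY line has exactly three ¶-separated fields.

```python
# Plain-Python illustration of the desired result; this is not Glue code.
SEPARATOR = "¶"

# The sample file contents from the top of the question.
SAMPLE = "HDR¶20200101\nBDY¶1¶Jimmy\nBDY¶1¶Something\nTRL¶123"


def parse_body_rows(text, separator=SEPARATOR):
    """Keep only BDY lines and label their two data fields as ID and Name."""
    rows = []
    for line in text.splitlines():
        fields = line.split(separator)
        # Skip the header (HDR), the trailer (TRL), and anything else.
        if fields[0] != "BDY":
            continue
        # Assumption: a BDY line is always "BDY¶<id>¶<name>".
        rows.append({"ID": fields[1], "Name": fields[2]})
    return rows


records = parse_body_rows(SAMPLE)
for record in records:
    print(record)
```

In other words, the header and trailer rows are dropped, and the DynamicFrame should end up with one record per BDY line, with the fields ID and Name.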