
In AWS S3 I have JSON documents that I read in with AWS Glue's create_dynamic_frame.from_options("s3" ...). DynamicFrame.printSchema() shows me this, which matches the schema of the documents:

root
|-- updatedAt: string
|-- json: struct
|    |-- rowId: int

Then I call unnest() or relationalize() (I have tried both) on the DynamicFrame to get a new dyF, and .printSchema() shows me this, which looks correctly unnested:

root
|-- updatedAt: string
|-- json.rowId: int

The problem is that I can't seem to select the unnested fields.
dyF.select_fields(["updatedAt"]) works and gives me a dyF with the "updatedAt" field,
but dyF.select_fields(["json.rowId"]) gives me an empty dyF.

What am I doing wrong?

user12166

1 Answer


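The solution is to put backticks around the column name. After unnesting, "json.rowId" is a single top-level column whose name contains a dot. Without backticks, the dot appears to be read as a path separator into a nested "json" struct, which no longer exists, so nothing is selected.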

Example: .select_fields(["journalId", "`json.rowId`"])
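To make the effect of the backticks concrete, here is a small pure-Python toy model of dotted-path lookup. It is only a sketch of the general idea and not Glue's actual implementation; split_path, select_field, and the sample record values are hypothetical names and data made up for illustration.

```python
import re


def split_path(path):
    """Split a field path on dots, keeping backtick-quoted parts as one name."""
    # Each token is either a backtick-quoted name or a run of non-dot characters.
    tokens = re.findall(r"`([^`]*)`|([^.`]+)", path)
    return [quoted or bare for quoted, bare in tokens]


def select_field(record, path):
    """Walk a nested dict along the path; return None if any step is missing."""
    node = record
    for name in split_path(path):
        if not isinstance(node, dict) or name not in node:
            return None
        node = node[name]
    return node


# A record shaped like the unnested schema: the dot is part of the column name.
# The values are invented sample data.
flat = {"updatedAt": "2020-01-01", "json.rowId": 7}

print(select_field(flat, "updatedAt"))     # found: plain top-level name
print(select_field(flat, "json.rowId"))    # not found: looks for "json", then "rowId"
print(select_field(flat, "`json.rowId`"))  # found: one literal column name
```

In this model the unquoted path is split into two steps ("json", then "rowId"), which only matches the original nested schema, while the backtick-quoted path is treated as one literal name, which matches the unnested schema.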

user12166