I'm new to AWS Glue and PySpark. Below is a code sample
glue_context.create_dynamic_frame.from_catalog(
database = "my_S3_data_set",
table_name = "catalog_data_table",
push_down_predicate = my_partition_predicate)
taken from the guide Managing Partitions for ETL Output in AWS Glue.
Suppose the SQL query I want to use to filter the data is as follows:
select * from catalog_data_table
where timestamp >= '2018-1-1'
How do I perform this pre-filtering in AWS Glue using a pushdown predicate?
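To make the question concrete, here is a sketch of my current understanding, which may be wrong: the predicate appears to filter on partition columns rather than ordinary data columns like `timestamp`. The `year` partition key below is purely an assumption on my part, since I don't know how the table is actually partitioned.

```python
# Sketch (assumption): the pushdown predicate is a SQL expression over
# *partition* columns, not over regular data columns such as `timestamp`.
# Assuming a hypothetical table partitioned by a string key `year`,
# the filter
#     where timestamp >= '2018-1-1'
# would correspond roughly to a predicate string like:
my_partition_predicate = "year >= '2018'"

# which would then be passed to from_catalog (runs only inside a Glue job,
# so it is commented out here):
# datasource = glue_context.create_dynamic_frame.from_catalog(
#     database="my_S3_data_set",
#     table_name="catalog_data_table",
#     push_down_predicate=my_partition_predicate)

print(my_partition_predicate)
```

Is this the right way to translate the WHERE clause, or does the predicate need to be written differently?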