I have created an AWS Glue job that reads data from Oracle using the code below.
WhereQuery = "select * from test where dated >= CURRENT_DATE - 4"
connection_oracle11_options = {
    "url": URL,
    "dbtable": tableName,
    "user": USERNAME,
    "password": PASSWORD,
    "query": WhereQuery,
    "hashfield": "testID",
    "hashpartitions": '100'
}
transaction_item_df = glueContext.create_dynamic_frame.from_options(
    connection_type="oracle",
    connection_options=connection_oracle11_options
)
With the "query" option the job takes 8 hours; without it, the job takes 45 minutes. Is the way I am using the "query" option correct?
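To make the comparison explicit, here is a minimal sketch of the two option sets, assuming the only difference between the 8-hour run and the 45-minute run is whether the "query" key is present. The helper `build_options`, the JDBC URL, and the credentials are hypothetical placeholders and not part of the actual job.

```python
def build_options(url, table, user, password, query=None,
                  hashfield="testID", hashpartitions="100"):
    """Build the connection options dict; add "query" only when one is given."""
    options = {
        "url": url,
        "dbtable": table,
        "user": user,
        "password": password,
        "hashfield": hashfield,
        "hashpartitions": hashpartitions,
    }
    if query is not None:
        options["query"] = query
    return options


# Placeholder connection details (assumptions, not the real values).
URL = "jdbc:oracle:thin:@//host:1521/service"
where_query = "select * from test where dated >= CURRENT_DATE - 4"

# Slow run (8 hours): options include "query".
with_query = build_options(URL, "test", "user", "secret", query=where_query)

# Fast run (45 minutes): same options without "query".
without_query = build_options(URL, "test", "user", "secret")

# The only key that differs between the two runs.
differing_keys = set(with_query) ^ set(without_query)
```

Either dict would then be passed as `connection_options` to `glueContext.create_dynamic_frame.from_options` exactly as in the code above.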
My data size is 318049228, and I am using worker type G.1X with 100 workers and "hashpartitions": '100'; with this setup the job takes 45 minutes. What is the relationship between hashpartitions and the number of workers?
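For scale, a rough back-of-the-envelope sketch of what each partition would have to read. It assumes that 318049228 is a row count and that hashing on testID spreads rows evenly across partitions; neither assumption is confirmed.

```python
import math

total_rows = 318_049_228   # assumption: the "data size" above is a row count
hash_partitions = 100      # value of "hashpartitions" in the options
workers = 100              # number of G.1X workers configured for the job

# Rows each partition would read if the hash on testID were perfectly even.
rows_per_partition = math.ceil(total_rows / hash_partitions)

# Partitions per configured worker in this setup.
partitions_per_worker = hash_partitions / workers

print(rows_per_partition)
print(partitions_per_worker)
```

With these numbers each partition would cover roughly 3.2 million rows, and there is one partition per configured worker.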