How to retrieve data from different AWS regions for my glue job?

Question

I have Glue databases (db1 and db2) and tables (tbl1 and tbl2) in different AWS regions (eu-west-1 and us-east-1, respectively).

My Glue job runs in eu-west-1 and needs data from both tables: just a simple select * from db1.tbl1 and select * from db2.tbl2. The data is stored in S3 as Parquet, and I am able to query it via Athena too.

How can I retrieve that data via Spark SQL in a Glue job? Can you help me out with an example? If not Spark SQL, can you please suggest a different approach?

Thanks very much!

Sri Vidhya Pavani · Answer 1 · 2022-04-22T18:49:34.317


Create a crawler in the EU region that reads from the S3 bucket in the US region. This creates a table in the EU database whose S3 location points to the US bucket. The data stays in the US region, but your Glue job in the EU region can read it as required.
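As a rough sketch of this approach, assuming the Glue job's IAM role can read the US bucket and the job uses the Glue Data Catalog as its Hive metastore (the --enable-glue-datacatalog job parameter), something like the following might work. The bucket path, crawler name, role ARN, and the name of the table the crawler produces are all hypothetical placeholders, not values from the question.

```python
# Sketch only: every bucket, role, crawler, and table name here is a
# hypothetical placeholder and must be replaced with real values.

EU_REGION = "eu-west-1"
US_DATA_PATH = "s3://my-us-east-1-bucket/db2/tbl2/"  # hypothetical bucket in us-east-1


def build_crawler_request(name, role_arn, database, s3_path):
    """Return the keyword arguments for the Glue CreateCrawler API call."""
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }


def create_and_start_crawler(request, region=EU_REGION):
    """One-off setup: create the crawler in the EU region and start it."""
    import boto3  # imported lazily so the pure helpers work without boto3

    glue = boto3.client("glue", region_name=region)
    glue.create_crawler(**request)
    glue.start_crawler(Name=request["Name"])


def select_all(database, table):
    """Build the simple SELECT * statement from the question."""
    return f"SELECT * FROM {database}.{table}"


def load_tables(spark, us_table_name):
    """Inside the Glue job: query both tables through the EU catalog.

    `spark` is the job's SparkSession. `us_table_name` is whatever table
    name the crawler created in db1 for the US data (an assumption here).
    """
    eu_df = spark.sql(select_all("db1", "tbl1"))
    us_df = spark.sql(select_all("db1", us_table_name))
    return eu_df, us_df


# Hypothetical request for a crawler that registers the US data in db1.
request = build_crawler_request(
    name="us-data-crawler",
    role_arn="arn:aws:iam::123456789012:role/MyGlueCrawlerRole",
    database="db1",
    s3_path=US_DATA_PATH,
)
```

Calling create_and_start_crawler(request) once, outside the job, would register the table; after the crawler finishes, load_tables(spark, "tbl2") inside the job would return both DataFrames. Note that the reads from the US bucket still cross regions, so transfer cost and latency apply.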

edited Apr 22 '22 at 18:49

answered Apr 22 '22 at 18:47



