
I have two Glue databases (db1 and db2) with tables (tbl1 and tbl2) in different AWS regions (eu-west-1 and us-east-1, respectively).

My Glue job runs in eu-west-1 and needs data from both tables: just a simple select * from db1.tbl1 and select * from db2.tbl2. The data is stored in S3 as Parquet, and I am able to query it via Athena too.

How can I retrieve that data via Spark SQL in a Glue job? Can you help me out with an example? If Spark SQL is not an option, can you please suggest a different approach?

Thanks very much!

1 Answer


Create a crawler in the EU region (eu-west-1) that reads the data from the S3 bucket in the US region. The crawler creates a table in a Glue database in the EU region whose S3 location points to the US bucket. The data stays in the US region, but your Glue job in the EU region can then query it through the EU catalog as required.
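As a rough sketch of this approach (not a tested deployment), the crawler could be created with boto3 and the resulting table queried from the job with Spark SQL. The crawler name, role ARN, bucket path, and the EU database name `db2_eu` are all hypothetical placeholders. The Spark SQL part also assumes the job uses the Glue Data Catalog as its metastore (for example via the `--enable-glue-datacatalog` job parameter) and that the crawler names the table `tbl2`.

```python
def build_crawler_request(name, role_arn, database, s3_path):
    """Build keyword arguments for the Glue create_crawler call.

    All values are supplied by the caller; nothing here is specific
    to the original poster's account.
    """
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,  # Glue database in the EU region
        "Targets": {"S3Targets": [{"Path": s3_path}]},  # bucket in the US region
    }


def register_us_table(glue_client, request):
    """Create the crawler in the client's region and start one run."""
    glue_client.create_crawler(**request)
    glue_client.start_crawler(Name=request["Name"])


def read_both_tables(spark, eu_database="db2_eu"):
    """Query both tables from the EU catalog once the crawler has finished.

    Assumes the SparkSession resolves tables through the Glue Data Catalog
    and that the crawled table is named tbl2 (hypothetical).
    """
    df1 = spark.sql("SELECT * FROM db1.tbl1")
    df2 = spark.sql(f"SELECT * FROM {eu_database}.tbl2")
    return df1, df2


# Hypothetical usage, run once outside the job to set up the table:
#
#   import boto3
#   glue = boto3.client("glue", region_name="eu-west-1")
#   request = build_crawler_request(
#       name="us-tbl2-crawler",
#       role_arn="arn:aws:iam::123456789012:role/GlueCrawlerRole",
#       database="db2_eu",
#       s3_path="s3://example-us-bucket/tbl2/",
#   )
#   register_us_table(glue, request)
```

The crawler's IAM role and the job's IAM role both need read access to the US bucket. Every read also crosses regions, so expect extra latency and data transfer charges compared with reading from a bucket in eu-west-1.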
