
I'm trying to make a complete copy of a Snowflake DB into a PostgreSQL DB (every table/view, every row). I don't know the best way to go about accomplishing this. I tried a package called pipelinewise, but I could not get the access needed to convert a Snowflake view to a PostgreSQL table (it needs a unique id). Long story short, it just would not work for me.

I've now moved on to using the snowflake-sqlalchemy package. So, I'm wondering what the best way is to make a complete copy of the entire DB. Do I need to define a model for each table? This is a big DB, so I'm hoping not. I'm new to SQLAlchemy in general, so I don't know exactly where to start. My guess is to use reflection, but when I try the example below I'm not getting any results.

from snowflake.sqlalchemy import URL
from sqlalchemy import create_engine, MetaData

engine = create_engine(URL(
    account="xxxx",
    user="xxxx",
    password="xxxx",
    database="xxxxx",
    schema="xxxxx",
    warehouse="xxxx"
))
engine.connect()
metadata = MetaData(bind=engine)

for t in metadata.sorted_tables:
    print(t.name)

I'm pretty sure the issue is not the engine, because I ran the validate.py example and it returns the version as expected. Any advice on why my code above is not working, or on a better way to accomplish my goal of making a complete copy of the DB, would be greatly appreciated.

PPS
  • I am doing the correct imports for `URL` and `create_engine`: `from snowflake.sqlalchemy import URL` and `from sqlalchemy import create_engine` – PPS Jun 14 '21 at 15:42
  • Is this a 1-time thing or will you need to keep Postgres in sync with Snowflake? – Mike Walton Jun 14 '21 at 15:58
  • Hey @MikeWalton, sync is not necessary, I don't mind running a script every day if needed. In fact, the Snowflake DB itself gets updated once a day via a data dump, so there are no real-time updates to consider. – PPS Jun 14 '21 at 16:05
  • I ask, because you might find something like FiveTran or other replication services to be a lot easier than scripting everything out. – Mike Walton Jun 14 '21 at 16:17
  • I understand, and that is why I attempted to use the pipelinewise package (a Singer wrapper) at first. But it did not work for me. However, I don't have the funds for paid services like FiveTran at the moment. – PPS Jun 14 '21 at 16:21
  • Ok - to your actual question, have you looked into the inspector API collection in the Snowflake SQLAlchemy library? https://docs.snowflake.com/en/user-guide/sqlalchemy.html#cache-column-metadata – Mike Walton Jun 14 '21 at 16:31
  • That looks like what I need, ty for the tip. However, I'm still kind of stuck at square one with connecting... does the code above look right to you? I'm not sure where the issue lies, and I'm not getting any errors or anything. No worries if you are not interested in the code part of the issue; your input was very helpful and I will probably use the inspector API once I get up and going. – PPS Jun 14 '21 at 17:03
  • Without seeing the values for your entries or an actual error message, it's tough to diagnose. What is happening? I usually just execute a `SELECT current_version();` command after I create the engine connection to test it out. You'll get some message that way. – Mike Walton Jun 14 '21 at 17:26
  • I did that as well, and I get the version number as expected. So I think the engine credentials are correct. However, when I try to run the above code I don't get any errors (or results). This is the example ( https://docs.sqlalchemy.org/en/14/core/metadata.html#accessing-tables-and-columns ) I'm trying to run (see the reflection sketch just after these comments). How would you go about doing this? – PPS Jun 14 '21 at 17:36
  • Snowflake's SQLAlchemy might not have that function, perhaps. Not sure. Try the inspector functions, instead, and see if that is better. – Mike Walton Jun 14 '21 at 17:46
  • I tried running this `# added import for inspect # using the same engine inspector = inspect(engine) schema = inspector.default_schema_names for table_name in inspector.get_table_names(schema): print(table_name)` I may be missing something that you think should be a given. I'm pretty new to python, and this is my first time using Snowflake or SqlAlchemy, so take that into consideration. Are there any complete examples somewhere? – PPS Jun 14 '21 at 18:09
  • Sorry for the mess, can't figure out how to put block code in comments. – PPS Jun 14 '21 at 18:17
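
For what it's worth, the likely reason the `sorted_tables` loop in the question prints nothing is that the `MetaData` object is never populated: binding an engine does not load any table definitions, so `sorted_tables` stays empty until you reflect. A minimal sketch of the reflection approach, reusing the same placeholder credentials as above:

from snowflake.sqlalchemy import URL
from sqlalchemy import create_engine, MetaData

engine = create_engine(URL(
    account="xxxx",
    user="xxxx",
    password="xxxx",
    database="xxxxx",
    schema="xxxxx",
    warehouse="xxxx"
))

# Reflection has to be requested explicitly; without this call
# metadata.sorted_tables is empty and the loop prints nothing.
metadata = MetaData()
metadata.reflect(bind=engine)

for t in metadata.sorted_tables:
    print(t.name)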

1 Answer


Try this. I got it working on mine, but I have a few functions that I use for my SQLAlchemy engine, so it might not work as is:

from snowflake.sqlalchemy import URL
import sqlalchemy as sa

engine = sa.create_engine(URL(
    account="xxxx",
    user="xxxx",
    password="xxxx",
    database="xxxxx",
    schema="xxxxx",
    warehouse="xxxx"
))

inspector = sa.inspect(engine)
schema = inspector.default_schema_name
for table_name in inspector.get_table_names(schema):
    print(table_name)
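
Building on the inspector loop above, here is a rough sketch of the actual copy into PostgreSQL, which avoids writing a model per table by streaming each table across with pandas. The Postgres connection string, the chunk size, and the `to_sql` settings are illustrative assumptions, not something taken from the question:

import pandas as pd
import sqlalchemy as sa
from snowflake.sqlalchemy import URL

snowflake_engine = sa.create_engine(URL(
    account="xxxx",
    user="xxxx",
    password="xxxx",
    database="xxxxx",
    schema="xxxxx",
    warehouse="xxxx"
))

# Target Postgres engine -- placeholder DSN, adjust to your environment.
postgres_engine = sa.create_engine("postgresql+psycopg2://user:pass@localhost:5432/targetdb")

inspector = sa.inspect(snowflake_engine)
schema = inspector.default_schema_name

# Views can be included the same way via inspector.get_view_names(schema).
for table_name in inspector.get_table_names(schema):
    # Read from Snowflake and write to Postgres in chunks so large tables
    # never have to fit in memory all at once. pandas creates the target
    # table on the first chunk; the inferred column types may need tuning.
    for chunk in pd.read_sql_table(table_name, snowflake_engine, schema=schema, chunksize=100_000):
        chunk.to_sql(table_name, postgres_engine, if_exists="append", index=False)

Since the Snowflake DB is only refreshed once a day, this can simply run on a schedule; for a clean daily rebuild, drop or truncate the Postgres tables first.
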
Mike Walton