3

I have a geometry column of a geodataframe populated with polygons and I need to upload these to Snowflake.

I have been exporting the geometry column of the geodataframe to file and have tried both CSV and GeoJSON formats, but so far I either always get an error the staging table always winds up empty.

Here's my code:

design_gdf['geometry'].to_csv('polygons.csv', index=False, header=False, sep='|', compression=None)

import sqlalchemy
from sqlalchemy import create_engine
from snowflake.sqlalchemy import URL

engine = create_engine(
    URL(<Snowflake Credentials Here>)
)

with engine.connect() as con:
    con.execute("PUT file://<path to polygons.csv> @~ AUTO_COMPRESS=FALSE")

Then on Snowflake I run

create or replace table DB.SCHEMA.DESIGN_POLYGONS_STAGING (geometry GEOGRAPHY);

copy into DB.SCHEMA."DESIGN_POLYGONS_STAGING"
from @~/polygons.csv
FILE_FORMAT = (TYPE = CSV FIELD_DELIMITER = '|' SKIP_HEADER = 1 compression = None encoding = 'iso-8859-1');

Generates the following error:

"Number of columns in file (6) does not match that of the corresponding table (1), use file format option error_on_column_count_mismatch=false to ignore this error File '@~/polygons.csv.gz', line 3, character 1 Row 1 starts at line 2, column "DESIGN_POLYGONS_STAGING"[6] If you would like to continue loading when an error is encountered, use other values such as 'SKIP_FILE' or 'CONTINUE' for the ON_ERROR option. For more information on loading options, please run 'info loading_data' in a SQL client."

Can anyone identify what I'm doing wrong?

Keith
  • 1,777
  • 3
  • 15
  • 20
  • 1
    can you post a line from the file? Also if you try manually insert a row with the same shaped data as in that line, is it accepted by the insert, or does it need transformations? – Simeon Pilgrim Feb 17 '22 at 01:18
  • That's a great idea. I'll try the experiment and get back to you. – Keith Feb 17 '22 at 04:46
  • Whoops, my Polygon specification had syntax errors. This worked: create or replace table DB.SCHEMA."DESIGN_POLYGONS_STAGING" ( geometry GEOGRAPHY ) AS SELECT TO_GEOGRAPHY('POLYGON((0.0 0.0, 1.0 0.0, 1.0 1.0, 0.0 1.0, 0.0 0.0))') – Keith Feb 17 '22 at 05:53
  • In a previous experiment I had uploaded a polygons.csv.gz. Snowflake can be very dumb and was automatically inserting this file treating it as uncompressed and then complaining that it wasn't UTF-8. Deleting this file from the stage solved several problems resulting from this behavior. – Keith Feb 17 '22 at 06:33
  • So all solved? Or are then current road blocks? – Simeon Pilgrim Feb 17 '22 at 06:52
  • @SimeonPilgrim Thank you so much! Your comment got me on the right path. – Keith Feb 17 '22 at 21:42

1 Answers1

2

Inspired by @Simeon_Pilgrim's comment I went back to Snowflake's documentation. There I found an example of converting a string literal to a GEOGRAPHY.

https://docs.snowflake.com/en/sql-reference/functions/to_geography.html#examples

select to_geography('POINT(-122.35 37.55)');

My polygons looked like strings describing Polygons more than actual GEOGRAPHYs so I decided I needed to be treating them as strings and then calling TO_GEOGRAPHY() on them.

I quickly discovered that they needed to be explicitly enclosed in single quotes and copied into a VARCHAR column in the staging table. This was accomplished by modifying the CSV export code:

import csv
design_gdf['geometry'].to_csv(<path to polygons.csv>, 
                          index=False, header=False, sep='|', compression=None, quoting=csv.QUOTE_ALL, quotechar="'")

The staging table now looks like:

create or replace table DB.SCHEMA."DESIGN_POLYGONS_STAGING" (geometry VARCHAR);

I ran into further problems copying into the staging table related to the presence of a polygons.csv.gz file I must have uploaded in a previous experiment. I deleted this file using:

remove @~/polygons.csv.gz

Finally, converting the staging table to GEOGRAPHY

create or replace table DB.SCHEMA."DESIGN_GEOGRAPHY_STAGING" (geometry GEOGRAPHY);

insert into DB.SCHEMA."DESIGN_GEOGRAPHY"
select to_geography(geometry)
from DB.SCHEMA."DESIGN_POLYGONS_STAGING"

and I wound up with a DESIGN_GEOGRAPHY table with a single column of GEOGRAPHYs in it. Success!!!

Keith
  • 1,777
  • 3
  • 15
  • 20