How to create a table from a CSV?

Question

SnappyData v.0.5

I want to do something similar to loading parquet files as found in the QuickStart load scripts.

CREATE TABLE STAGING_AIRLINEREF USING parquet OPTIONS(path '../../quickstart/data/airportcodeParquetData');

But I have CSV files instead of parquet files. I do not see either "USING parquet" or a CSV equivalent in any RowStore documentation, so I guessed at the following, which fails.

CREATE TABLE STAGING_ROADS USING csv OPTIONS(path 'roads.csv');

How can I create a table directly from a CSV file where the header row is the column names and the rest are loaded as data rows?

EDIT

OK. Following the Spark-CSV syntax, I load this file but get no rows and no table.

"roadId","name"
"1","Road 1"
"2","Road 2"
"3","Road 3"
"4","Road 4"
"5","Road 5"
"6","Road 6"
"7","Road 7"
"8","Road 8"
"9","Road 9"
"10","Road 10"


snappy> run '/home/ubuntu/data/example/load_roads.sql';
snappy> SET SCHEMA A;
0 rows inserted/updated/deleted
snappy> DROP TABLE IF EXISTS STAGING_ROADS;
0 rows inserted/updated/deleted
snappy> CREATE TABLE STAGING_ROADS
(road_id string, name string)
USING com.databricks.spark.csv
OPTIONS(path '/home/ubuntu/data/example/roads.csv', header 'true');
0 rows inserted/updated/deleted

suranjan · Accepted Answer · 2016-07-25T21:18:35.447

4

You can do it the following way:

CREATE TABLE STAGING_ROADS USING com.databricks.spark.csv OPTIONS(path 'roads.csv', header "true");

edited Jul 25 '16 at 21:18

answered Jul 25 '16 at 20:54

suranjan


I tried this and added the EDIT above. It runs at least, but it does not create a table, nor does it load any rows from my 10-row CSV file. – Jason Jul 25 '16 at 22:00
OK. I stand corrected. Trusting the reply "0 rows inserted/updated/deleted" was wrong. This output from the snappy-shell is deceiving: when I actually do a select * from staging_roads, I get ten rows back. It appears I was hoodwinked by the response message! – Jason Jul 25 '16 at 23:03

score 1 · Answer 2 · answered Jul 26 '16 at 00:36

1

Yes, unfortunately the shell displays the returned set from JDBC, which can be misleading for DDL commands. Notice that it says the same even for 'SET SCHEMA'. I added a new JIRA to track this issue: https://jira.snappydata.io/browse/SNAP-940.
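
Until that is fixed, a quick way to confirm the load is to query the table rather than trust the status line. The following is only a sketch, reusing the path and table name from the question; the COUNT(*) query is an addition for illustration, and the column names are assumed to come from the CSV header row because header is set to 'true' and no column list is given.

```sql
-- Define the table over the CSV file (same form as the accepted answer,
-- with the path taken from the question).
CREATE TABLE STAGING_ROADS
USING com.databricks.spark.csv
OPTIONS(path '/home/ubuntu/data/example/roads.csv', header 'true');

-- The shell may still report "0 rows inserted/updated/deleted" for the DDL above.
-- Query the table to see what was actually loaded.
SELECT COUNT(*) FROM STAGING_ROADS;
SELECT * FROM STAGING_ROADS;
```

For the 10-row file in the question, the comment thread above reports that the select returns ten rows.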

answered Jul 26 '16 at 00:36

jagsr

