I am using amazon redshift with R. I got the connection working. I wanted to create some database tables and insert some data into them through R. However, I ran into some issues while doing so
# Set the Java Environment
# For 64 bit connections
Sys.setenv(JAVA_HOME='C:\\Program Files\\Java\\jre1.8.0_121')
# For 32 bit connections
#Sys.setenv(JAVA_HOME='C:\\Program Files (x86)\\Java\\jre1.8.0_121')
# Load libraries if they haven't already
library(rJava)
library(RJDBC)
# Get the driver
jdbcDriver <- JDBC(driverClass="com.amazon.redshift.jdbc41.Driver",
classPath="drivers/RedshiftJDBC41-1.2.1.1001.jar")
# Connect to the database
jdbcConnection <- dbConnect(jdbcDriver,
"server",
"username", "password")
The database is connected, without issues. Now I wanted to create a table and insert data into it. I was using sqlCreateTable
and sqlAppendTable
to create and insert data into the table.
# Create a table iris, and insert the data into it
sql_required <- sqlCreateTable(jdbcConnection, "iris_table1", iris)
tbl_create <- dbGetQuery(jdbcConnection, sql_required)
However, the above command results in the following error.
# Error in .verify.JDBC.result(r, "Unable to retrieve JDBC result set for ", :
# Unable to retrieve JDBC result set for CREATE TABLE "iris_table1" (
# "Sepal.Length" DOUBLE PRECISION,
# "Sepal.Width" DOUBLE PRECISION,
# "Petal.Length" DOUBLE PRECISION,
# "Petal.Width" DOUBLE PRECISION,
# "Species" VARCHAR(255)
# )
# ([JDBC Driver]com.amazon.dsi.dataengine.impl.DSISimpleRowCountResult
# cannot be cast to com.amazon.dsi.dataengine.interfaces.IResultSet)
Nevertheless, the table is created and will be listed by the following command
sql_required <- "SELECT TABLE_NAME
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_TYPE = 'BASE TABLE' AND TABLE_CATALOG=\'demodata\'"
tables_in_reshift <- dbGetQuery(jdbcConnection, sql_required)
tail(tables_in_reshift, 1)
# table_name
# 141 iris_table1
I can even insert the data
sql_required <- sqlAppendTable(jdbcConnection, "iris_table1", head(iris))
tbl_insert <- dbGetQuery(jdbcConnection, sql_required)
However, this will also result in following error.
# Error in .verify.JDBC.result(r, "Unable to retrieve JDBC result set for ", :
# Unable to retrieve JDBC result set for INSERT INTO "iris_table1"
# ("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width", "Species")
# VALUES
# (5.1, 3.5, 1.4, 0.2, 'setosa'),
# (4.9, 3, 1.4, 0.2, 'setosa'),
# (4.7, 3.2, 1.3, 0.2, 'setosa'),
# (4.6, 3.1, 1.5, 0.2, 'setosa'),
# (5, 3.6, 1.4, 0.2, 'setosa'),
# (5.4, 3.9, 1.7, 0.4, 'setosa')
# ([JDBC Driver]com.amazon.dsi.dataengine.impl.DSISimpleRowCountResult
# cannot be cast to com.amazon.dsi.dataengine.interfaces.IResultSet)
#
Nevertheless, the data will be inserted, as the following command returns 6 records
sql_required <- "SELECT * from iris_table1"
inserted_dat <- dbGetQuery(jdbcConnection, sql_required)
# sepal.length sepal.width petal.length petal.width species
# 1 4.9 3.0 1.4 0.2 setosa
# 2 4.6 3.1 1.5 0.2 setosa
# 3 5.4 3.9 1.7 0.4 setosa
# 4 5.1 3.5 1.4 0.2 setosa
# 5 4.7 3.2 1.3 0.2 setosa
# 6 5.0 3.6 1.4 0.2 setosa
Although the program works, I am concerned with the errors. So, my questions are:
What is the error trying to explain and how do I get rid of it?
Is there a better way (faster, more efficient) to create a table and insert data into redshift using RJDBC
? Thanks!
UPDATE I was of the initial opinion that running command such as
sqlCreateTable(jdbcConnection, "iris_table5", iris)
should be enough to create table but it doesn't create table neither does
sqlAppendTable(jdbcConnection, "iris_table1", head(iris))
append data to the database table.