0

I'm trying to create a Clickhouse DB and copy to a table the values from many files (all them equal column order).

But I'm not understanding how to deal with it.

I'm using RClickhouse which I don't know if has any difference compared to clickhouse-r.

library(RClickhouse)
library(DBI)
library(tidyverse)

eggnog_dir <- "/home/acpguedes/projects/sig_trans/data/eggnog/table/"
setwd(eggnog_dir)

myconn <- DBI::dbConnect(drv = RClickhouse::clickhouse())

mytables <- list.files(".") # all tables are in the same folder 

mysqltb <- db_create_table(con = myconn, table = 'eggnog')


lapply(mytables, function(x) {
  read_tsv(file = x, 
           col_names = c( #in my case the tables has no header
             'sequence',
             'model',
             'start',
             'end',
             'evalue',
             'cov',
             'qstart',
             'qend',
             'iteration',
             'score',
             'talilen',
             'qlen',
             'estart',
             'eend',
             'program'
             )
           ) %>% dbWriteTable(conn=myconn, value = ., name = "domains", append=TRUE) 
  }
    ) -> dt

Don't matter the columns itself, I just would like an example of how to create a table and load to it the content of many files( tables like tsv ou CSV or any delim).

I also was trying with dbpĺyr using copy_to() instead dbWriteTable.

Also, afterload all tables, should I do a 'commit' statement to save permanently the database to posterior accession from R or other platforms?

Thanks in advance.

Aureliano Guedes
  • 767
  • 6
  • 22
  • What is the problem with your present approach? Can you write a table (and then read it back in) outside of the `lapply`? Is the problem with appending inside the `lapply`? Are there any error messages? – Simon.S.A. Mar 10 '20 at 21:52
  • @Simon.S.A. thanks. I was able to write a table this way I did, but I wasn't able to read it outside R or after the restart a session. – Aureliano Guedes Mar 10 '20 at 22:01

1 Answers1

0

Given that you can not read the table outside R or after a restart, it sounds like the issue is committing to the database.

Try something like the following after the lapply:

my_commit_statement = "COMMIT"
dbExecute(myconn, my_commit_statement)

With the appropriate commit statement for your application.

The other (unlikely) possibility is that table creation, or writing, is temporary. Both dbCreateTable and dbWriteTable from the DBI package have an argument temporary with default value FALSE so this should not be the cause. But no harm in making this argument explicit, as one temporary = TRUE could keep your table from persisting. Also worth checking whether there is any difference between db_create_table that you are using and dbCreateTable from the DBI package.

Simon.S.A.
  • 6,240
  • 7
  • 22
  • 41
  • Thanks again. I'm getting this error `Error in select(conn@ptr, statement) : DB::Exception: Syntax error: failed at position 1: COMMIT. Expected one of: Query, SYSTEM, DETACH, DROP, ....` quiet long error. I'll try again an I'll update with the erros and other details – Aureliano Guedes Mar 10 '20 at 22:57
  • My first thought is that the commit statement is more complex than just `COMMIT`. I do not have the clickhouse background to advise you on this part of it - my experience is with dbplyr. – Simon.S.A. Mar 11 '20 at 00:49