I'm working with the ClickHouse Docker image on Docker Desktop for Windows 10:

https://hub.docker.com/r/yandex/clickhouse-server/

I have the container up and running and am loading data into it. ClickHouse complains that it expected a comma before a certain line, but after opening the file in Notepad++ I can see that the comma is exactly where it is supposed to be:

Code: 27. DB::Exception: Cannot parse input: expected , before . . . 

Or it reports a problem with the end of a line:

Code: 117. DB::Exception: Expected end of line: (at row 127249)

It also complains:

Could not print diagnostic info because two last rows aren't in buffer (rare case)

I've noticed that relatively small files (fewer than 30k rows) load without problems, but larger files fail. I have loaded these same files before, so I know they are good and loadable. This looks like an issue with ClickHouse in the image, because it can't even print a diagnostic. Any ideas what the issue might be?

EDIT: Example

With the data below I get one of the errors mentioned above. I use R to generate a 1,000,000-row file to load:

#generate my data-----------------------------------------------------------
library(data.table)
set.seed(22)
u = runif(1000000, 0, 60) # "noise" to add or subtract from some timepoint
x = runif(1000000, 0, 1)

my_table = 
data.table(
  pudt=as.POSIXct(u, origin = "2017-02-03 08:00:00"),
  count = round(x,2)
)

my_table[
  ,pudt:=as.character(pudt)]

#write out--------------------------------------
fwrite(my_table, "my_data.csv", row.names = F, col.names = F)


#create my table in clickhouse client 
CREATE TABLE test(
  pudt DateTime,
  count Float32
)engine = Log;


#load the data in powershell-----
$files = Get-ChildItem "where my files are . . . "

foreach ($f in $files){
  # print the file name and a timestamp before each load
  $f.FullName | Write-Host
  Get-Date | Write-Host
  "Start loading " + $f.FullName | Write-Host
  # stream the file into a throwaway clickhouse-client container
  cat $f.FullName | docker run -i --rm --link some-clickhouse-server:clickhouse-client yandex/clickhouse-client -m --host some-clickhouse-server --query="INSERT INTO test FORMAT CSV"
  "End loading " + $f.FullName | Write-Host
  [GC]::Collect()
}

The error I get here is:

Code: 117. DB::Exception: Expected end of line: (at row 144020)
Could not print diagnostic info because two last rows aren't in buffer (rare case)

I checked the file and don't see any issue that I am aware of.
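For a more systematic check than opening the file in an editor, a small script can scan every row. The following is only a minimal sketch in Python, not part of the workflow above. The function name `scan_csv` and its `expected_fields` parameter are hypothetical, and it assumes a UTF-8 file with two columns, as in `my_data.csv`. It reports the rows whose field count is off and which line endings the file uses. It does not show what ClickHouse actually receives through the pipe.

```python
import csv
from collections import Counter


def scan_csv(path, expected_fields=2):
    """Scan a CSV file and return (row_count, bad_rows, line_endings).

    bad_rows is a list of (row_number, field_count) for every row whose
    field count differs from expected_fields (hypothetical parameter).
    line_endings counts how many physical lines end in CRLF, LF, or nothing.
    """
    # Pass 1: read raw bytes so the line terminators are not normalised.
    endings = Counter()
    with open(path, "rb") as fh:
        for raw in fh:
            if raw.endswith(b"\r\n"):
                endings["CRLF"] += 1
            elif raw.endswith(b"\n"):
                endings["LF"] += 1
            else:
                endings["none"] += 1

    # Pass 2: parse with the csv module and check the number of fields.
    bad_rows = []
    row_count = 0
    with open(path, newline="", encoding="utf-8") as fh:
        for row_number, row in enumerate(csv.reader(fh), start=1):
            row_count += 1
            if len(row) != expected_fields:
                bad_rows.append((row_number, len(row)))

    return row_count, bad_rows, dict(endings)


# Example call (assumes my_data.csv is in the working directory):
# rows, bad, endings = scan_csv("my_data.csv")
# print(rows, bad[:10], endings)
```

If this reports no bad rows and a single kind of line ending, the file on disk is probably fine, and the data may be getting altered somewhere between the file and the server.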

LoF10
    Recently, I faced a similar issue. Upgrading to the latest stable version (`19.13.3.26`) fixed it. https://github.com/yandex/ClickHouse/issues/6426 – Pramit Sep 02 '19 at 22:19

1 Answer

This seems to be a known ClickHouse bug. I will test and see:

https://groups.google.com/forum/#!topic/clickhouse/Ofbtz5B7_Fw

UPDATE:

I solved the issue by building a custom clickhouse-client image on 13.9. It works perfectly now.

LoF10