Could somebody help me with this? I have very big CSV files (5 columns, roughly 500 MB to 1 GB each) which I need to load into a Greenplum database. I use the file source to read these files with --mode=lines and the gpfdist sink to import the data into Greenplum, but the throughput of this operation is very poor. How can I tune it? I tried changing the batch-count, flush-count, flush-time, batch-timeout and similar options, but without luck. For comparison, gpload takes only ~20-30 seconds to insert an ~800 MB file.
file --directory=/data --filename-pattern=*.csv --mode=lines --prevent-duplicates=false --markers-json=false | gpfdist --db-user=**** --db-name=**** --column-delimiter=, --mode=insert --gpfdist-port=8000 --db-password=**** --db-host=**** --table=test --flush-count=200 --batch-count=1000000 --batch-period=2
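
For reference, this is the kind of variation I tried while experimenting with the batching options (the flush-count/flush-time/batch-timeout values here are just numbers I picked for testing, not recommended settings):

file --directory=/data --filename-pattern=*.csv --mode=lines --prevent-duplicates=false --markers-json=false | gpfdist --db-user=**** --db-name=**** --column-delimiter=, --mode=insert --gpfdist-port=8000 --db-password=**** --db-host=**** --table=test --flush-count=5000 --flush-time=2 --batch-count=1000000 --batch-timeout=4

My understanding is that flush-count controls how many lines are buffered before they are pushed through gpfdist into the external table, but I may be wrong about that, and changing it did not make a noticeable difference for me.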
Thanks.