0

When trying to copy data from source (MSSQLSERVER) TO target (greenplum database) using talend ETL server.

Description: When executing an UPDATE statement to GreenPlum, the mentioned error is thrown.

GIVEN

No of records fetching to target is ~ 0.3 million

Update is failing with error

 ERROR: CANNOT PARALLELIZE AN UPDATE STATEMENT THAT UPDATES THE DISTRIBUTION COLUMNS current transaction is aborted, commands ignored until end of transaction block

Any help on it would be much appreciated

Solution i tried :

When ON_ERROR_ROLLBACK is enabled, psql will issue a SAVEPOINT before every command you send to greenplum

gpadmin=# \set ON_ERROR_ROLLBACK interactive

But after that we tried running the same Job and it did not solved the problem.

enter image description here

Community
  • 1
  • 1
NEO
  • 389
  • 8
  • 31

1 Answers1

0

1) Update is not supported in Hawq. 2) Update is only supported to heap but not AO table in GPDB. GPDB/HAWQ are used as data warehouse/BI and data exploration purpose.

Sung Yu-wei
  • 161
  • 8
  • In GPDB When i am using Heap storage syntax like this CREATE TABLE foo (a int, b text) DISTRIBUTED BY (a); – NEO Apr 28 '17 at 14:58
  • Also Job Finishes with Hang then after some times it is throwing the error – NEO Apr 28 '17 at 16:25
  • 1
    you can't update distribution key. it is how MPP works. – Sung Yu-wei Apr 28 '17 at 19:53
  • 1
    Update is supported on AO tables, since those are Append Optimized :) – Balazs Gunics May 01 '17 at 13:58
  • @BalazsGunics so table storage should be AO instead heap.? then update will work on it – NEO May 01 '17 at 17:21
  • I have to update ~0.3 million rows on each job run. when i am trying to execute the job . No rows is updating due to ERROR: CANNOT PARALLELIZE AN UPDATE STATEMENT THAT UPDATES THE DISTRIBUTION COLUMNS – NEO May 01 '17 at 17:35
  • distribution key is used to decide which segment should store your table rows. It doesn't make any sense to "update" distribution key. Why would you change distribution key? are you trying to shuffling data? – Sung Yu-wei May 01 '17 at 17:39
  • salesorder is primary key as well as distribution key. For update, schema should have key. – NEO May 01 '17 at 17:57