
Pentaho is slow on insert/update against PostgreSQL, because the JDBC driver will not accept prepared statements.

After about 2 million rows, Pentaho/PostgreSQL throughput drops to roughly one row per second. Please suggest an alternative approach, with an example.

Below is my CREATE TABLE script:

CREATE TABLE fact_model
(
  model_id integer NOT NULL,
  source integer NOT NULL,
  group_sk integer NOT NULL,
  user_sk integer NOT NULL,
  order_kind character varying(255),
  model_no character varying(255),
  CONSTRAINT pf_fact_receipt PRIMARY KEY (source, model_id)
)
WITH (
  OIDS=TRUE
);

Guys, the replies I am getting are crazy. My intention was to help others. There are various approaches we can try, such as the following:

  • Insert/Update step, which is slow
  • Upsert (`INSERT ... ON CONFLICT`), available since PostgreSQL 9.5, using a SQL step
  • Update step with errors routed to a Table Output step, when mostly updates are expected
  • Insert (Table Output) with errors routed to an Update step, when mostly inserts are expected
  • Merge rows to split inserts and updates, routing each directly to the appropriate step in a single pass

I believe there is always a lot more to figure out beyond my experience, which I hope this post will sort out. So the expectation is better input than what is stated above.
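For the PostgreSQL 9.5+ upsert route mentioned above, a minimal sketch of the statement a SQL step could run looks like this (the table and key columns are taken from the question; the `?` placeholders stand in for the stream fields):

```sql
-- Upsert against fact_model: insert the row, or update the
-- non-key columns when (source, model_id) already exists.
-- Requires PostgreSQL 9.5 or later.
INSERT INTO fact_model (model_id, source, group_sk, user_sk, order_kind, model_no)
VALUES (?, ?, ?, ?, ?, ?)
ON CONFLICT ON CONSTRAINT pf_fact_receipt
DO UPDATE SET
    group_sk   = EXCLUDED.group_sk,
    user_sk    = EXCLUDED.user_sk,
    order_kind = EXCLUDED.order_kind,
    model_no   = EXCLUDED.model_no;
```

Because a single statement handles both the insert and the update case, it avoids the per-row lookup-then-write round trip that makes the Insert/Update step slow.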

  • May we see the `CREATE TABLE` statements for your database, and the numbers of rows in each, and the indexes you have, and the queries that are slow? There is no way to sensibly answer this at present. – halfer Jun 02 '17 at 14:32
  • I've voted to close, but if this closes I am happy to vote to reopen if all the requested information is provided. I can see one item has been provided (though only one table), three more items of information to go. – halfer Jun 02 '17 at 22:40
  • "Post me an alternative solution with example." - you're at the wrong kind of board. This kind of request is only suitable for a "hire a free-lancer"-style board. – Erwin Bolwidt Jun 04 '17 at 04:08
  • Note: I don't think you need `WITH( OIDS=True)`, it is obsolete. And: please add the definitions for the other (associated) tables to the question, plus your query, cardinalities, and DB tuning. (The symptoms look like you are outgrowing buffer memory, while the planner still chooses hash joins.) – wildplasser Jun 05 '17 at 20:53

2 Answers


You also need the composite (primary) key in reversed order (assuming that its two elements are also foreign keys to dimension tables). This will cause a second composite index to be created.


CREATE TABLE fact_model
( model_id integer NOT NULL     -- REFERENCES ...
  , source integer NOT NULL     -- REFERENCES ...
  , group_sk integer NOT NULL   -- REFERENCES ...
  , user_sk integer NOT NULL    -- REFERENCES ...
  , order_kind character varying(255)
  , model_no character varying(255)
  , CONSTRAINT pf_fact_receipt PRIMARY KEY (source, model_id)
  , CONSTRAINT pf_fact_receipt_rev UNIQUE (model_id, source) -- <<--HERE
  );

VACUUM ANALYZE fact_model;

\d fact_model
wildplasser

Create a new index on (source, model_id); I achieved much better performance with this Kettle step (Insert/Update) after adding the proper index.
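A hedged note: the primary key `pf_fact_receipt` in the question already backs a unique index on (source, model_id), so an extra index mainly helps if that constraint is missing or the lookup uses different columns. If you do add one, a sketch would be (the index name here is made up for illustration):

```sql
-- Index matching the Insert/Update step's lookup keys;
-- only needed if no PK/unique index already covers them.
CREATE INDEX idx_fact_model_source_model
    ON fact_model (source, model_id);
```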

jacktrade