0

I have a plv8 function (but I assume the same thing applies to any language). It could be called a lot in one statement (select say). It computes expensive stuff so I dont want to recompute every time. But the stuff it computes depends on the contents of the database. I can naively cache things in the function but that will never get cleared. So after the first call I am always operating on old data. If would be nice to flush at the start of each statement execution.

Note that triggering off changes to the tables the cache depends on doesnt work. The cache exists in connectionA, the DB can be changed by connectionB (or C,...)

pm100
  • 48,078
  • 23
  • 82
  • 145
  • Cache things in a real (i.e. connection independent) table (or materialized view) and just read it in the function. Update the cache on database update or periodically or whenver it fits you. – freakish May 09 '18 at 17:55
  • just feels like the db round trips will be expensive compared to have things directly in memory – pm100 May 09 '18 at 18:17
  • @freakish measuring shows that a 10k table select count(*) using this function as a where clause takes 1 second if I include 1 prepared select on primary key (ie a DB cache), vs instant if i remove the select execution. – pm100 May 09 '18 at 18:34
  • You need a proper indexes. Then selects will be blazing fast. 10k is like nothing. It would be better, of course, to implement everything outside the db. You would have more options, e.g. redis. – freakish May 09 '18 at 19:11
  • You should declare the function as `STABLE` so that the database knows that it will always return the same result when called with the same arguments in a single query. – Laurenz Albe May 09 '18 at 20:16
  • @freakish - select was on primary key. 10k taking 1 second says that each one takes 100ns, which is pretty dang fast, but still a non trivial overhead – pm100 May 10 '18 at 00:41
  • @freakish - using trusted plv8, cannot reach outside db engine (unless you kow better), and it would probably be slower – pm100 May 10 '18 at 00:43
  • @pm100 oh, alright, I misunderstood what you were saying. Anyway is performance the goal or scaling? If scaling then you can switch to some distributed environment, possibly nosql. If performance then I doubt you will get any better under your conditions. And sacrificing some performance for better scaling is a common thing to do. Assuming the process is parallelizable. – freakish May 10 '18 at 07:39

0 Answers0