14

I'm doing a research about logical decoding and I've been able to create a slot and replicate all transactions in a database to another using streaming replication protocol, and it works really well.

But I need to replicate just a single table and not all the tables on the database.

So, my question is: Does the logical decoding allows to filter the stream for a single table?

My current hint is to create a custom logical decoding output plugin, am I wrong?

Update

I've built an output plugin based on contrib/test decoding from postgresql sources and it was a good workaround. However it wasn't useful for real use cases, so I decided to take some other projects as references to fork and update.

The best for me was wal2json, so I decided to fork it and add the table filter as an option and not to hardcode the table names.

Here is the fork and this is the changeset.

How to use

First create the slot with the wal2json plugin:

pg_recvlogical -d postgres --slot test_slot --create-slot -P wal2json

Then start receiving the stream

pg_recvlogical -d postgres --slot test_slot --start -o limit-to=table_foo,table_bar -f -

Now we are ready to receive the updates on table_foo and table_bar only.


This was a really good challenge, I'm not a c developer and I know that the code needs some optimizations, but for now it works better than expected.

Community
  • 1
  • 1
Darwin
  • 1,809
  • 15
  • 17
  • 1
    You're correct, and it's possible, but it isn't simple to get right. UDR, from the BDR project, already supports replicating just one table using logical decoding. You could also use Londiste, which is more mature, but higher overhead. – Craig Ringer Sep 05 '15 at 01:47
  • @CraigRinger thanks for your suggestion, I've read the UDR docs and it seems that is possible to do this but I ran into issues trying to setup and use the `udr-plugin` so I'll leave it for now and will make some tests in the near future. I really appreciate your suggestion. – Darwin Sep 09 '15 at 20:57
  • 1
    FWIW we at 2ndQuadrant are working on streamlining UDR and getting at least the output plugin into 9.6. Londiste is an easier option at present. Feel free to ask about UDR on the pgsql-general mailing list or here on SO. – Craig Ringer Sep 09 '15 at 23:16
  • 1
    There's also someone writing a json output plugin iirc – Craig Ringer Sep 09 '15 at 23:17
  • @CraigRinger thanks again. I've been looking at [test_decoding](http://git.postgresql.org/gitweb/?p=postgresql.git;a=tree;f=contrib/test_decoding;h=3da002b2450e63f6a12f65f23ad0f7a66b3201fa;hb=HEAD), [decoder_json](https://github.com/ildus/decoder_json) and [decoder_raw](https://github.com/michaelpq/pg_plugins/tree/master/decoder_raw). All of them has been very useful to make me understand the whole process. I'm not a C developer, but it was easy update the code and get a working plugin for my current needs. I'll try Londiste and will continue testing UDR on the upcoming weeks. Cheers – Darwin Sep 09 '15 at 23:48
  • @CraigRinger I updated the main post with some notes about my solution, feedback is welcome!. Thanks for your help. – Darwin Sep 11 '15 at 01:41
  • Very impressed, especially as you don't know/work with C. Nicely done. – Craig Ringer Sep 11 '15 at 02:48
  • @Darwin nice solution! Did you think about opening a PR? – Denis Mikhaylov Jan 13 '18 at 16:55

2 Answers2

5

The current version of wal2json has these options:

* `filter-tables` - tables to exclude
* `add-tables`- tables to include

Usage:

pg_recvlogical -slot test_slot -o add-tables=myschema.mytable,myschema.mytable2

Reference: https://github.com/eulerto/wal2json#parameters

Ondra Žižka
  • 43,948
  • 41
  • 217
  • 277
2

According to the documentation you can implement your own synchronous replication solutions by implementing streaming replication interface methods:

  • CREATE_REPLICATION_SLOT slot_name LOGICAL options
  • DROP_REPLICATION_SLOT slot_name
  • START_REPLICATION SLOT slot_name LOGICAL options

In addition to the interface above you also need to implement Logical Decoding Output plugin. In this plugin interface you need to adjust Change Callback operation, which listens to all DML operations:

The required change_cb callback is called for every individual row modification inside a transaction, may it be an INSERT, UPDATE, or DELETE. Even if the original command modified several rows at once the callback will be called individually for each row.

This is the function where you want to check particular table for replication. Also be aware of the fact that Change Callback will NOT handle UNLOGGED and TEMP tables, but I guess it is not severe limitation.

Dmitry S
  • 4,990
  • 2
  • 24
  • 32
  • I've read the docs a lot of times and didn't get it clear before your answer, I'll try to do this and will update this question with the result. Thanks Dmitry – Darwin Sep 05 '15 at 00:49
  • 1
    @Darwin sure. we ran into the same kind of issue a while ago, when we were choosing replication strategies for our postgres clusters. You can definitely go with Slony replication - it's very granular in terms of replication configuration (per table) but way more complex for setup than streaming one, it does not allow to split schemas to different replica-nodes - which was our case - but for your situation it might be a solution,. – Dmitry S Sep 05 '15 at 02:07
  • I've built a custom `output plugin` based on the `test_decoding` from pg sources. I'll try to make a clean version of this and update the question with that in the near future. Thanks for your advice. – Darwin Sep 09 '15 at 21:00
  • I updated the main post with some notes about my solution, feedback is welcome!. Thanks for your help. – Darwin Sep 11 '15 at 01:42