2

I've a table along the lines of

CREATE TABLE documents (
  id SERIAL PRIMARY KEY,
  author_id INTEGER NOT NULL FOREIGN KEY REFERENCES(author.id)
  content XML NOT NULL,
  created TIMESTAMP WITH TIMEZONE NOT NULL DEFAULT CURRENT_TIMESTAMP
)

I want to transition the database from using XML to JSONB to store that data. I'd like to do this in a 3 stage process:

  1. Create some views/functions which convert the existing data to JSONB on-the-fly; surfaced via a Postgrest API
  2. Migrate the existing system from using XML to using JSONB, retire the old API code
  3. Convert the column from XML to JSONB permanently

This allows the software to work off the old API (using XML data) whilst I'm refactoring parts of it to use use the new API (JSONB). Once the refactoring's complete, I can switch off the old XML - happy days.

I've found this answer for converting the data from XML to JSONB: How to convert a XML into JSONB within Postgres

With some modifications, I can get it to be a (very) close fit to my existing code. I can, in short, convert this:

<xml>
  <hello>world</hello>
  <this is="your">captain</this>
</xml>

to this

{
    "xml": {
        "@tag": "xml",
        "this": {
            "@tag": "this",
            "@value": "captain",
            "@attributes": {
                "is": "your"
            }
        },
        "hello": {
            "@tag": "hello",
            "@value": "world",
            "@attributes": {}
        },
        "@value": null,
        "@attributes": {}
    }
}

EDIT: above has been slightly modified thanks to feedback from @404

But I'll need some way of allowing the refactored code to send JSONB to new (Postgrest) API, and then that data to be converted to XML so that it can still be used by the non-refactored code.

I'd looked at the XML functions, but I'm struggling to make it so that I can dynamically convert the JSONB back to the XML form (doesn't matter overly if it's a little bit messy as long as I can get back to the JSON form later). It looks like I can do this OK

SELECT xmlelement(name hello, null, 'world')

But when I try to do this

SELECT xmlelement(name key, null, value)
FROM jsonb_each('{
        "this": {
            "@value": "captain",
            "@attributes": {
                "is": "your"
            }
        },
        "hello": {
            "@value": "world",
            "@attributes": {}
        },
        "@value": null,
        "@attributes": {}
    }'
)

As you'd expect, you get XML output like this...

<key>{"@value": "captain", "@attributes": {"is": "your"}}</key>

... where the XML element is called "key" and not the value of the key column (in this case, "this").

I have similar problems when using the xmlattributes function - I can't give the attribute a name that comes from the database query.

Is there a way of dynamically setting the XML element tag names and attribute names?

Update: Read the comments below; the above is probably the wrong approach in any case. I've left the question open in case there is a way, since it might help someone in the future ... but the takeaway here for me was that I needed to re-think the data structure first.

Algy Taylor
  • 814
  • 13
  • 29
  • 1
    A couple of thoughts. I don't care for the JSON structure created by the xml conversion. I would structure it like so: `{ "name": "xml", "value": null, "attributes": null, "children": [ { "name": "this", "value": "captain", "attributes": [ { "name": "is", "value": "your" } ] }, { "name: "hello", ... } ] }`. The changed key names aren't important, what's more important is (1) not using values as key names, because then the JSON schema can be consistent across entries, allowing you to select the item's name without knowing what it is; and (b) using a specific property (cont...) – 404 Jan 16 '20 at 12:57
  • 1
    ... to hold child elements, as that allows you to more easily create a function which can recursively call itself for each element in `current_item->children`, as it walks down the however many nested elements there may be, returning the same structure for each. So basically I think you need a better JSON structure, and that in turn will allow you to much more easily perform the opposite conversion. – 404 Jan 16 '20 at 12:59
  • Since I can't edit the comment, the child object (`"name": "this"`) should also have had a `children` property (or maybe it can omitted if there are no children), so that each representation of an xml element is the same structure, no matter what level it's at. – 404 Jan 16 '20 at 13:56
  • @404 - thanks for the comments. Point taken about the JSON structure, but there's also reuse after conversion to consider. I'd potentially need to traverse an entire array to see if there's a "hello" element. I've added a '@tag' attribute to the JSON above since that seems a reasonable (and probably necessary) compromise, but still thinking about how best to adapt the rest! – Algy Taylor Jan 16 '20 at 14:13
  • 1
    "*I'd potentially need to traverse an entire array to see if there's a "hello" element*" - Ok, I suppose I was coming from a standpoint where you'd never be looking for a specific element, more about a translation-friendly structure. IF you're creating your "production" json structure, then I think it's even more important to change the structure, because right now it's super tied in to xml (e.g. "attributes"). I would say: forget translation and xml, just figure out what you want it to look like first, THEN see how to translate between the two (maybe posting a question here). – 404 Jan 16 '20 at 14:23
  • 1
    Just to add, I had to do something similar some time ago: we were storing this complex xml structure in postgres, then wanted to change it to json. But we needed both versions for some time. So we defined the json structure we wanted, then used a view to expose both the xml version and the json version (converted on the fly in the view, and converted via a trigger from json to xml on writes). Pretty similar to what you're accomplishing here I guess. – 404 Jan 16 '20 at 14:25
  • @404 - thanks for your help & advice here, it's much appreciated. I thought you were right...then I read the last comment and it's made me pretty certain you are. I owe you a drink, you've saved me _a lot_ of time (even if it's not the answer I came for, ha). I'll leave the question open in case anyone can answer it, though ... – Algy Taylor Jan 16 '20 at 14:44
  • Nice one :thumbsup: – 404 Jan 16 '20 at 15:02

0 Answers0