I've a table along the lines of
CREATE TABLE documents (
id SERIAL PRIMARY KEY,
author_id INTEGER NOT NULL FOREIGN KEY REFERENCES(author.id)
content XML NOT NULL,
created TIMESTAMP WITH TIMEZONE NOT NULL DEFAULT CURRENT_TIMESTAMP
)
I want to transition the database from using XML to JSONB to store that data. I'd like to do this in a 3 stage process:
- Create some views/functions which convert the existing data to JSONB on-the-fly; surfaced via a Postgrest API
- Migrate the existing system from using XML to using JSONB, retire the old API code
- Convert the column from XML to JSONB permanently
This allows the software to work off the old API (using XML data) whilst I'm refactoring parts of it to use use the new API (JSONB). Once the refactoring's complete, I can switch off the old XML - happy days.
I've found this answer for converting the data from XML to JSONB: How to convert a XML into JSONB within Postgres
With some modifications, I can get it to be a (very) close fit to my existing code. I can, in short, convert this:
<xml>
<hello>world</hello>
<this is="your">captain</this>
</xml>
to this
{
"xml": {
"@tag": "xml",
"this": {
"@tag": "this",
"@value": "captain",
"@attributes": {
"is": "your"
}
},
"hello": {
"@tag": "hello",
"@value": "world",
"@attributes": {}
},
"@value": null,
"@attributes": {}
}
}
EDIT: above has been slightly modified thanks to feedback from @404
But I'll need some way of allowing the refactored code to send JSONB to new (Postgrest) API, and then that data to be converted to XML so that it can still be used by the non-refactored code.
I'd looked at the XML functions, but I'm struggling to make it so that I can dynamically convert the JSONB back to the XML form (doesn't matter overly if it's a little bit messy as long as I can get back to the JSON form later). It looks like I can do this OK
SELECT xmlelement(name hello, null, 'world')
But when I try to do this
SELECT xmlelement(name key, null, value)
FROM jsonb_each('{
"this": {
"@value": "captain",
"@attributes": {
"is": "your"
}
},
"hello": {
"@value": "world",
"@attributes": {}
},
"@value": null,
"@attributes": {}
}'
)
As you'd expect, you get XML output like this...
<key>{"@value": "captain", "@attributes": {"is": "your"}}</key>
... where the XML element is called "key" and not the value of the key column (in this case, "this").
I have similar problems when using the xmlattributes
function - I can't give the attribute a name that comes from the database query.
Is there a way of dynamically setting the XML element tag names and attribute names?
Update: Read the comments below; the above is probably the wrong approach in any case. I've left the question open in case there is a way, since it might help someone in the future ... but the takeaway here for me was that I needed to re-think the data structure first.