0

I have a large table of data (rendered using AG-Grid) and I want update it in the Postgres backend, but the best approach to the next part has me prevaricating, in terms of the amount of work, and the best course of actions.

Using the fast-json-patch library, I can get a JSON patch list easily enough in the client, and then something roughly thus:

import * as jsonpatch from 'fast-json-patch'

postData = jsonpatch.compare(originalData, updatedData)

const request = new Request(url, {
    method: 'PATCH',
    body: JSON.stringify(postData),
    headers: new Headers({
      Accept: 'application/json',
      'Content-Type': 'application/json-patch',
      Authorization: 'Bearer ' + user.token,
    }),
  })

and then in the ExpressJS 'backend' iterate through a bunch of jsonb_set queries to update Postgres.

Alternatively, I could fetch the record to be updated from Postgres, then use fast-json-patch to patch the JSONB data within the ExpressJS backend, and then update the Postgres record in one go?

This isn't something I've done before, but it's the sort of thing I'm sure must be pretty common. What's the best general approach?


Update

I've tried to implement the second approach - my issue now is with locking/unlocking Postgres when I have JSONB fields to update. My issues now are to do with actually implementing the record locking and updates from the express side, specifically trying to handle the asynchronous nature of working with the pg backend.

I just wondered if someone could spot the (non-deliberate) errors in this ham-fisted attempt:

const express = require('express')
const bodyParser = require('body-parser')
const SQL = require('sql-template-strings')
const { Client } = require('pg')
const dbConfig = require('../db')
const client = new Client(dbConfig)
const jsonpatch = require('fast-json-patch')

// excerpt for patch records in 'forms' postgres table

const patchFormsRoute = (req, res) => {
  const { id } = req.body
  const jsonFields = [ 'sections', 'descriptions' ]
  const possibleFields = [ 'name','status',...jsonFields ]
  const parts = []
  const params = [id] // stick id in as first param
  let lockInUse = false

  // find which JSONB fields are being PATCHed.
  // incoming JSONB field values are already JSON 
  // arrays of patches to apply for that particular field

  const patchList = Object.keys(req.body)
    .filter(e => jsonFields.indexOf(e) > -1)

  client.connect()

  if (patchList.length > 0) {
    const patchesToApply = pullProps(req.body, jsonFields)
    lockInUse = getLock('forms',id)
    // return record from pg as object with just JSONB field values
    const oldValues = getCurrentValues(patchList, id)
    // returns record with patches applied
    const patchedValues = patchValues( oldValues , patchesToApply )
  }

  possibleFields.forEach(myProp => {
    if (req.body[myProp] != undefined) {
      parts.push(`${myProp} = $${params.length + 1}`)
      if (jsonFields.indexOf(myProp) > -1) {
        params.push(JSON.stringify(patchedValues[myProp]))
      } else {
        params.push(req.body[myProp])
      }
    }
  })

  result = runUpdate(client, 'forms', parts, params)

  if(lockInUse) {
    releaseLock(client, 'forms', id)
  }

  client.end()

  return result
}

// helper functions to try and cope with async nature of pg

function async getLock(client, tableName, id ) {
  await client.query(SQL`SELECT pg_advisory_lock(${tableName}::regclass::integer, ${id});`)
  return true
}

function async releaseLock(client, tableName, id) {
  await client.query(SQL`SELECT pg_advisory_unlock(${tableName}::regclass::integer, ${id});`)
}

function async getCurrentValues(client, fieldList, id) {
  const fl = fieldList.join(', ')
  const currentValues = await client
    .query(SQL`SELECT ${fl} FROM forms WHERE id = ${id}`)
    .then((result) => {return result.rows[0]})
  return currentValues
}

function pullProps(sourceObject, propList) {
  return propList.reduce((result, propName) => {
    if(sourceObject.hasOwnProperty(propName)) result[propName] = sourceObject[propName]
    return result
  }, {})
}

function patchValues(oldValues, patches) {
  const result = {}
  Object.keys(oldValues).forEach(e => {
    result[e] = jsonpatch.apply( oldValues[e], patches[e] );
  })
  return result
}

function async runUpdate(client, tableName, parts, params) {
  const updateQuery = 'UPDATE ' + tableName + ' SET ' + parts.join(', ') + ' WHERE id = $1'
  const result = await client
    .query(updateQuery, params)
    .then(result => {
      res.json(result.rowCount)
    })
  return result
}
Dycey
  • 4,767
  • 5
  • 47
  • 86
  • I guess you might need to `START TRANSACTION` and `COMMIT` it after the `UPDATE` and you can lock the row by `SELECT FOR UPDATE` statement, it will give you the data and lock it at the same time till you `COMMIT`, so you do not need `getLock` and `releaseLock` – karianpour Mar 29 '20 at 14:43

1 Answers1

1

Use the 2nd method. PostgreSQL has no edit-in-place feature for JSONB. It is always going to include making a full copy. You might as well do that in the client, which seems to be have better tools for it.

An exception might be if the patch is small and the JSONB is huge and your network is slow.

jjanes
  • 37,812
  • 5
  • 27
  • 34
  • Those are good points. My only thought was that there might be some way to make the operation atomic at the Postgres end. However, I'll give 2 a go and hope ;-) Thanks. – Dycey Mar 29 '20 at 07:57
  • Not sure if this counts, but couldn't the `jsonb_set` command be used to edit in place? `jsonb_set(target jsonb, path text[], new_value jsonb [, create_missing boolean])` ? Anyway, trying to work through it in the ExpressJS app, using Postgres advisory locking ... looks like fun ;-) – Dycey Mar 29 '20 at 09:42
  • It isn't really edit-in-place, as jsonb_set returns an edited copy. It does not receive its first argument by reference and edit that reference in place. It can spare network traffic, but doesn't spare taking a full copy of the data just to change one part of it. Also, you would have to chain the together to change multiple things. I think SELECT...FOR UPDATE is sufficient, using advisory locks for this seems unnecessary. – jjanes Mar 29 '20 at 14:04
  • Except... the postgres JSONB field contains an JSON array of objects, n x m, that's being updated by multiple users concurrently - they each have their own column, and can only update that column. So the diffs have to be generated at the client, in order to ensure that only their column's changes are noted, and then sent through in the PATCH request. – Dycey Mar 29 '20 at 14:18