Here is a demo which demonstrates the problem with not having transaction "locking". It sort of simulates async / concurrency using setTimeout. I have never dealt with concurrency in languages like C, Go, or Rust, so I am not really sure how it works in implementation detail, but I am trying to grasp the concept of MVCC.
const db = {
table1: {
records: [
{ id: 1, name: 'foo', other: 'hello' },
{ id: 2, name: 'bar', other: 'world' },
]
}
}
function readTable1(id) {
return db.table1.records.find(x => x.id === id)
}
function writeTable1(id) {
const record = readTable1(id)
return new Promise((res, rej) => {
console.log('transaction 1 start')
setTimeout(() => {
record.other = 'qwerty'
setTimeout(() => {
record.name = 'asdf'
console.log('transaction 1 done')
res()
}, 1000)
}, 1000)
})
}
function wait(ms) {
return new Promise((res) => setTimeout(res, ms))
}
async function test1() {
writeTable1(1)
console.log(readTable1(1))
await wait(1100)
console.log(readTable1(1))
await wait(2200)
console.log(readTable1(1))
}
test1()
It logs
transaction 1 start
{ id: 1, name: 'foo', other: 'hello' } // read
{ id: 1, name: 'foo', other: 'qwerty' } // read
transaction 1 done
{ id: 1, name: 'asdf', other: 'qwerty' } // read
In the middle while the transaction is processing the record, it changes the real record which can be concurrently read. There are no locks on it, or however MVCC does it without locks (using multiple versions of records). I next try to implement how I think MVCC works, with hopes that you can correct my understanding. Here is that.
const db = {
table1: {
records: [
[{ id: 1, name: 'foo', other: 'hello' }],
[{ id: 2, name: 'bar', other: 'world' }],
]
}
}
function readTable1(id) {
const idx = db.table1.records.findIndex(x => x[0].id === id)
return [idx, db.table1.records[idx][0]]
}
// this is a long transaction.
function writeTable1(id) {
const [idx, record] = readTable1(id)
// create a new version of record for transaction to act on.
const newRecordVersion = {}
Object.keys(record).forEach(key => newRecordVersion[key] = record[key])
db.table1.records[idx].push(newRecordVersion)
return new Promise((res, rej) => {
console.log('transaction 2 start')
setTimeout(() => {
newRecordVersion.other = 'qwerty'
setTimeout(() => {
newRecordVersion.name = 'asdf'
console.log('transaction 2 done')
// now "commit" the changes
commit()
res();
}, 1000)
}, 1000)
})
function commit() {
db.table1.records[idx].shift()
}
}
function wait(ms) {
return new Promise((res) => setTimeout(res, ms))
}
async function test1() {
writeTable1(1)
console.log(readTable1(1)[1])
await wait(1100)
console.log(readTable1(1)[1])
await wait(2200)
console.log(readTable1(1)[1])
console.log(db.table1.records)
}
test1()
That outputs this, which seems correct.
transaction 2 start
{ id: 1, name: 'foo', other: 'hello' }
{ id: 1, name: 'foo', other: 'hello' }
transaction 2 done
{ id: 1, name: 'asdf', other: 'qwerty' }
[
[ { id: 1, name: 'asdf', other: 'qwerty' } ],
[ { id: 2, name: 'bar', other: 'world' } ]
]
Is this correct, generally how it works? Mainly, how many versions per record are created in a real implementation? Can there be more than 2 versions at a time? If so, in what situations does that occur generally speaking? And how do the timestamps work? I read about the timestamps on the wiki page, but it doesn't really register to me how to implement it. Also the incrementing transaction IDs. So basically how those 3 pieces fit together (versioning, timestamps, and transaction IDs).
I am looking for some sort of simulation of the timestamps and versioning in JavaScript, so I can make sure I understand the general concepts at a high level, yet at a sort of rough approximation of an implementation level. Just knowing what MVCC is and reading a few papers isn't enough in the weeds to know how to implement it.
In my example there will only ever be 2 versions of a record during a transaction. I am not sure if there are cases where you would need more than that. And I am not sure how to plug in the timestamps.