
I use a Cloudant CouchDB database and I've noticed that the "_changes" query on the database returns an "update_sequence" that is not a number, e.g.

"437985-g1AAAADveJzLYWBgYM..........".

What is more, the response is not stable: I get 3 different update_sequences if I query the db 3 times.

Is there any change in the known semantics of the "update_sequence", "since", etc. or what?

Regards, Vangelis

2 Answers


Paraphrasing an answer that Robert has previously given:

The update sequence values are opaque. In CouchDB, they are currently integers but, in Cloudant, the value is an encoding of the sequence value for each shard of the database. CouchDB will likely adopt this in future as clustering support is added (via the BigCouch merge).

In both CouchDB and Cloudant, _changes will return a "seq" value with every row that is guaranteed to return newer updates if you pass it back as "since". In cases of failover, that might include changes you've already seen.

So, the correct way to read changes since a particular update sequence is this:

  1. call /dbname/_changes?since=
  2. read the entire response, applying the changes as you go
  3. Record the last_seq value as your new checkpoint seq value.

Do not try to interpret sequence values, and do not compare two of them with each other, not even for equality. You can, if you need to, record any "seq" value as you go in step 2 as your current checkpoint seq value. The key thing you cannot do is compare them.
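The three steps above can be sketched roughly as follows. This is only an illustration, not an official client: `fetch_changes_http`, `read_changes`, and `apply_change` are hypothetical names, and the sketch assumes an unauthenticated database URL and a plain (non-continuous) `_changes` feed.

```python
import json
import urllib.parse
import urllib.request


def fetch_changes_http(base_url, since=None):
    """Fetch one _changes response as parsed JSON (hypothetical helper)."""
    url = base_url.rstrip("/") + "/_changes"
    if since is not None:
        # Hand the checkpoint back unmodified; non-string values are JSON-encoded.
        value = since if isinstance(since, str) else json.dumps(since)
        url += "?" + urllib.parse.urlencode({"since": value})
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def read_changes(fetch, since, apply_change):
    """Apply every row of one _changes response and return the new checkpoint."""
    body = fetch(since)
    for row in body["results"]:
        # apply_change should be idempotent: rows may repeat after a failover.
        apply_change(row)
    # last_seq is opaque: store it and pass it back, never parse or compare it.
    return body["last_seq"]
```

A caller would keep the returned value and supply it as `since` on the next call, for example `checkpoint = read_changes(lambda s: fetch_changes_http(db_url, s), checkpoint, handle_row)`, where `db_url` and `handle_row` are placeholders for your own database URL and row handler.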

Will Holley
  • So you're saying that there is a guarantee that the most recent document will be the last document in the `results` array, and its `seq` value should be used for the subsequent `since`-parameter? What's confusing me is that [the docs](http://docs.couchdb.org/en/2.0.0/api/database/changes.html) carry this warning: "The results returned by _changes are partially ordered." – kristianlm Jan 20 '17 at 10:13
  • There is no guarantee that the most recent document will be the last document in the `results` array. CouchDB/Cloudant will guarantee that the set of documents returned includes at least all documents changed following the provided sequence value (and it may return docs changed prior to that as well), but does not guarantee the order of those returned. – Will Holley Jan 24 '17 at 17:10
  • I see. So which `seq` do you pick for the next `last_seq`? You say "record any seq value", but what if you want the latest one - not just any one? Isn't this a very common scenario? What am I missing out on here? – kristianlm Jan 26 '17 at 08:35
  • Use the `last_seq` value in the result or the last `seq` value you process in your code. – Will Holley Jan 26 '17 at 16:52
  • Ah, how could I possibly have missed that `last_seq` in the root of the JSON response. Thanks for the help guys! – kristianlm Jan 26 '17 at 18:00

It'll jump around. The representation is a packed base64 string representing the update_seq of the various replicas of each shard of your database. It can't be a simple integer because it's a snapshot of a distributed database.

As for CouchDB, treat the update_seq as opaque JSON and you'll be fine.
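One way to treat the value as opaque JSON is to persist the checkpoint exactly as received, without converting it to an integer or a string. The sketch below is only an assumption about how a client might do that; `save_checkpoint`, `load_checkpoint`, and the file layout are hypothetical.

```python
import json


def save_checkpoint(path, seq):
    """Persist a sequence value as-is; it may be an int or a string."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"since": seq}, f)


def load_checkpoint(path):
    """Return the stored sequence value, or None if no checkpoint exists yet."""
    try:
        with open(path, encoding="utf-8") as f:
            return json.load(f)["since"]
    except FileNotFoundError:
        return None
```

Because the value round-trips through JSON untouched, the same code works whether the server hands back an integer or an encoded string.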

Robert Newson