While playing around with the `_changes` API of CouchDB (2.1.1), I noticed that the `seq` value of the resulting records is different when I add `?include_docs=true`. Is this expected? If yes, can somebody help me understand the logic behind it?

Further info:

  • Create a database; I am calling it test. [Screenshot: Create a new Test Database]

  • Create three documents in this new test database. These can have just the id, nothing else. [Screenshot: Test Files created]

  • Now call the API twice, once with `?include_docs=true` and once without. [Screenshot: both calls side by side]

Call with `?include_docs=true`: [Screenshot]

Call without `?include_docs=true`: [Screenshot]

As you can see, the entries come back in a different order in the two responses, so the ids do not line up. The `seq` values also look the same at first, but their string part differs towards the end. So my question is: why are they not the same, given that I only want to add the doc to each entry? Is this expected? If yes, can somebody explain?

Frankra
  • @Flimzy, actually I am comparing the results of the `_changes` endpoint for the same database with and without the `?include_docs=true`. What I tried to show on the last two screenshots is that not only the seq hash is different but also the order of the document, and this is what I am not able to understand... If you check the 3rd screenshot (the one with the chrome windows side by side) you will see that is the same db, same files, but they have different orders and `seq` hashes, even though the only thing different is the addition of the `?include_docs=true` on the URL – Frankra Aug 28 '18 at 07:29
  • Oh right, makes sense. – Jonathan Hall Aug 28 '18 at 08:25
  • Possible duplicate of [Sequence number bug in CouchDB 2 or is there another way to compare sequence numbers?](https://stackoverflow.com/questions/44447505/sequence-number-bug-in-couchdb-2-or-is-there-another-way-to-compare-sequence-num) – Hypnic Jerk Aug 29 '18 at 19:07

1 Answer

You have to consider that the `_changes` feed is not fully ordered. The changes are computed from the set of nodes and shards of the database: CouchDB retrieves the changes from the different shards and then combines them into a single stream.

The logic behind the sequence numbers of the changes feed is quite complex and reflects the state of each of the shards that have responded to the changes feed request.

If you decode the string part of the seq with `binary_to_term(couch_util:decodeBase64Url(EncodedStringHere))`, you will see the state of the shards used to compose the changes entry for that doc.
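For readers without an Erlang shell at hand, the first half of that decoding can be sketched in Python. This is only a rough sketch under some assumptions: that the seq has the form `N-<base64url string>` as in the screenshots, and that the decoded bytes are in the Erlang external term format (version byte 131, optionally followed by tag 80 for a zlib-compressed payload). The name `decode_seq` is made up for this example, and turning the resulting term bytes into Python objects would still need a term-format parser, which is not shown.

```python
import base64
import zlib

ETF_VERSION = 131     # first byte of the Erlang external term format
ETF_COMPRESSED = 80   # tag announcing a zlib-compressed term


def decode_seq(seq):
    """Split a seq of the assumed form 'N-<base64url>' into (N, term bytes).

    The returned bytes are the encoded Erlang term without the leading
    version byte; they are NOT parsed into Python objects here.
    """
    prefix, _, opaque = seq.partition("-")
    # CouchDB strips the base64 padding, so restore it before decoding.
    padded = opaque + "=" * (-len(opaque) % 4)
    raw = base64.urlsafe_b64decode(padded)
    if not raw or raw[0] != ETF_VERSION:
        raise ValueError("opaque part is not an Erlang external term")
    if len(raw) > 1 and raw[1] == ETF_COMPRESSED:
        # Layout: 131, 80, 4-byte uncompressed size, zlib data.
        body = zlib.decompress(raw[6:])
    else:
        body = raw[1:]
    return int(prefix), body
```

Running `decode_seq` on the seq of the same document from both responses and comparing the returned bytes should make it visible that the opaque part encodes different shard state in each request, even when the numeric prefix matches.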

The point here is that you cannot rely on the order of the changes feed, as it may change between requests.
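In client code, that usually means comparing two `_changes` responses by their content rather than by position or per-row `seq`, and treating the `last_seq` field of a response as the only checkpoint to pass back through the `since` parameter. A minimal sketch follows; the helper names (`fetch_changes`, `change_set`, `same_changes`) and the base URL are assumptions made for this example, not part of CouchDB.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def fetch_changes(base_url, db, since=None, include_docs=False):
    """GET /{db}/_changes and return the parsed JSON body."""
    params = {}
    if since is not None:
        params["since"] = since          # opaque checkpoint from last_seq
    if include_docs:
        params["include_docs"] = "true"
    url = f"{base_url}/{db}/_changes"
    if params:
        url += "?" + urlencode(params)
    with urlopen(url) as resp:
        return json.load(resp)


def change_set(response):
    """Order-independent view of a response: {(doc id, (revs...))}."""
    return {
        (row["id"], tuple(c["rev"] for c in row["changes"]))
        for row in response["results"]
    }


def same_changes(a, b):
    """True if both responses report the same changes, ignoring order and seq."""
    return change_set(a) == change_set(b)
```

A caller would store `response["last_seq"]` after processing a batch and hand it back as `since` on the next call, for example `fetch_changes("http://localhost:5984", "test", since=checkpoint)`, without ever parsing or comparing the per-row `seq` strings.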

Juanjo Rodriguez
  • Even though I can sort of understand why they could be out of order, I still don't understand why the `seq` hash is completely different for the same resource (see the one with `id: "first"` on the third screenshot). As per my understanding, the `seq` means the sequence in which the change was added to the database; this is also what I understood from the CouchDB docs. What I would logically expect is that maybe the result array comes with the seqs out of order, but a document with `id:X` and `seq:Y` on `_changes` should also have `id:X` and `seq:Y` on `_changes?include_docs=true`. – Frankra Aug 28 '18 at 08:25
  • The seq is computed on each request and depends on the responses obtained from the different shards/nodes in your system. The strategy for retrieving the changes can be different when you request the docs, and that results in a different seq for the same doc. Just don't rely on this seq in your logic. The only thing you can rely on is the `last_seq`. – Juanjo Rodriguez Aug 28 '18 at 09:53