
I'm working with a locations database in CouchDB. I created a view where my key is an array with rounded values of latitude and longitude. Now I'm selecting with the following conditions:

Startkey: [ 52.34, 4.883 ]
Endkey:   [ 52.37, 4.903 ]

Here I expect that I'll only receive documents where the latitude is between 52.34 and 52.37. And the longitude between 4.883 and 4.903.

The result I receive:

[ 52.358, 4.919 ]
[ 52.358, 4.919 ]
[ 52.362, 4.861 ]
[ 52.362, 4.861 ]
[ 52.362, 4.861 ]

As you may have noticed, the longitude in the first result, 4.919, is greater than the longitude of the endkey.

Now I know/read somewhere that I would receive some values which are outside of range of the second item in the array. But how is it possible that the first item already doesn't fit the criteria?

I Googled around a bit and can't really find an explanation of how startkey/endkey work when the key is an array. Can someone explain how CouchDB walks through the documents and decides when to 'start' and when to 'end'?

Lumocra

1 Answer


Assume your view contains the keys listed below and you query with startkey == [a, 11] and endkey == [c, 11]:

[a, 10]
[a, 11]   <-- startkey
[a, 12]   <--
[b, 10]   <--
[b, 11]   <--
[b, 12]   <--
[c, 10]   <--
[c, 11]   <-- endkey
[c, 12]

(Everything marked by an arrow will be returned).

The data in the view are sorted using the key. With startkey and endkey you can control where to start and end in the view. You cannot specify constraints for the data. Everything that is sorted in between startkey and endkey will be returned. Please read http://wiki.apache.org/couchdb/View_collation for more information.
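The selection described above can be imitated in a few lines of Python. This is only a sketch, not CouchDB code: it assumes that Python's element-by-element list comparison is a fair stand-in for how the view orders these particular keys, and `select_range` is a made-up helper name.

```python
# Keys in the order the view would sort them (a, b, c written as strings).
view_keys = [
    ["a", 10], ["a", 11], ["a", 12],
    ["b", 10], ["b", 11], ["b", 12],
    ["c", 10], ["c", 11], ["c", 12],
]


def select_range(sorted_keys, startkey, endkey):
    """Return every key that sorts between startkey and endkey, inclusive.

    Hypothetical helper: the whole key is compared as one value, so no
    separate constraint is applied to the second array element.
    """
    return [k for k in sorted_keys if startkey <= k <= endkey]


selected = select_range(view_keys, ["a", 11], ["c", 11])
for key in selected:
    print(key)
```

Note that ["b", 12] is part of the output even though 12 is greater than the 11 in the endkey: it simply sorts before ["c", 11].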

If you want to do geospatial queries you should check out GeoCouch (https://github.com/couchbase/geocouch/).


Summed up: Keys in CouchDB views are stored in one-dimensional lists. Entries in these lists are sorted according to the rules in View_collation. Two-element array keys may look like coordinates, but they are not treated specially. For example, [a, 10] is sorted after [a] and after a, and before [b, 5] and before [c].
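To make that ordering concrete, here is a rough Python sketch of a sort key that mimics the type order from View_collation (null, false, true, numbers, strings, arrays). It is a simplification under stated assumptions: `collation_key` is an invented name, objects are left out, and strings are compared with Python's plain comparison, whereas CouchDB uses ICU collation for strings.

```python
def collation_key(value):
    """Map a JSON-like value to a tuple that sorts in simplified CouchDB order.

    Assumption: only null, booleans, numbers, strings and arrays are handled.
    """
    if value is None:
        return (0,)
    if value is False:
        return (1,)
    if value is True:
        return (2,)
    if isinstance(value, (int, float)):
        return (3, value)
    if isinstance(value, str):
        # Plain Python string comparison; real CouchDB uses ICU rules.
        return (4, value)
    if isinstance(value, list):
        # Arrays compare element by element; a shorter prefix sorts first.
        return (5, tuple(collation_key(v) for v in value))
    raise TypeError("unsupported value in this sketch: %r" % (value,))


keys = [["c"], ["b", 5], ["a", 10], "a", ["a"]]
ordered = sorted(keys, key=collation_key)
print(ordered)
```

With these inputs the string a comes first, followed by the arrays [a], [a, 10], [b, 5] and [c], which matches the order given in the summary.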

If you use startkey and endkey, you say "everything including and after startkey and before and including endkey". startkey and endkey entries do not have to be present in the list.
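Applied to the numbers from the question, the same idea looks roughly like this. The sketch uses Python's `bisect` module on a sorted list as a stand-in for the view index; the two keys marked as hypothetical are not from the question and are there only to show where the range begins and ends.

```python
from bisect import bisect_left, bisect_right

# Sorted view keys. Duplicates from the question are left out for brevity.
view_keys = [
    [52.30, 4.890],   # hypothetical key that sorts before the startkey
    [52.358, 4.919],  # returned in the question
    [52.362, 4.861],  # returned in the question
    [52.40, 4.890],   # hypothetical key that sorts after the endkey
]

startkey = [52.34, 4.883]
endkey = [52.37, 4.903]

# Neither startkey nor endkey exists in the list; the range starts at the
# first key >= startkey and stops after the last key <= endkey.
first = bisect_left(view_keys, startkey)
last = bisect_right(view_keys, endkey)
in_range = view_keys[first:last]
print(in_range)

# A client-side check (not a CouchDB feature) of the longitude bounds shows
# that none of the returned keys lies inside the expected box.
inside_box = [k for k in in_range if startkey[1] <= k[1] <= endkey[1]]
print(inside_box)
```

Both keys from the question fall inside the range because their latitude alone already places them between the startkey and the endkey; the longitude is only consulted when the latitudes are equal.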

Theophilus Omoregbee
thriqon
  • If we remove [a, 10] and [a, 11] from your example above, would Couch then start at [a, 12]? That is what seems to happen in my situation. – Lumocra Jun 17 '13 at 15:44
  • Yes, exactly that will happen. CouchDB takes the next key that is equal to or bigger than the startkey. – thriqon Jun 18 '13 at 07:28
  • Even though its second element is bigger than the one in the endkey. OK, confusing, but thanks for the clarification :-) – Lumocra Jun 18 '13 at 14:26
  • It's confusing if you perceive the data as coordinates, yes. But they aren't ;-) They are sorted in one dimension only, and then it's easy to see that [b,12] is in fact smaller than [c,11]. – thriqon Jun 18 '13 at 14:34
  • Aaah, so at first Couch only looks at the first element of the key. Only when the first element is indeed bigger than the startkey's and smaller than the endkey's does Couch start looking at the second element; if that one too is bigger than the startkey's and smaller than the endkey's, it is a 'match'. Right? – Lumocra Jun 18 '13 at 15:42
  • [This answer](https://stackoverflow.com/a/49647616/3405291 "CouchDB View - Filter and Group By on Key Array") might be related. – Megidd May 02 '18 at 09:04