I am learning RavenDB (Build 2851, Version 2.5.0 / 6dce79a) from Beginning Raven 2.x, and I am finding that Raven Studio is not filtering correctly.

I have a collection of cities in my database, storing their populations, locations, etc. I have added an index in code, using this:

public class Cities_ByPopulation : AbstractIndexCreationTask<City>
{
   public Cities_ByPopulation()
   {
      this.Map = cities => from city in cities 
                           select new { Population = city.Population };

      // Generated on the server as:
      // docs.Cities.Select(city => new {
      //     Population = city.Population
      // })
   }
}

I register it with IndexCreation.CreateIndexes(typeof(Cities_ByPopulation).Assembly, documentStore).

Problem 1 - Raven Studio is not filtering as expected

Now that the index has been added to RavenDB, I filter on the Population [long] field in Raven Studio, for values between 200,000 and 500,000.

[Screenshot: IDE not showing results correctly]

As you can see, it's pulling back values completely outside the range. I have also tried Population: [Lx200000 TO Lx500000], but then no results appear.

To verify this I created a dynamic index, but have the same problem:

[Screenshot: Raven Studio filtering with a dynamic index]

Problem 2 - LINQ is not filtering at all as expected

In addition to this, I'm finding that even with a raw LINQ query, no data is returned at all!

// RavenStore stores a singleton, 
// so I can share across console apps in this solution
using (var store = RavenStore.GetDocumentStore())
{
    IndexCreation.CreateIndexes(typeof(Cities_ByPopulation).Assembly, store);

    const long MinRange = 200000;
    const long MaxRange = 300000;

    Debug.Assert(MinRange < MaxRange, "Ranges need swapping round!");

    // Get cities using the index
    using (var session = store.OpenSession())
    {
        var cities =
            session.Query<City>("Cities/ByPopulation")
                .Customize(x => x.WaitForNonStaleResults())
                .Where(x => x.Population > MinRange && x.Population < MaxRange);

        Console.WriteLine("Number of normal cities within population range: {0}", cities.Count());
    }

    // Get cities from raw query
    using (var session = store.OpenSession())
    {
        var cities = session.Query<City>().Where(x => x.Population > MinRange && x.Population < MaxRange);

        Console.WriteLine("Number of normal cities within population range: {0}", cities.Count());
    }

    // Output :
    // Number of normal cities within population range: 0
    // Number of normal cities within population range: 0
}

The logging for this query is as follows

Request # 275: GET     -     1 ms - <system>   - 200 - /docs/Raven/Databases/World
Request # 276: HEAD    -     0 ms - World      - 200 - /indexes/Cities/ByPopulation
Request # 277: PUT     -     2 ms - World      - 201 - /indexes/Cities/ByPopulation
Request # 278: GET     -     0 ms - World      - 404 - /docs/Raven/Replication/Destinations
Request # 279: GET     -     6 ms - World      - 200 - /indexes/Cities/ByPopulation?&query=Population_Range%3A%7BLx200000%20TO%20Lx300000%7D&pageSize=0&operationHeadersHash=1690003523
        Query: Population_Range:{Lx200000 TO Lx300000}
        Time: 6 ms
        Index: Cities/ByPopulation
        Results: 0 returned out of 0 total.

Request # 280: GET     -     7 ms - World      - 200 - /indexes/dynamic/Cities?&query=Population_Range%3A%7BLx200000%20TO%20Lx300000%7D&pageSize=0&operationHeadersHash=1690003523
        Query: Population_Range:{Lx200000 TO Lx300000}
        Time: 6 ms
        Index: Cities/ByPopulation
        Results: 0 returned out of 0 total.

Some additional info that may help troubleshooting

  • The data was imported via the CSV importer.
  • No objects have been stored from a .NET application, only read.

This may imply that the schemas are not in sync, or that the DB isn't sure of the data types yet, as the metadata is {}.


Here is the resulting JSON from a document:

[city/1989]
{
  "Name": "Aachen",
  "CountryCode": "D",
  "Province": "Nordrhein Westfalen",
  "Population": 247113,
  "CountryId": "country/1009"
}

and C# class:

public class City
{
    public string Id { get; set; }
    public string Name { get; set; }
    public string CountryCode { get; set; }
    public long Population { get; set; }
    public string Province { get; set; }
    public string CountryId { get; set; }
}

Another update

I've manually patched the collection with

this['@metadata']['Raven-Clr-Type'] = "Domain.City, Domain"

but this hasn't helped the serializer either.


Dominic Zukiewicz
  • Is your `Population` number stored as text in RavenDB? It appears it is, since you are only getting results that start with 2, 3, or 4: strings that are lexicographically between 200000 and 500000. Maybe the CSV importer you used doesn't automatically identify numeric columns? – Timothy Shields Mar 31 '14 at 17:35
  • Good question! But no, no " " around it. I can replicate it quite easily so if anyone wants a closer look, let me know. – Dominic Zukiewicz Mar 31 '14 at 17:38
  • You might want to double check that. If it is doing a numeric comparison, why are you getting values out of range but only lexicographically between your min and max? If numeric comparison isn't working, you should be getting cities with populations starting with 1, 5, 6, 7, 8, 9 as well. Also, country code values don't display any quotes around them either, despite being strings. – Timothy Shields Mar 31 '14 at 17:42
  • @TimothyShields: Question updated with JSON sample and C# class. – Dominic Zukiewicz Mar 31 '14 at 18:33

1 Answer

You have to tell Raven that Population is a number, because the index treats values as text by default. So in your index constructor, write something like:

Sort(x => x.Population, SortOptions.Long);
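
For context, here is a minimal sketch of how that call might sit in the index from the question. It assumes the RavenDB 2.x client API and the City class shown in the question. The namespaces are the ones I believe apply to that client version, so adjust them if your build differs.

```csharp
using System.Linq;
using Raven.Abstractions.Indexing; // SortOptions (assumed namespace for the 2.x client)
using Raven.Client.Indexes;        // AbstractIndexCreationTask<T>

public class Cities_ByPopulation : AbstractIndexCreationTask<City>
{
    public Cities_ByPopulation()
    {
        // Same map as in the question: index only the Population field.
        Map = cities => from city in cities
                        select new { Population = city.Population };

        // Hint that Population should be treated as a 64-bit number
        // rather than as text when sorting and doing range queries.
        Sort(x => x.Population, SortOptions.Long);
    }
}
```

The index is registered the same way as before, with IndexCreation.CreateIndexes(typeof(Cities_ByPopulation).Assembly, store).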
esskar
  • So just to clarify; I've imported as a CSV. It _looks_ like RavenDB assumed it was a value. But the index needs to be given a hint of the data type? Does this only apply to CSVs, as the data types are implicit with POCOs? – Dominic Zukiewicz Mar 31 '14 at 19:11
  • No, I think you always have to specify the sort type; see also http://ravendb.net/docs/2.0/client-api/querying/static-indexes/customizing-results-order?version=2.0 for more clarification. – esskar Mar 31 '14 at 19:14
  • I see! By default it assumes it's a string for sorting. Dates are implicitly sorted, but integer types need to be specified. Thanks for the link! – Dominic Zukiewicz Mar 31 '14 at 19:16
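
To illustrate the difference between a text range and a numeric range discussed in the comments, here is a small standalone sketch. It does not use RavenDB at all; it only mimics the two comparison styles with plain C#. The RangeDemo class and the sample populations are hypothetical and are not taken from the imported data.

```csharp
using System;
using System.Linq;

public static class RangeDemo
{
    // Hypothetical sample populations, not taken from the imported data set.
    private static readonly long[] Populations =
        { 247113, 2471130, 31000, 450000, 600000, 1500000 };

    // Numeric range check: how a field sorted as Long would be expected to behave.
    public static long[] NumericRange(long min, long max)
    {
        return Populations.Where(p => p >= min && p <= max).ToArray();
    }

    // Ordinal string range check: mimics a range over values compared as text.
    public static long[] TextRange(long min, long max)
    {
        string lo = min.ToString();
        string hi = max.ToString();
        return Populations
            .Where(p => string.CompareOrdinal(p.ToString(), lo) >= 0
                     && string.CompareOrdinal(p.ToString(), hi) <= 0)
            .ToArray();
    }

    public static void Main()
    {
        // prints "247113, 450000"
        Console.WriteLine(string.Join(", ", NumericRange(200000, 500000)));

        // prints "247113, 2471130, 31000, 450000"
        Console.WriteLine(string.Join(", ", TextRange(200000, 500000)));
    }
}
```

The text comparison lets through any value whose digits start with 2, 3, or 4, whatever its length, which matches the pattern Timothy Shields pointed out in the Studio results.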