9

I'm testing RavenDB for my future projects. Database performance is a must requirement for me, that's why I want to be able to tune RavenDB to be at least in SQL Server's performance range, but my tests shows that raven db is approximately 10x-20x slower in select queries than SQL Server, even when RavenDB is indexed and SQL Server doesn't have any indexes.

I populated database with 150k documents. Each document has a collection of child elements. Db size is approx. 1GB and so is index size too. Raven/Esent/CacheSizeMax is set to 2048 and Raven/Esent/MaxVerPages is set to 128. Here's how the documents looks like:

{
  "Date": "2028-09-29T01:27:13.7981628",
  "Items": [
    {
      {
      "ProductId": "products/673",
      "Quantity": 26,
      "Price": {
        "Amount": 2443.0,
        "Currency": "USD"
      }
    },
    {
      "ProductId": "products/649",
      "Quantity": 10,
      "Price": {
        "Amount": 1642.0,
        "Currency": "USD"
      }
    }
  ],
  "CustomerId": "customers/10"
}


public class Order
{
    public DateTime Date { get; set; }
    public IList<OrderItem> Items { get; set; }
    public string CustomerId { get; set; }
}

public class OrderItem
{
    public string ProductId { get; set; }
    public int Quantity { get; set; }
    public Price Price { get; set; }
}

public class Price
{
    public decimal Amount { get; set; }
    public string Currency { get; set; }
}

Here's the defined index:

from doc in docs.Orders
from docItemsItem in ((IEnumerable<dynamic>)doc.Items).DefaultIfEmpty()
select new { Items_Price_Amount = docItemsItem.Price.Amount, Items_Quantity = docItemsItem.Quantity, Date = doc.Date }

I defined the index using Management studio, not from code BTW (don't know if it has any negative/positive effect on perfromance).

This query takes from 500ms to 1500ms to complete (Note that this is the time that is needed to execute the query, directly shown from ravendb's console. So it doesn't contain http request time and deserialization overhead. Just query execution time).

session.Query<Order>("OrdersIndex").Where(o =>
    o.Items.Any(oi => oi.Price.Amount > 0 && oi.Quantity < 100)).Take(128).ToList();

I'm running the query on quad core i5 cpu running at 4.2 GHz and the db is located on a SSD.

Now when I populated same amount of data on sql server express, with same schema and same amount of associated objects. without index, sql server executes the same query which includes joins in 35ms. With index it takes 0ms :|.

All tests were performed when db servers were warmed up.

Though, I'm still very satisfied with RavenDB's performance, I'm curious if I am missing something or RavenDB is slower than a relational database? Sorry for my poor english.

Thanks

UPDATE

Ayande, I tried what you suggested, but when I try to define the index you sent me, I get the following error:

public Index_OrdersIndex()
    {
        this.ViewText = @"from doc in docs.Orders
select new { Items_Price_Amount = doc.Items(s=>s.Price.Amount), Items_Quantity = doc.Items(s=>s.Quantity), Date = doc.Date }
";
        this.ForEntityNames.Add("Orders");
        this.AddMapDefinition(docs => from doc in docs
            where doc["@metadata"]["Raven-Entity-Name"] == "Orders"
            select new { Items_Price_Amount = doc.Items(s => s.Price.Amount), Items_Quantity = doc.Items.(s => s.Quantity), Date = doc.Date, __document_id = doc.__document_id });
        this.AddField("Items_Price_Amount");
        this.AddField("Items_Quantity");
        this.AddField("Date");
        this.AddField("__document_id");
        this.AddQueryParameterForMap("Date");
        this.AddQueryParameterForMap("__document_id");
        this.AddQueryParameterForReduce("Date");
        this.AddQueryParameterForReduce("__document_id");
    }
}

error CS1977: Cannot use a lambda expression as an argument to a dynamically dispatched operation without first casting it to a delegate or expression tree type

bmargulies
  • 97,814
  • 39
  • 186
  • 310
Davita
  • 8,928
  • 14
  • 67
  • 119
  • Surprised to see poor performance from Raven DB. This would indicate there is a serious problem with underlying data structure or index definition. – Arnold Zokas Jan 15 '12 at 00:46
  • 1
    Davita, this doesn't seems right at all. Queries like that should take a maximum of 50 ms or so. – Ayende Rahien Jan 15 '12 at 02:27
  • 1
    Can you create a reproducable test case and send it to the mailing list? We value perf very highly, and treat such scenarios as bugs. For what it worth, your scenario is well within our operating parameters, and the numbers you quote are far outside what we experienced, so something is wrong here. A test case would be very helpful to figure out what is going on – Ayende Rahien Jan 15 '12 at 02:28
  • Thanks guys, that was the answers I was expecting for :). @Ayende, could you be more specific please what you mean by test case? I can give you full db dump + vs project. Is there anything else I can do? Thanks again, I appreciate your help :) – Davita Jan 15 '12 at 09:36
  • @AyendeRahien I don't know if you received my mail to your mailing list, so I'll post it here too. you can download the database and project from here: http://dl.dropbox.com/u/3055964/Data.7z http://dl.dropbox.com/u/3055964/Raven.Sample.SimpleClient.7z – Davita Jan 15 '12 at 14:40
  • @AyendeRahien Any news about the issue..? :( – Davita Jan 16 '12 at 10:22
  • I didn't get the email, I got the files now, downloading and will look at this shortly – Ayende Rahien Jan 16 '12 at 13:59
  • 2
    Got it, and reproduced locally, please give us a few days to figure out what is going on – Ayende Rahien Jan 16 '12 at 16:21
  • @AyendeRahien I'm very happy to hear that. Thank you Ayande, I appreciate your help. Please, when you'll find out the cause, let us know from here possible fix/solution/estimates. Thanks again – Davita Jan 16 '12 at 17:18

1 Answers1

7

Davita, The following index generate ~8 million index entries:

from doc in docs.Orders
from docItemsItem in ((IEnumerable<dynamic>)doc.Items).DefaultIfEmpty()
select new { Items_Price_Amount = docItemsItem.Price.Amount, Items_Quantity = docItemsItem.Quantity, Date = doc.Date }

This one generates far less:

from doc in docs.Orders
select new { Items_Price_Amount = doc.Items(s=>s.Price.Amount), Items_Quantity = doc.Items.(s=>s.Quantity), Date = doc.Date }

And can be queried with the same results, but on our tests showed up to be about twice as fast.

The major problem is that you are making several range queries, which are expensive with a large number of potential values, and then you have a large number of actual matches for the query.

Doing an exact match is significantly faster, by the way.

We are still working on ways to try to speed things up.

Ayende Rahien
  • 22,925
  • 1
  • 36
  • 41
  • 1
    Thanks Ayende, I will check the solution in a few hours and let you know the results. However, it would be great if we could speed up indexed lookups, 8 million entries is that much for slowing down DBMS I think. Anyway thank you very much for your great work and support :) – Davita Jan 17 '12 at 07:09
  • The problem isn't with the # of entries, it is with the query type. We have to keep a lot of data in memory for this, and it is expensive. – Ayende Rahien Jan 19 '12 at 07:06
  • Try this with with an exact match, or just a single range query, and you'll see a major difference. – Ayende Rahien Jan 19 '12 at 07:06