I'm using two great libraries, BPlusTree and Protobuf-net, to store a large number of items on disk and retrieve them. I'm allowed to modify any of the serialized items, and everything works perfectly up to this point. At the first modification the speed drops to 1/3, at the second modification it drops to 1/4, and so on, as shown in the following plot:

[Figure: item insertion speed into the B+Tree]

Each line represents a different run on the same data; the important point is that in every test the speed degrades when an item in the collection is modified. The modified item is a list of class instances; until the first degradation (i.e., the 22nd group, roughly the 22*200,000th item) this list contains only one instance. After that, items are updated one by one to hold two instances of the class, until the 42nd group (roughly the 42*200,000th item), when items start having 3 instances each, and so on.

My items are instances of class 'B', which is implemented as follows:

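using System.Collections.Generic;
using System.Collections.ObjectModel;
using ProtoBuf;

[ProtoContract]
public class B<C, M>
{
    // Parameterless constructor: called by Update below, and used by
    // protobuf-net when it creates an instance during deserialization.
    private B()
    {
        _lambda = new List<Lambda<C, M>>();
    }

    // Creates an item whose list holds a single Lambda.
    internal B(char tau, M metadata)
    {
        _lambda = new List<Lambda<C, M>>();
        _lambda.Add(new Lambda<C, M>(tau: tau, atI: metadata));
    }

    [ProtoMember(1)]
    internal int omega { private set; get; }

    [ProtoMember(2)]
    private List<Lambda<C, M>> _lambda { set; get; }

    // Read-only view over the serialized list.
    internal ReadOnlyCollection<Lambda<C, M>> lambda { get { return _lambda.AsReadOnly(); } }

    // Returns a new item holding a copy of this item's list plus one more Lambda.
    internal B<C, M> Update(char tau, M metadata)
    {
        B<C, M> newB = new B<C, M>();
        newB._lambda = new List<Lambda<C, M>>(this._lambda);
        newB._lambda.Add(new Lambda<C, M>(tau: tau, atI: metadata));
        return newB;
    }
}

[ProtoContract]
public class Lambda<C, M>
{
    // Parameterless constructor used by protobuf-net during deserialization.
    private Lambda() { }

    internal Lambda(char tau, M atI)
    {
        this.tau = tau;
        this.atI = atI;
    }

    [ProtoMember(1)]
    internal char tau { private set; get; }

    [ProtoMember(2)]
    internal M atI { private set; get; }
}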

And I define my (de)serializer as follows:

public B<C, M> ReadFrom(System.IO.Stream stream)
{
    return Serializer.DeserializeWithLengthPrefix<B<C, M>>(stream, PrefixStyle.Fixed32);
}

public void WriteTo(B<C, M> value, System.IO.Stream stream)
{
    Serializer.SerializeWithLengthPrefix<B<C, M>>(stream, value, PrefixStyle.Fixed32);
}
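
One way to check whether the slowdown simply follows the amount of data written per item is to measure the serialized length for different list sizes. The following is a minimal sketch, not the original code: `Item`, `Entry`, and `PayloadProbe` are hypothetical, non-generic stand-ins for `B<C, M>` and `Lambda<C, M>`, with `M` assumed to be `int`, and it assumes the protobuf-net package is referenced.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using ProtoBuf;

// Hypothetical simplified stand-in for B<C, M>.
[ProtoContract]
public class Item
{
    [ProtoMember(1)]
    public int Omega { get; set; }

    [ProtoMember(2)]
    public List<Entry> Entries { get; set; }
}

// Hypothetical simplified stand-in for Lambda<C, M>, with M assumed to be int.
[ProtoContract]
public class Entry
{
    [ProtoMember(1)]
    public char Tau { get; set; }

    [ProtoMember(2)]
    public int AtI { get; set; }
}

public static class PayloadProbe
{
    // Builds an item whose list holds entryCount entries.
    public static Item MakeItem(int entryCount)
    {
        var item = new Item { Omega = 1, Entries = new List<Entry>() };
        for (int i = 0; i < entryCount; i++)
        {
            item.Entries.Add(new Entry { Tau = 'a', AtI = i + 1 });
        }
        return item;
    }

    // Returns the number of bytes written for one item,
    // including the 4-byte Fixed32 length prefix.
    public static long SerializedLength(Item item)
    {
        using (var stream = new MemoryStream())
        {
            Serializer.SerializeWithLengthPrefix(stream, item, PrefixStyle.Fixed32);
            return stream.Length;
        }
    }
}
```

Comparing `PayloadProbe.SerializedLength(PayloadProbe.MakeItem(1))` with the result for 2 or 3 entries shows how much each extra entry adds to the payload, which can then be set against the observed drop in speed.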

Why do I say that the size of _lambda is the cause of the speed drop? Please check the following plot for clarification. As you can see, the moment the size of _lambda changes, I start getting speed penalties.

[Figure: plot of insertion speed as the size of _lambda changes]

Any suggestions as to what is going wrong?

PS: There are thousands of lines doing the work, but after narrowing down the code, the problem seems to come from the 'ReadFrom' and 'WriteTo' functions. Hence I'm including only these lines here.

Dr. Strangelove
  • Is the number shown on the y axis the number of *root* objects? Or the *total* number of objects? For example, a root with 1 item in a list might be 2 serializable objects; a root with 3 items in a list is 4. It also isn't clear what the lambda does here. Does it still behave the same without that? – Marc Gravell Oct 26 '14 at 20:58
  • @MarcGravell, unfortunately it's not clear to me what you mean by *root* object ... would you mind clarifying? – Dr. Strangelove Oct 26 '14 at 22:57
  • @MarcGravell `_lambda` is the cause of all these troubles! When `Update` adds an item to `_lambda` I start having speed penalties. – Dr. Strangelove Oct 26 '14 at 22:58
  • @Hamad ok, ignore what I was saying about lambda - it makes more sense on a second (etc) read. What I mean by the root object thing is: are you measuring calls to Deserialize*? Or the total number of objects including `B<...>` and `Lambda<...>` instances? Again: a root and 3 list items is twice as many objects as a root and 1 list item... – Marc Gravell Oct 26 '14 at 23:04
  • @Hamad perhaps what I really mean is: what can I do to repro what you are seeing? – Marc Gravell Oct 26 '14 at 23:07
  • @MarcGravell what I'm counting on the Y-axis is the total number of `B<...>` items, each of which could have an unpredictable number of `Lambda<...>` in `_lambda`. – Dr. Strangelove Oct 26 '14 at 23:11
  • @MarcGravell if you need I can send you a simplified code which you can repro. – Dr. Strangelove Oct 26 '14 at 23:12
  • but the total number of objects is kinda important... – Marc Gravell Oct 26 '14 at 23:13
  • @MarcGravell `_lambda`'s length for most applications varies between 1 and 100. But a specific number can't really be determined. – Dr. Strangelove Oct 26 '14 at 23:16
  • then how do you know that the graph doesn't simply represent the net data? – Marc Gravell Oct 26 '14 at 23:25
  • @MarcGravell I mailed you a short code to repro ;-) I hope that helps understand the issue easier. – Dr. Strangelove Oct 28 '14 at 10:04
  • yes, I have it; I will try to get to it ASAP, but I also have a job to do ;p – Marc Gravell Oct 28 '14 at 10:08
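
The distinction drawn in the comments between root objects and total objects can be made concrete with a little arithmetic. This is a minimal sketch, assuming each `B<...>` counts as one object plus one per `Lambda<...>` in `_lambda`; the `ThroughputMath` class and its methods are hypothetical and not part of the original code.

```csharp
using System;

public static class ThroughputMath
{
    // Total serializable objects for one root: the root itself plus its list entries.
    public static int ObjectsPerRoot(int listLength)
    {
        return 1 + listLength;
    }

    // Converts a rate measured in roots per second into objects per second.
    public static double ObjectsPerSecond(double rootsPerSecond, int listLength)
    {
        return rootsPerSecond * ObjectsPerRoot(listLength);
    }
}
```

`ObjectsPerRoot(1)` is 2 and `ObjectsPerRoot(3)` is 4, so a run that handles half as many roots per second once every list has grown from 1 to 3 entries is still processing the same number of objects per second.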

0 Answers