
Using Realm 1.0.2 on OS X, I have a Realm file that has reached ~3.5 GB. Now, writing a batch of new objects takes around 30 s to 1 min on average, which makes things pretty slow. After profiling, it clearly looks like commitWriteTransaction is taking a big chunk of the time.

Is that performance normal/expected in this case? And if so, what strategies are available to make these saves faster?

Kettch
  • Hi Kettch. Are you indeed storing ~3.5 GB worth of data in your Realm, or is the Realm file significantly larger than the amount of data you are actually storing? – AustinZ Aug 01 '16 at 17:55
  • You're most definitely not using `autoreleasepool` around your background thread Realm instances. And obviously you should; see [example](http://stackoverflow.com/a/34087874/2413303) (a minimal sketch follows these comments). – EpicPandaForce Aug 01 '16 at 18:59
  • Yep, there is indeed 3.5 GB of data in there. Apart from some less numerous objects, that amounts to ~60 million objects with a few doubles each (now that I think about it, I could probably get away with floats... I'll have to try that). As for the autorelease pool, the batches of writes are actually done in a method called each time from the callback of an NSURLConnection, so the autorelease pool would have time to drain between commits. – Kettch Aug 03 '16 at 16:44
  • Just tried more "aggressively" wrapping the transactions in autorelease pools, but the commit times are still increasing quite a lot. From a mere dozen milliseconds in the first batches, it reaches 0.5-1 s for a DB of 600 MB, so I guess there's something else going on here... – Kettch Aug 04 '16 at 18:17
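
For reference, the pattern EpicPandaForce's comment describes looks roughly like this. This is a minimal sketch; the background queue and the Sample model are assumptions for illustration, not code from the question:

import Foundation
import RealmSwift

class Sample: Object {  // hypothetical model, just for illustration
  dynamic var value = 0.0
}

// Wrapping the background-thread Realm in an autoreleasepool ensures the
// Realm instance and its accessor objects are released as soon as the
// write finishes, rather than lingering until the thread's pool drains.
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_BACKGROUND, 0)) {
  autoreleasepool {
    let realm = try! Realm()
    try! realm.write {
      realm.add(Sample())
    }
  }
}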

1 Answer


Realm uses copy-on-write semantics whenever changes are performed in write transactions.

The larger the structure that has to be forked & copied, the longer it will take to perform the operation.

Here's a small, unscientific benchmark I ran on my 2.8GHz i7 MacBook Pro:

import Foundation
import RealmSwift

class Model: Object {
  dynamic var prop1 = 0.0
  dynamic var prop2 = 0.0
}

// Add 60 million objects with two Double properties in batches of 10 million
autoreleasepool {
  for _ in 0..<6 {
    let start = NSDate()
    let realm = try! Realm()
    try! realm.write {
      for _ in 0..<10_000_000 {
        realm.add(Model())
      }
    }
    print(realm.objects(Model.self).count)
    print("took \(-start.timeIntervalSinceNow)s")
  }
}

// Add one item to that Realm
autoreleasepool {
  let start = NSDate()
  let realm = try! Realm()
  try! realm.write {
    realm.add(Model())
  }
  print(realm.objects(Model.self).count)
  print("took \(-start.timeIntervalSinceNow)s")
}

It logs the following:

10000000
took 25.6072470545769s
20000000
took 23.7239990234375s
30000000
took 24.4556020498276s
40000000
took 23.9790390133858s
50000000
took 24.5923230051994s
60000000
took 24.2157150506973s
60000001
took 0.0106720328330994s

So you can see that adding many objects to the Realm, with no relationships, is quite fast, with the total time staying linearly proportional to the number of objects being added.

So it's likely that you're doing more than just adding objects to the Realm. Maybe you're updating existing objects, causing them to be copied?

If you're reading a value from all objects as part of your write transactions, that cost will also grow proportionally to the number of objects.

Avoiding these things will shorten your write transactions.
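
As a rough sketch of what that looks like in practice (a hypothetical helper, reusing the Model class from the benchmark above): do any reads before opening the transaction, so the transaction body only adds the new objects.

import RealmSwift

// Hypothetical helper: the transaction only adds new objects; any
// lookups against existing data happen before realm.write is entered.
func saveBatch(values: [Double]) {
  autoreleasepool {
    let realm = try! Realm()
    // Compute anything you need from existing objects here, outside
    // the transaction...
    try! realm.write {
      // ...so the write itself stays as small as possible.
      for value in values {
        let model = Model()
        model.prop1 = value
        realm.add(model)
      }
    }
  }
}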

jpsim
  • I'm not doing any queries in there, and no updates to any existing objects. But the objects I'm inserting contain a bit more than just a few doubles: they also have an indexed NSDate, and a relation to another kind of object (the inverse is a read-only RLMLinkingObjects). Could those be what's causing the growing times? – Kettch Aug 04 '16 at 18:50
  • An indexed property alone will still keep things linear. The relationship is likely the expensive part to fork in your case. I think I'd need to see some code in order to help you further. – jpsim Aug 05 '16 at 17:21
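
For context, a model matching the description in these comments might look roughly like this. The names are hypothetical; only the indexed NSDate and the inverse relationship come from the comment:

import Foundation
import RealmSwift

class Series: Object {
  // Read-only inverse relation (RLMLinkingObjects in Objective-C)
  let readings = LinkingObjects(fromType: Reading.self, property: "series")
}

class Reading: Object {  // hypothetical names throughout
  dynamic var value = 0.0
  dynamic var timestamp = NSDate()  // the indexed date property
  dynamic var series: Series?       // forward relation to the other object

  override static func indexedProperties() -> [String] {
    return ["timestamp"]
  }
}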