I have some data consisting of records for two tables: pairs and items. These tables are linked by a many-to-many relationship. I see two possible ways to fill the Core Data entities. Assume all the items have already been inserted and now the pairs need to be filled in. In both cases "identifier" is an additional text property/field.

Way 1 (NSFetchRequest only):

//get data which should be converted to core data entities
id pairsInfoArray = ...; 
for (id pairInfo in pairsInfoArray) {

    //get items by identifier using NSFetchRequest
    id item1 = ...;
    id item2 = ...;
    //create pair entity
    id pair = ...;
    pair.items = [NSSet setWithObjects:item1, item2, nil];
}

Way 2 (call NSFetchRequest one time only and use NSDictionary/NSMutableDictionary instead):

//get all items via NSFetchRequest
NSArray *itemsObjArray = ...;
//place all the items into a dictionary: key = item.identifier, value = item (as object)
NSMutableDictionary *itemsObjDict = ...;
//get data which should be converted to core data entities
id pairsInfoArray = ...;
for (id pairInfo in pairsInfoArray) {

    //get items by key from itemsObjDict
    id item1 = ...;
    id item2 = ...;
    //create pair entity
    id pair = ...;
    pair.items = [NSSet setWithObjects:item1, item2, nil];
}

Importing all my data (not only items and pairs) takes 5 minutes with way 1 and 45 seconds with way 2. This includes the time to perform [context save:nil].

As far as I can see, the second way is much faster than the first one. But does it have any hidden disadvantages? For example, wouldn't storing the items in an additional dictionary waste memory?

user2083364

2 Answers

You don't show how you save - that can have quite an effect.

Option 1 is memory efficient but not time efficient.

Option 2 is time efficient but not memory efficient (have you tried running it through Instruments / on an older device?).

You should think about a hybrid solution where you run (and save) batches. Set the batch size by experimenting and profiling. Instruments has a number of Core Data tools to help with this. Your goal is to get the time as low as possible with the minimum amount of memory usage, minimum number of fetches and minimum number of context saves.
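
A minimal sketch of what such a batched import might look like, assuming ARC, entities named `Item` and `Pair`, a to-many `items` relationship on `Pair`, and pair records given as dictionaries with the keys `id1` and `id2`. All of these names, and the helper `ImportPairsInBatches` itself, are hypothetical and not taken from the question. It performs one fetch and one save per batch, so only the items needed by the current batch are in memory at any time.

```objc
#import <Foundation/Foundation.h>
#import <CoreData/CoreData.h>

// Hypothetical helper. Assumed model: entities "Item" (string attribute
// "identifier") and "Pair" (to-many relationship "items"). Each element of
// pairsInfoArray is assumed to be an NSDictionary with keys "id1" and "id2".
static BOOL ImportPairsInBatches(NSManagedObjectContext *context,
                                 NSArray *pairsInfoArray,
                                 NSUInteger batchSize,
                                 NSError **outError)
{
    if (batchSize == 0) batchSize = 1;   // guard against an endless loop
    NSUInteger total = pairsInfoArray.count;
    NSError *failure = nil;              // strong reference, outlives the inner pool
    BOOL ok = YES;

    for (NSUInteger start = 0; ok && start < total; start += batchSize) {
        @autoreleasepool {
            NSRange range = NSMakeRange(start, MIN(batchSize, total - start));
            NSArray *batch = [pairsInfoArray subarrayWithRange:range];

            // Collect only the identifiers this batch refers to.
            NSMutableSet *identifiers = [NSMutableSet set];
            for (NSDictionary *pairInfo in batch) {
                [identifiers addObject:pairInfo[@"id1"]];
                [identifiers addObject:pairInfo[@"id2"]];
            }

            // One fetch per batch instead of two fetches per pair.
            NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:@"Item"];
            request.predicate = [NSPredicate predicateWithFormat:@"identifier IN %@", identifiers];
            NSArray *items = [context executeFetchRequest:request error:&failure];
            if (items == nil) {
                ok = NO;
            } else {
                // Small per-batch lookup table: identifier -> item.
                NSMutableDictionary *itemsByIdentifier =
                    [NSMutableDictionary dictionaryWithCapacity:items.count];
                for (NSManagedObject *item in items) {
                    itemsByIdentifier[[item valueForKey:@"identifier"]] = item;
                }

                for (NSDictionary *pairInfo in batch) {
                    NSManagedObject *item1 = itemsByIdentifier[pairInfo[@"id1"]];
                    NSManagedObject *item2 = itemsByIdentifier[pairInfo[@"id2"]];
                    if (item1 == nil || item2 == nil) continue;   // skip unknown identifiers
                    NSManagedObject *pair =
                        [NSEntityDescription insertNewObjectForEntityForName:@"Pair"
                                                      inManagedObjectContext:context];
                    [pair setValue:[NSSet setWithObjects:item1, item2, nil] forKey:@"items"];
                }

                // Save the batch, then drop the loaded objects from the context.
                ok = [context save:&failure];
                if (ok) [context reset];
            }
        }
    }

    if (!ok && outError != NULL) *outError = failure;
    return ok;
}
```

Note that `reset` invalidates every managed object previously obtained from the context, so this sketch only suits a context dedicated to the import. The right `batchSize` is something to find by profiling, as described above.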

Wain
  • All the items are stored in the context (except that with way 2 they are additionally stored in an array and a dictionary). The context is created before filling the database, and `save:` is called once all the data has been parsed successfully. But if the context wastes a lot of memory, then it wastes it in both cases. – user2083364 Oct 05 '13 at 10:17
  • The context isn't wasting memory; memory is used once you have fetched the objects. So when you fetch into an array (which is option 2), you have high memory usage. – Wain Oct 05 '13 at 10:31
  • Thanks for your answer. One last question: what if I manually release the array at the end of way 2, instead of using an autoreleased array, to avoid wasting memory? – user2083364 Oct 05 '13 at 11:15
  • That wouldn't change anything. The cause is the number of objects fetched into memory at any one time. – Wain Oct 05 '13 at 11:19

Of course there is a big disadvantage.

In example 2 you are holding all the data in memory. Of course this is always the fastest way, but it is not the best one for a mobile device.

Core Data is an object graph, not a persistence store. By default it does not use memory to create the objects right away. It creates an object when you actually start to use it, that is, when you work with its properties. Until then it just holds a small reference, a so-called fault. Core Data balances memory against performance and lets you control what data you load into memory and when. The NSArray you get from a fetch actually has only a few objects ready in memory; the rest become real when you access the elements.

We usually don't have cases where we need all objects all the time. Usually we have collection views that display only a few pieces of information about some objects, never all of them. Of course, you could simply fetch all objects and tell Core Data to actually load them. Then it will be just as fast as in example 2, maybe faster in some cases, as NSSet has some amazing powers when it comes to intersections and the like.
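
To illustrate the difference between faults and fully loaded objects, here is a minimal sketch. It assumes an entity named `Item`, and `FetchItems` is a hypothetical helper; neither name comes from the question. The `returnsObjectsAsFaults` property of `NSFetchRequest` controls whether the fetched objects come back as faults or with their property values already loaded.

```objc
#import <Foundation/Foundation.h>
#import <CoreData/CoreData.h>

// Hypothetical helper; the entity name "Item" is an assumption.
// With fullyLoaded == NO the result contains faults whose property values
// are loaded on first access; with YES the values are loaded up front.
static NSArray *FetchItems(NSManagedObjectContext *context,
                           BOOL fullyLoaded,
                           NSError **error)
{
    NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:@"Item"];
    request.returnsObjectsAsFaults = !fullyLoaded;
    return [context executeFetchRequest:request error:error];
}
```

Calling `isFault` on one of the returned objects shows whether its data has been loaded yet, which can help when comparing memory use between the two modes in Instruments.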

There are a few good WWDC workshops that show how to deal with this. In general, Core Data is the better solution unless we are talking about trivial data that takes only a few KB in memory, where holding it does no harm. But your timings suggest that you have a lot of data. Not all of this data is required at all times. Split it: create an entity that holds the information you need to display and put the rest in a separate entity. When you need the details, you access the separate entity and get the object after a small delay (this is called lazy loading).

Try my suggestion in Core Data and see how much memory you use at runtime on a device compared with your NSDictionary solution. I think you will see the difference.

Helge Becker
  • I use NSSet for a simple reason: all the links have an inverse. It is impossible to do this correctly if you have two to-many relationship fields that refer to the same table (the second table allows only one inverse to be set). Thanks for your advice, I'll check the memory later. Item additionally has two double params, and its other params are links to other amenities. I have checked and I see that most of the time is spent creating the links between one Pair and two Items. I have even tried to optimize this process, but the total time was never less than 3 minutes. – user2083364 Oct 05 '13 at 11:11
  • My current temporary solution is way 3, but it breaks the Core Data structure: categories are not linked with items and contain only their identifiers. One last question: what if I manually release the array at the end of way 2 to avoid wasting memory? – user2083364 Oct 05 '13 at 11:12