3

Can someone point me in the direction why my system tells me I'm running out of memory for applications when I run the following code over a "large" (37000 rows) text file?

-(void) writeToFile: (NSString*)filePath withSeparator:(NSString*) fieldSep{
NSString* completeFile = [[[NSString alloc] initWithString:@""] autorelease];
for(int i=0;i<[self numberOfRows];i++){
    printf("im at line number... %i of %i\n",i,[self numberOfRows]);
    for(int j=0;j<[self numberOfColumns];j++){
        completeFile = [completeFile stringByAppendingString:[self objectInRow:i column:j]];

        if(j<[self numberOfColumns]-1){
            //separator for all columns except last one
            completeFile = [completeFile stringByAppendingString:fieldSep];
        }
    }
        completeFile = [completeFile stringByAppendingString:@"\n"];
}
NSError *error = nil;
[completeFile writeToFile:filePath atomically:NO
                 encoding:NSStringEncodingConversionAllowLossy error:&error];
if(error){
    NSLog(@"Error writing file at %@\n%@",
          filePath, [error localizedFailureReason]);
}

}

I've added the printf for debugging reasons, first 4000 rows seem to happen instantly, then it slowly slows down... my file contains over 37000 rows similar to these:

1893-11-6   136 194 165
Eduard
  • 309
  • 2
  • 12
  • 2
    This is an example of [Schlemiel the Painter's algorithm](http://en.wikipedia.org/wiki/Schlemiel_the_Painter's_algorithm) in action. – Russell Zahniser Nov 19 '13 at 20:50
  • Well, not Spolsky's example of it; `NSString` doesn't work like `strcat`. :) – Steven Fisher Nov 19 '13 at 20:53
  • @StevenFisher: Here a copy is made each time a string is appended, rather than the linear search for the 0 in Spolsky's `strcat` example. But the principle is the same - appending is O(n), so building the string is O(n^2). – Russell Zahniser Nov 19 '13 at 21:04
  • I should have been more clear. The problem isn't exactly the same, but it's certainly the same principle at work. :) – Steven Fisher Nov 19 '13 at 21:12
  • thanks @RussellZahniser, I have read the wikipedia article and now I feel ashamed. Oh well, better to learn now than after a lot of written code! :) – Eduard Nov 19 '13 at 21:40

3 Answers3

7

When you allocate objects using factory methods, the objects are added to the autoreleasepool. The autoreleasepool is only drained when your event loop runs, after your IBAction has returned.

The trick here would be to put the contents of your loop in its own autorelasepool.

But let's address the biggest problem first. There's a NSMutableString class that you should use here, which would drastically reduce the number of objects you need to create.

We'll switch completeFile to a NSMutableString, constructed using a factory method, then append to it:

-(void) writeToFile: (NSString*)filePath withSeparator:(NSString*) fieldSep{
    NSMutableString* completeFile = [NSMutableString string];
    for(int i=0;i<[self numberOfRows];i++){
        printf("im at line number... %i of %i\n",i,[self numberOfRows]);
        for(int j=0;j<[self numberOfColumns];j++){
            [completeFile appendString:[self objectInRow:i column:j]];

            if(j<[self numberOfColumns]-1){
                //separator for all columns except last one
                completeFile appendString:fieldSep];
            }
        }
            [completeFile appendString:@"\n"];
    }
    NSError *error = nil;
    [completeFile writeToFile:filePath atomically:NO
                     encoding:NSStringEncodingConversionAllowLossy error:&error];
    if(error){
        NSLog(@"Error writing file at %@\n%@",
              filePath, [error localizedFailureReason]);
    }
}

This leaves another problem, though. See that [self objectInRow:i column:j]? It's still (presumably) an autoreleased object. That won't get cleaned up.

We may have made your code run without crashing, depending on the size of the data, but it's a question of when it crashes not if.

To fix this, we need to introduce autoreleasepools. Let's do one per row and per column. This might seem excessive (and, indeed, it is in this case since we've eliminated autoreleasepool use in the outer loop) but autoreleasepools are pretty cheap. If you're doing a loop over a large amount of data, it's just good practice.

You can replace each of your for blocks with @autorelease blocks, for instance:

for(int i=0;i<[self numberOfRows];i++){

With:

for(int i=0;i<[self numberOfRows];i++) @autoreleasepool {

That gives us this code:

-(void) writeToFile: (NSString*)filePath withSeparator:(NSString*) fieldSep{
    NSMutableString* completeFile = [NSMutableString string];
    for(int i=0;i<[self numberOfRows];i++) @autoreleasepool {
        printf("im at line number... %i of %i\n",i,[self numberOfRows]);
        for(int j=0;j<[self numberOfColumns];j++) @autoreleasepool {
            [completeFile appendString:[self objectInRow:i column:j]];

            if(j<[self numberOfColumns]-1){
                //separator for all columns except last one
                completeFile appendString:fieldSep];
            }
        }
            [completeFile appendString:@"\n"];
    }
    NSError *error = nil;
    [completeFile writeToFile:filePath atomically:NO
                     encoding:NSStringEncodingConversionAllowLossy error:&error];
    if(error){
        NSLog(@"Error writing file at %@\n%@",
              filePath, [error localizedFailureReason]);
    }
}

A final note, though. Your error check here is not safe. What happens to an error pointer passed in like this on success isn't defined.

    [completeFile writeToFile:filePath atomically:NO
                     encoding:NSStringEncodingConversionAllowLossy error:&error];
    if(error){
        NSLog(@"Error writing file at %@\n%@",
              filePath, [error localizedFailureReason]);
    }

Instead, you want this:

    BOOL ok = [completeFile writeToFile:filePath atomically:NO
                     encoding:NSStringEncodingConversionAllowLossy error:&error];
    if(!ok){
        NSLog(@"Error writing file at %@\n%@",
              filePath, [error localizedFailureReason]);
    }

This, then, should do what you want.

Steven Fisher
  • 44,462
  • 20
  • 138
  • 192
  • 1
    ARC is not used here. See the call to `autorelease`? – Steven Fisher Nov 19 '13 at 20:53
  • Oops, sorry. Didn't notice. Well, ARC should be used. – Léo Natan Nov 19 '13 at 20:53
  • 2
    Absolutely. And I think **NSMutableString** should be used, too. :) – Steven Fisher Nov 19 '13 at 20:54
  • @LeoNatan even if ARC was in use, it does not replace autoreleased objects with retained ones. Autorelease pools can still be used in tight loops to minimise memory footprint – jrturton Nov 19 '13 at 20:54
  • 2
    @LeoNatan You have misunderstood that article. See http://stackoverflow.com/questions/7825976/under-arc-is-it-still-advisable-to-create-an-autoreleasepool-for-loops for more information. – jrturton Nov 19 '13 at 20:56
  • 1
    Generally, I think it's a mistake to assume you'll get the `objc_retainAutoreleasedReturnValue` optimization (and everything it implies). There's a lot of things that can throw it off. It's better to just think out what you want to do and then express it clearly in code. – Steven Fisher Nov 19 '13 at 20:59
  • Better to not abuse the autorelease system altogether. :) Mutable string is there for a reason. – Léo Natan Nov 19 '13 at 21:00
  • 2
    Absolutely, switch to mutable string. But you're not going to be able to completely avoid an autoreleasepool here. It's a question of *when* it will fail, not *if*: `[self objectInRow:i column:j]` is going to continue to return an autorelased object, for instance. You'll need a pool just for that, eventually. – Steven Fisher Nov 19 '13 at 21:02
  • Thank you very much for your detailed answer, I appreciate it. Thanks too for adding a correction for the error check! – Eduard Nov 19 '13 at 21:31
4

The problem is that each call of stringByAppendingString: creates a new autoreleased NSString objects. However, since the method continues in a loop, autorelease does not get a chance to free these objects.

You can solve it by adding an autorelease pool around the inner loop, like this:

for(int i=0;i<[self numberOfRows];i++){
    printf("im at line number... %i of %i\n",i,[self numberOfRows]);
    @autoreleasepool {
        for(int j=0;j<[self numberOfColumns];j++){
            completeFile = [completeFile stringByAppendingString:[self objectInRow:i column:j]];

            if(j<[self numberOfColumns]-1){
                //separator for all columns except last one
                completeFile = [completeFile stringByAppendingString:fieldSep];
            }
        }
    }
        completeFile = [completeFile stringByAppendingString:@"\n"];
}
Sergey Kalinichenko
  • 714,442
  • 84
  • 1,110
  • 1,523
1

You should use NSMutableString to append these strings.

Léo Natan
  • 56,823
  • 9
  • 150
  • 195
  • 1
    This indeed fixed it, I changed to NSMutableString and used completeFile appendString in stead of stringByAppendingString and it does all 37000 rows almost instantly. Thank you everyone for replying ! – Eduard Nov 19 '13 at 21:29
  • Yeah. NSMutableString is pretty efficient. I don't know exactly what it's algorithm is for resizing, but it's certainly smarter than reallocating on every append. :) – Steven Fisher Nov 19 '13 at 22:12