
I want to minimize memory usage when writing data to a CSV file.

For bigger tables it uses noticeably more memory, even if the spike is only temporary.

Could someone suggest how to reduce memory usage?

Maybe I could split the work for bigger tables, write several smaller files and then merge them, but I haven't tried that yet. Perhaps I'm missing something obvious.
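For reference, the split-and-merge idea would look roughly like this. This is only a sketch: `writeRows:toPath:` is a placeholder for the CHCSVWriter code shown below, and the chunk size of 10,000 rows is an arbitrary assumption:

```
// Sketch: write the table in chunks to temporary part files, then
// concatenate them into the final CSV. writeRows:toPath: stands in for
// the CHCSVWriter logic below; chunkSize is an arbitrary choice.
static const NSUInteger chunkSize = 10000;
NSMutableArray<NSString *> *partPaths = [NSMutableArray array];
NSArray *allRows = [self sharedUploadManager].modifiedRows;

for (NSUInteger start = 0; start < allRows.count; start += chunkSize) {
    NSRange range = NSMakeRange(start, MIN(chunkSize, allRows.count - start));
    NSString *partPath = [NSTemporaryDirectory()
        stringByAppendingPathComponent:[NSString stringWithFormat:@"part-%lu.csv", (unsigned long)start]];
    @autoreleasepool {
        [self writeRows:[allRows subarrayWithRange:range] toPath:partPath];
    }
    [partPaths addObject:partPath];
}

// Merge the parts by streaming each one into the final file.
// NSDataReadingMappedIfSafe avoids loading a whole part into memory.
[[NSFileManager defaultManager] createFileAtPath:csvPath contents:nil attributes:nil];
NSFileHandle *out = [NSFileHandle fileHandleForWritingAtPath:csvPath];
for (NSString *partPath in partPaths) {
    NSData *part = [NSData dataWithContentsOfFile:partPath
                                          options:NSDataReadingMappedIfSafe
                                            error:nil];
    [out writeData:part];
    [[NSFileManager defaultManager] removeItemAtPath:partPath error:nil];
}
[out closeFile];
```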

Here is the code currently used:

@autoreleasepool {
    NSOutputStream *csvStream = [[NSOutputStream alloc] initToMemory];
    [csvStream open];

    CHCSVWriter *writer = [[CHCSVWriter alloc] initWithOutputStream:csvStream encoding:NSUTF8StringEncoding delimiter:';'];
    NSArray *keySortDescriptors = @[[NSSortDescriptor sortDescriptorWithKey:@"self" ascending:YES]];
    if (writeHeader == YES) {
        //> write header
        NSMutableDictionary *firstRow = [[self sharedUploadManager].modifiedRows firstObject];
        if (firstRow == nil) {
            result = NO;
            return result;
        }

        NSArray *orderedKeys = [[firstRow allKeys] sortedArrayUsingDescriptors:keySortDescriptors];
        for (NSString *columnName in orderedKeys) {
            [writer writeField:columnName];
        }
        //> only end the header line if one was written
        [writer finishLine];
    }

    @autoreleasepool {
        //> write the rows
        for (NSMutableDictionary *row in [self sharedUploadManager].modifiedRows) {

            NSArray *orderedKeys = [[row allKeys] sortedArrayUsingDescriptors:keySortDescriptors];

            for (NSString *key in orderedKeys) {

                id field = [row objectForKey:key];
                if ([field isKindOfClass:[NSNull class]]) {
                    [writer writeField:nil];
                } else {
                    [writer writeField:field];
                }
            }

            //> finish the line
            [writer finishLine];
        }
    }

    [writer closeStream];

    NSData *buffer = [csvStream propertyForKey:NSStreamDataWrittenToMemoryStreamKey];
    NSString *output = [[NSString alloc] initWithData:buffer encoding:NSUTF8StringEncoding];

    if (![[NSFileManager defaultManager] fileExistsAtPath:csvPath]) {
        [[NSFileManager defaultManager] createFileAtPath:csvPath contents:nil attributes:nil];
    }

    BOOL res = [[output dataUsingEncoding:NSUTF8StringEncoding] writeToFile:csvPath atomically:NO];

    if (!res) {
        NSLog(@"Error Creating CSV File path = %@", csvPath);
    } else{
        NSLog(@"Data saved! File path = %@", csvPath);

    }
}

I also tried this logic before. It is a bit cleaner, but gives the same result:

NSOutputStream *csvStream = [[NSOutputStream alloc] initToFileAtPath:csvPath append:YES];
[csvStream open];

CHCSVWriter *writer = [[CHCSVWriter alloc] initWithOutputStream:csvStream encoding:NSUTF8StringEncoding delimiter:';'];

if (writeHeader == YES) {
    //> write header
    NSMutableDictionary *firstRow = [rows firstObject];
    if (firstRow == nil) {
        result = NO;
        return result;
    }

    NSArray *orderedKeys = [[firstRow allKeys] sortedArrayUsingDescriptors:@[[NSSortDescriptor sortDescriptorWithKey:@"self" ascending:YES]]];

    for (NSString *columnName in orderedKeys) {
        [writer writeField:columnName];
    }
    [writer finishLine];
}


//> write the rows
for (NSMutableDictionary *row in rows) {

    NSArray *orderedKeys = [[row allKeys] sortedArrayUsingDescriptors:@[[NSSortDescriptor sortDescriptorWithKey:@"self" ascending:YES]]];

    for (NSString *key in orderedKeys) {

        id field = [row objectForKey:key];
        if ([field isKindOfClass:[NSNull class]]) {
            [writer writeField:nil];
        } else {
            [writer writeField:field];
        }
    }
    //> finish the line
    [writer finishLine];
}
[writer closeStream];
silentBob
1 Answer
If you don't want to use a lot of memory when creating a large CSV file, then don't create a memory-based output stream. Create an output stream to an actual file, so the CSV data is written directly to disk instead of being accumulated in memory. The file can then grow to gigabytes while the app uses very little memory.

This has the added benefit of eliminating the extra steps of accessing the buffer data, creating a string from it (which doubles the memory usage), and then writing that string out to a file.

NSOutputStream *csvStream = [NSOutputStream outputStreamToFileAtPath:csvPath append:NO];
[csvStream open];
CHCSVWriter *writer = [[CHCSVWriter alloc] initWithOutputStream:csvStream encoding:NSUTF8StringEncoding delimiter:';'];

// write your CSV entries

[writer closeStream];

That's it. No other code needed to create the file.

In addition to these changes, you need to change where you use the autorelease pool. It should be inside the outer for loop:

//> write the rows
for (NSMutableDictionary *row in [self sharedUploadManager].modifiedRows) {
    @autoreleasepool {
        NSArray *orderedKeys = [[row allKeys] sortedArrayUsingDescriptors:keySortDescriptors];

        for (NSString *key in orderedKeys) {

            id field = [row objectForKey:key];
            if ([field isKindOfClass:[NSNull class]]) {
                [writer writeField:nil];
            } else {
                [writer writeField:field];
            }
        }

        //> finish the line
        [writer finishLine];
    }
}

This will ensure the memory of autoreleased objects is cleared after each row.

rmaddy
  • Hello, thanks for the answer, but I actually tried that as well; memory still keeps growing for bigger files. The code posted was just an attempt to see if something works differently. I can see the allocations growing during a debug session and with the Allocations tool. With Allocations it seems like writeField consumes a lot, so with nearly a million lines the app crashes. This is not a usual situation for the app now, but it could be one day and I'm trying to handle it. – silentBob Mar 20 '19 at 08:05
  • @rmaddy I just noticed that your logic doesn't use append; I'll see what happens with that. – silentBob Mar 20 '19 at 08:12
  • Nope, still the same... The files themselves are not that big; it seems there is a leak, but I can't detect it... The biggest file is around 20 MB. – silentBob Mar 20 '19 at 09:38
  • Thanks, I did that for both for loops here (the modified objects can also be big). It looks better now, but memory still rises. – silentBob Mar 21 '19 at 08:37