7

I'm writing a download manager in Objective-C which downloads a file in multiple segments at the same time in order to improve speed. Each segment of the file is downloaded in its own thread.

At first, I thought of writing each segment to a different file and joining all the files at the end of the download. But for many reasons, that's not a good solution.

So, I'm searching for a way to write to a file at a specific position which can handle multiple threads, because in my application each segment is downloaded in its own thread. In Java, I know that FileChannel does the trick perfectly, but I have no idea how to do it in Objective-C.

  • 1
    http://stackoverflow.com/questions/7182785/concurrent-writes-to-a-file-using-multiple-threads – rooftop Feb 16 '12 at 22:36
  • @rooftop did you read the question and look at the tags? This has nothing to do with the linux kernel. – Jack Lawrence Feb 16 '12 at 22:51
  • The question I linked to describes techniques for writing to a file from multiple threads in C. While you're correct that the question was mainly about the performance of kernel-level functionality, it still had information that might help steer Sylvain in the right direction, which is why I simply posted it as a comment and not an answer. – rooftop Feb 17 '12 at 15:13

3 Answers

10

The answers given thus far have some clear disadvantages:

  • File I/O using system calls definitely has some disadvantages regarding locking.
  • Caching parts in memory leads to serious issues in a memory constrained environment. (i.e. any computer)

A thread safe, efficient, lock free approach would be to use memory mapping, which works as follows:

  • create the result file with (at least) the total length needed
  • open() the file for read/write
  • mmap() it to some place in memory. The file now "lives" in memory.
  • write received parts in memory at the right offset in the file
  • keep track if all pieces have been received (e.g. by posting some selector on the main thread for every piece received and stored)
  • munmap() the memory and close() the file

The actual writing is handled by the kernel - your program will never issue a write system call of any form. Memory mapping generally has few downsides and is used extensively for things like shared libraries.

update: a piece of code says more than 1000 words... This is the mmap version of Mecki's lock-based multi-threaded file writer. Note that writing is reduced to a simple memcpy, which cannot fail(!!), so there is no BOOL success to check. Performance is equivalent to the lock-based version. (tested by writing 100 1 MB blocks in parallel)

Regarding a comment on the "overkill" of an mmap-based approach: this uses fewer lines of code, doesn't require locking, is less likely to block on writing, and requires no checking of return values on writing. The only "overkill" is that it requires the developer to understand a concept other than good old read/write file I/O.

The possibility of reading directly into the mmapped memory region is left out, but is quite simple to implement: you can just read(fd,i_filedata+offset,length); or recv(socket,i_filedata+offset,length,flags); directly into the file.
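To make that concrete, here is a minimal plain-C sketch of the same trick (file paths and helper names are mine, purely illustrative, not part of the class below): the destination file is created, truncated to its final length, mapped, and then a plain read() deposits bytes straight into the mapping at the right offset, with no intermediate buffer.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Create/truncate a file of `length` bytes and map it for writing.
   Returns the mapped region, or NULL on failure. */
static unsigned char *map_output_file(const char *path, size_t length)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return NULL;
    if (ftruncate(fd, (off_t)length) != 0) {
        close(fd);
        return NULL;
    }
    unsigned char *p = mmap(NULL, length, PROT_READ | PROT_WRITE,
                            MAP_SHARED, fd, 0);
    close(fd); /* the mapping keeps its own reference to the file */
    return p == MAP_FAILED ? NULL : p;
}

/* Read up to `length` bytes from `srcfd` straight into the mapping at
   `offset`. With a socket, recv() would be used the same way. */
static ssize_t read_into_mapping(int srcfd, unsigned char *filedata,
                                 off_t offset, size_t length)
{
    return read(srcfd, filedata + offset, length);
}
```

After all segments have landed, munmap() hands the pages back to the kernel, which writes them out.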

@interface MultiThreadFileWriterMMap : NSObject
{
@private
    FILE * i_outputFile;
    NSUInteger i_length;
    unsigned char *i_filedata;
}

- (id)initWithOutputPath:(NSString *)aFilePath length:(NSUInteger)length;
- (void)writeBytes:(const void *)bytes ofLength:(size_t)length
      toFileOffset:(off_t)offset;
- (void)writeData:(NSData *)data toFileOffset:(off_t)offset;
- (void)close;
@end

#import "MultiThreadFileWriterMMap.h"
#import <sys/mman.h>
#import <sys/types.h>
#import <unistd.h>    // ftruncate()

@implementation MultiThreadFileWriterMMap

- (id)initWithOutputPath:(NSString *)aFilePath length:(NSUInteger)length
{
    self = [super init];
    if (self) {
        i_outputFile = fopen([aFilePath UTF8String], "w+");
        i_length = length;
        if ( i_outputFile ) {
            // grow the file to its final length before mapping it
            ftruncate(fileno(i_outputFile), (off_t)i_length);
            // PROT_READ|PROT_WRITE: write-only mappings are not portable
            i_filedata = mmap(NULL, i_length, PROT_READ|PROT_WRITE,
                              MAP_SHARED, fileno(i_outputFile), 0);
            if ( i_filedata == MAP_FAILED ) perror("mmap");
        }
        if ( !i_outputFile || i_filedata==MAP_FAILED ) {
            [self release];
            self = nil;
        }
    }
    return self;
}

- (void)dealloc
{
    [self close];
    [super dealloc];
}

- (void)writeBytes:(const void *)bytes ofLength:(size_t)length
      toFileOffset:(off_t)offset
{
    memcpy(i_filedata+offset,bytes,length);
}

- (void)writeData:(NSData *)data toFileOffset:(off_t)offset
{
    memcpy(i_filedata+offset,[data bytes],[data length]);
}

- (void)close
{
    // guard both calls: close may run twice (explicitly and again from
    // dealloc), and fclose(NULL) crashes
    if ( i_filedata ) {
        munmap(i_filedata, i_length);
        i_filedata = NULL;
    }
    if ( i_outputFile ) {
        fclose(i_outputFile);
        i_outputFile = NULL;
    }
}

@end
mvds
  • mmap data is also cached in memory, so your second disadvantage applies equally to your own solution. Further your solution is a problem if the downloaded file is bigger than the available address space. Try this approach with a 4 GB file download on a 32 bit system... or try it with 10 parallel downloads, each file 500 MB. Since you have program code, library code and other allocated memory, a mmap of 1-2 GB may already fail. The speed improvement of mmap will be almost unmeasurable, since if there is a bottleneck, it will rather be the network receiving code. – Mecki Feb 17 '12 at 02:32
  • @Mecki: nope, mmapped data is thrown out of memory just as easily by the kernel, that's the big advantage. And if you're paranoid, you can even *tell* the kernel to throw stuff out (`man madvise`). The whole disk I/O is transparent, you just write to memory. You're right about downloading >2GB files to 32 bit systems. Speed is not the main concern indeed, the big win is that you just have a lot less things to worry about. – mvds Feb 17 '12 at 02:49
  • (sorry, that should have read "mmapped data *backed by a file* is thrown out of memory just as easily") – mvds Feb 17 '12 at 02:50
  • and if you're really brave, you can even read from the socket directly into the mmapped region ;-) – mvds Feb 17 '12 at 02:53
  • I didn't say that mmap data cannot be easily thrown out of memory, I said it is cached in memory, which was the second disadvantage on your list. C file handles need very little "own" memory (a couple of KB per file handle), except for the in-kernel disk cache, and the in-kernel disk cache is even easier to "recycle" than mmap cache, since it is actually treated almost like free memory in the kernel and is application-transparent. I think for the very simple job of downloading, it's pretty much overkill. No browser, and not even a download accelerator I know the source of, uses mmap. – Mecki Feb 19 '12 at 00:38
  • @Mecki: I added a code sample so you can see for yourself how simple it actually is. It is an almost-drop-in-replacement of your code (we need to know the length). Your understanding of `mmap` misses the point: `mmap` lets you access the very same physical bits in memory as your "in-kernel disk cache". That is the whole point of shared/virtual memory. So the "mmap cache" *is* in fact the "in-kernel disk cache", there is *no difference* and the "recycling" rules are therefore identical. – mvds Feb 19 '12 at 12:08
  • If it is not really faster, what is the point of it? Try using my code to download 3 different versions of Xcode from Apple's developer site using a 32 bit binary; it works just fine. Try that with your code and watch the binary crash when the second download starts. Also, your code steals tons of cache memory from the system, since mmap memory has precedence over transparent disk cache, thus while your code is running, all other background apps suffer much worse file access times. I see no advantage in your implementation, and Instruments says it stresses my system ten times more than my code. – Mecki Feb 23 '12 at 23:18
  • @Mecki: The point is, *I* think it's simpler and more efficient. That's all. Have a look at `man madvise`. – mvds Feb 24 '12 at 00:37
  • 1
    This answer is fundamentally flawed. "This is the mmap version of Mecki's lock-based multi-thread file writer. Note that writing is reduced to **a simple memcpy, which cannot fail(!!)**" The `memcpy()` **CAN** fail. `ftruncate()` can produce a sparse file. Thus, the file can be larger than the available space in the file system. Data copied into the `mmap()`'d file via `memcpy()` will then cause a `SIGBUS` to be delivered to the process when the updated data can't be written to disk. Example: http://oss.oetiker.ch/rrdtool/forum.en.html#nabble-td6282143 – Andrew Henle Feb 02 '17 at 21:07
  • @AndrewHenle interesting scenario! You could argue that the flaw is actually in the OS, which happily hands out disk space or memory that is not available. I've seen the discussions about overcommit, but never realised this also happens at the filesystem level. – mvds Feb 03 '17 at 00:19
  • The biggest drawback of this approach is that the file size is decided up front; there is no option to grow the file in append mode. Changing from "w+" to "a+" will not solve the issue. The memcpy will crash if the file grows beyond the initially decided length. Another question is when we should munmap the content in a 24x7 running application. – Shihab Jan 15 '19 at 18:35
4

Queue the segment objects up to a writer thread as they are received. The writer thread should keep a list of out-of-order objects so that the actual disk writing is sequential. If a segment download fails, it can be pushed back onto the downloading thread pool for another try (perhaps with an internal retry count). I suggest a pool of segment objects, to prevent one or more failed downloads of a segment resulting in runaway memory use as later segments are downloaded and added to the list.

Martin James
3

Never forget, Objective-C is based on plain C, so I would just write my own class that handles file I/O using the standard C API, which allows you to place the current write position anywhere within a new file, even far beyond the current file size (missing bytes are filled with zero bytes), as well as jumping forward and backward as you wish. The easiest way to achieve thread safety is to use a lock; this is not necessarily the fastest way, but in your specific case I bet the bottleneck is certainly not thread synchronization. The class could have a header like this:

@interface MultiThreadFileWriter : NSObject
{
    @private
        FILE * i_outputFile;
        NSLock * i_fileLock;
}
- (id)initWithOutputPath:(NSString *)aFilePath;
- (BOOL)writeBytes:(const void *)bytes ofLength:(size_t)length
    toFileOffset:(off_t)offset;
- (BOOL)writeData:(NSData *)data toFileOffset:(off_t)offset;
- (void)close;
@end

And an implementation similar to this one:

#import "MultiThreadFileWriter.h"

@implementation MultiThreadFileWriter

- (id)initWithOutputPath:(NSString *)aFilePath
{
    self = [super init];
    if (self) {
        i_fileLock = [[NSLock alloc] init];
        i_outputFile = fopen([aFilePath UTF8String], "w");
        if (!i_outputFile || !i_fileLock) {
            [self release];
            self = nil;
        }
    }
    return self;
}

- (void)dealloc
{
    [self close];
    [i_fileLock release];
    [super dealloc];
}

- (BOOL)writeBytes:(const void *)bytes ofLength:(size_t)length
    toFileOffset:(off_t)offset
{
    BOOL success;

    [i_fileLock lock];
    success = i_outputFile != NULL
        && fseeko(i_outputFile, offset, SEEK_SET) == 0
        && fwrite(bytes, length, 1, i_outputFile) == 1;
    [i_fileLock unlock];
    return success;
}

- (BOOL)writeData:(NSData *)data toFileOffset:(off_t)offset
{
    return [self writeBytes:[data bytes] ofLength:[data length]
        toFileOffset:offset
    ];
}

- (void)close
{
    [i_fileLock lock];
    if (i_outputFile) {
        fclose(i_outputFile);
        i_outputFile = NULL;
    }
    [i_fileLock unlock];
}
@end

The lock could be avoided in various ways. Using Grand Central Dispatch and blocks to schedule the seek + write operations on a serial queue would work. Another way would be to use UNIX (POSIX) file descriptors instead of standard C streams (open() and int instead of fopen() and FILE *) and write each segment with pwrite(), which takes the file offset as a parameter and combines seek and write in one thread-safe call, so no locking is needed. (Note that dup()'d descriptors share a single file offset, so duplicating the descriptor per thread would not help here.) However, both implementations would be somewhat more complicated, less portable, and there would be no measurable speed improvement.
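For illustration, a sketch of the positioned-write variant in plain C (the helper name is mine): pwrite() takes the offset as an argument and never touches the descriptor's shared file position, so concurrent threads can call it on the same descriptor without a lock; the loop handles short writes.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Write `length` bytes at `offset` using pwrite(), which is safe to
   call concurrently on one descriptor: it does not move the shared
   file position. Returns 0 on success, -1 on error. */
static int write_segment(int fd, const void *bytes, size_t length,
                         off_t offset)
{
    const unsigned char *p = bytes;
    while (length > 0) {
        ssize_t n = pwrite(fd, p, length, offset);
        if (n < 0)
            return -1;          /* caller decides how to retry/report */
        p += n;
        offset += n;
        length -= (size_t)n;
    }
    return 0;
}
```

Each download thread simply calls write_segment() with its own offset; writing past the current end of file leaves a zero-filled hole, just as fseeko() + fwrite() does.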

Mecki