This has me stumped. I'm trying to read a 6 MB CSV file line by line on iOS. I tried plain C file pointers and polling an NSInputStream, but settled on the delegate-based approach below, which felt the cleanest. With all three approaches, a seemingly random block of reads returns success but fills the buffer with null bytes. I say "random", but it is consistent: the reads stop working at exactly the same point on every run, and the number of failed reads is suspicious (more on that below).

    - (id)initWithFileAtPath:(NSString *)path {
        if ((self = [super init])) {
            filePath = [path copy];
            queue = [[NSOperationQueue alloc] init];
            queue.maxConcurrentOperationCount = 1;
            buffer = [[NSMutableString alloc] init];
            bytes = malloc(CHUNK_SIZE * sizeof(UTF8Char));
        }
        return self;
    }

    - (void)dealloc {
        [filePath release];
        [queue release];
        [buffer release];
        free(bytes);
        [super dealloc];
    }

    - (void)stream:(NSInputStream *)stream handleEvent:(NSStreamEvent)eventCode {
        switch (eventCode) {
            case NSStreamEventOpenCompleted:
                break;
            case NSStreamEventHasBytesAvailable:
                [queue addOperationWithBlock:^{
                    [self readChunk: stream];
                    [self drainBuffer];
                }];
                break;
            case NSStreamEventEndEncountered:
                if ([buffer length] > 0) {
                    [delegate reader:self didReadLine:[NSString stringWithString:buffer]];
                    [buffer setString:@""];
                }
                [stream close];
                [stream removeFromRunLoop:[NSRunLoop currentRunLoop]
                                  forMode:NSDefaultRunLoopMode];
                [stream release];
                [delegate readerDidFinishReading:self];
                break;
            default:
                NSLog(@"StreamReader: event %d", eventCode);
                break;
        }
    }

    - (void)enumerateLines {
        NSInputStream *stream = [[NSInputStream alloc] initWithFileAtPath:filePath];
        stream.delegate = self;
        [stream scheduleInRunLoop:[NSRunLoop currentRunLoop]
                          forMode:NSDefaultRunLoopMode];
        [stream open];
    }

    - (void)readChunk: (NSInputStream*)stream {
        NSInteger readSize = [stream read:bytes maxLength:CHUNK_SIZE];
        if (readSize) {
            if (bytes[0] == '\0') {
                NSLog(@"null buffer %d", readSize);
            }
            NSString *string = [[NSString alloc] initWithBytes:bytes
                                                        length:readSize
                                                      encoding:NSUTF8StringEncoding];
            [buffer appendString:string];
            [string release];
        } else {
            NSLog(@"StreamReader: read zero bytes");
        }
    }

    - (void)drainBuffer {
        static NSCharacterSet *newlines = nil;
        if (newlines == nil) {
            newlines = [NSCharacterSet newlineCharacterSet];
        }
        NSRange newlinePos;
        while ((newlinePos = [buffer rangeOfCharacterFromSet:newlines]).location != NSNotFound) {
            NSString *line = [buffer substringToIndex:newlinePos.location];
            // remove the line from the buffer along with line separator
            [buffer deleteCharactersInRange: (NSRange){0, [line length]}];
            while ([buffer length] > 0 && [newlines characterIsMember:[buffer characterAtIndex:0]]) {
                [buffer deleteCharactersInRange:(NSRange){0, 1}];
            }
            [delegate reader:self didReadLine: line];
        }
    }

While reading the 6 MB file, I get two separate series of 96 "bad reads" when CHUNK_SIZE is 1024. If CHUNK_SIZE is 512, each series is 192 "bad reads" long. By "bad read" I mean that the NSInputStream read:maxLength: call returns success and no error event reaches the delegate callback, yet the bytes buffer contains only null values.
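To spell out why those counts look suspicious: both series cover the same number of bytes, 96 × 1024 = 192 × 512 = 98,304 (96 KiB). Below is a minimal plain-C sketch of that arithmetic, plus a check that scans a whole chunk for zeros (the NSLog in readChunk only looks at bytes[0]). The helper names `bad_read_span` and `chunk_is_all_zero` are made up for illustration and are not part of the reader class above.

```c
#include <stddef.h>

/* Total number of bytes covered by a run of consecutive reads,
   assuming every read in the run returned a full chunk. */
size_t bad_read_span(size_t read_count, size_t chunk_size) {
    return read_count * chunk_size;
}

/* Returns 1 if every byte in buf[0..len) is zero, 0 otherwise.
   An empty buffer is reported as "not all zero". */
int chunk_is_all_zero(const unsigned char *buf, size_t len) {
    if (len == 0) {
        return 0;
    }
    for (size_t i = 0; i < len; i++) {
        if (buf[i] != 0) {
            return 0;
        }
    }
    return 1;
}
```

Calling `bad_read_span(96, 1024)` and `bad_read_span(192, 512)` gives the same value, so the size of the zeroed region appears to be independent of CHUNK_SIZE. Something like `chunk_is_all_zero(bytes, (size_t)readSize)` could replace the first-byte test in readChunk, assuming readSize is positive.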
- iOS 7.0.4, iPad 2
- does NOT happen on desktop
- does NOT happen in simulator
- decreasing the file size to approx. 1 MB "fixes" the problem on the iPad
It's probably worth noting that I instantiate the reader class on the main UI thread.
So... am I doing something subtly (or not subtly) wrong here? Or have I uncovered some sort of obscure iOS bug?