1

I'm looking for some guidance from some iOS Cocoa programmers as to how one might implement a mechanism to load and parse a custom file format into the model objects that I'll be using in memory. I know there must be many ways to crack this nut, but let me share the basic idea of the current path I've explored, and where I became stuck.

But first, here is the context: say I have an existing file format that I cannot change. It's basically an exotic pipe-delimited format that is broken into various sections, each of which starts something like this:

%n|sectionName

...and the n lines that follow are all pipe-delimited in a way that is unique to that section. Some sections have a pipe-delimited header line, followed by n lines of data (also pipe-delimited), and other sections might just have n pipe-delimited lines. There are several short sections towards the beginning of the file, and then finally there will be one huge section that describes the nodes of a k-ary tree: their parent-child relationships and any data associated with each node. All told the size of these files is in the tens of megabytes, perhaps larger in the future.
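For readers following along, here is a minimal sketch in plain C of how a section header line of the form `%n|sectionName` might be recognized. The `SectionHeader` struct, the `parse_section_header` function, and the 63-character name limit are all assumptions made for illustration; the format description above does not specify them.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical limit on the section-name length; not part of the format. */
#define SECTION_NAME_MAX 64

typedef struct {
    long line_count;              /* the "n" in "%n|sectionName" */
    char name[SECTION_NAME_MAX];  /* the text after the pipe */
} SectionHeader;

/* Returns 1 and fills *out if line looks like "%n|sectionName", else 0. */
int parse_section_header(const char *line, SectionHeader *out)
{
    assert(line != NULL && out != NULL);

    long n = 0;
    char name[SECTION_NAME_MAX];

    /* "%%" matches a literal '%', "%ld" reads n, and the scanset reads up to
       63 characters that are not a pipe or a line terminator. */
    if (sscanf(line, "%%%ld|%63[^|\r\n]", &n, name) != 2 || n < 0)
        return 0;

    out->line_count = n;
    strcpy(out->name, name);  /* safe: name holds at most 63 chars plus NUL */
    return 1;
}
```

A caller would test each line with `parse_section_header` and, on success, treat the next `line_count` lines as belonging to that section.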

Finally, the last bit of context is that I'm fairly new to iOS programming.

I started by using NSFileHandle to obtain a representation of the file as an instance of NSData. This was pretty easy. While exploring the NSData interface and trying to work out how to proceed from there, I noticed the NSCoding protocol, which purports to be a facility for archiving and serializing objects into (and from) representations.

I thought this sounded like something I might need, since I tend to think of file formats as being just representations that my model objects can be marshaled into. After digging into the "Archives and Serializations Programming Guide", however, I began to second-guess myself. The API didn't seem to lend itself to what I'm trying to accomplish.

Am I going down a blind alley here? Should I be seeking to subclass NSInputStream instead, or should I take some other approach that I'm missing?

pohl

4 Answers

3

NSCoding is probably the wrong approach. It's designed for serializing and deserializing Objective-C objects, not for parsing a custom file format.

There's probably no need to subclass NSInputStream either. Your best bet here is probably C's stdio library, in particular fgets to read the lines. If you really want to use NSInputStream or NSFileHandle, you certainly can; you'll just have to split the data into lines yourself (which is really not that hard).
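As a rough sketch of the fgets approach, the following reads a stream line by line and splits each line on pipes. The helper names `split_pipes` and `count_fields`, the 4096-byte line buffer, and the 64-field cap are assumptions for illustration only. A hand-rolled splitter is used instead of `strtok` because `strtok` would silently skip empty fields.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Splits line in place on '|' and stores up to max_fields pointers in fields.
   Empty fields are preserved. If the line has more fields than max_fields,
   the last stored field keeps the unsplit remainder. Returns the count. */
size_t split_pipes(char *line, char **fields, size_t max_fields)
{
    assert(line != NULL && fields != NULL);

    line[strcspn(line, "\r\n")] = '\0';  /* drop the newline fgets keeps */

    size_t count = 0;
    char *cursor = line;
    while (count < max_fields) {
        fields[count++] = cursor;
        char *bar = strchr(cursor, '|');
        if (bar == NULL)
            break;
        *bar = '\0';        /* terminate this field */
        cursor = bar + 1;   /* next field starts after the pipe */
    }
    return count;
}

/* Reads fp line by line with fgets and returns the total number of fields.
   Assumes no line exceeds the buffer; longer lines would arrive in pieces. */
size_t count_fields(FILE *fp)
{
    char line[4096];
    char *fields[64];
    size_t total = 0;

    while (fgets(line, sizeof line, fp) != NULL)
        total += split_pipes(line, fields, 64);
    return total;
}
```

In a real parser, the loop body in `count_fields` would hand the fields to whatever code builds the model objects for the current section.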

Anomie
  • Thank you for reminding me that I have plain C at my disposal. I think I have been too focused on finding the right Objective-C class(es) to use. Coincidentally, a new [blog entry by BJ Homer](http://bjhomer.blogspot.com/2011/04/subclassing-nsinputstream.html) yesterday does a good job of explaining a situation where one might actually want to subclass NSInputStream. I agree that my situation doesn't qualify. Thank you again! – pohl Apr 15 '11 at 15:52
2

I recommend using Ragel to handle the parsing smarts. It should be much easier than using NSScanner once you have the basic scaffolding in place to set up the parser and feed bytes into it until parsing finishes.

What objects you want to use to store the parsed results in is up to you. It shouldn't be too hard to build your object graph using action functions triggered by state machine transitions.

How you want to get bytes to feed into Ragel is also up to you. You can use C standard IO streams, Foundation streams, or Foundation file handles. All Ragel cares about is getting its hands on a buffer of characters so it can run it through the state machine your description was compiled into.
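To illustrate the "feed it buffers until done" shape, here is a small hand-written state machine in C. It is not Ragel output and does not use Ragel's generated API; it is only a stand-in showing how parser state can survive across arbitrary chunk boundaries. The `FieldCounter` type and the `counter_*` functions are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    size_t fields;  /* fields completed so far */
    size_t lines;   /* non-empty lines completed so far */
    int in_line;    /* 1 once the current line has seen any content */
} FieldCounter;

void counter_init(FieldCounter *c)
{
    c->fields = 0;
    c->lines = 0;
    c->in_line = 0;
}

/* Consumes one chunk. Chunks may end anywhere, even in the middle of a field,
   because all state lives in *c rather than in local variables. */
void counter_feed(FieldCounter *c, const char *buf, size_t len)
{
    assert(c != NULL && (buf != NULL || len == 0));

    for (size_t i = 0; i < len; i++) {
        char ch = buf[i];
        if (ch == '\n') {
            if (c->in_line) {   /* close the last field and the line */
                c->fields++;
                c->lines++;
            }
            c->in_line = 0;     /* blank lines are skipped */
        } else if (ch == '|') {
            c->fields++;        /* a pipe closes the current field */
            c->in_line = 1;
        } else if (ch != '\r') {
            c->in_line = 1;
        }
    }
}

/* Call once after the final chunk to close a last line without a newline. */
void counter_finish(FieldCounter *c)
{
    if (c->in_line) {
        c->fields++;
        c->lines++;
        c->in_line = 0;
    }
}
```

The chunks passed to `counter_feed` could come from `fread`, an NSInputStream read, or an NSFileHandle; the state machine does not care which.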

NSCoder is likely to be more trouble than it's worth for your purposes. It expects to be used as a way to persist and decode an Obj-C object, with the coding/decoding driven by the object's demands ("Okay, now gimme an int, now a short, how's about an Obj-C object now…").

Jeremy W. Sherman
  • Thank you for the pointer to Ragel. I'm going to seriously consider that as an option. – pohl Apr 15 '11 at 15:52
1

As you correctly pointed out, there is more than one way to crack this nut. Unfortunately, you did not say what you want to do with the parsed data, or whether you need to write the file back out in the end.

First, for parsing, one has to consider whether it makes sense to use Objective-C at all. One option is a small helper Perl script, since Perl is well suited to parsing text files, that writes its output to an XML file or, better, a plist file. Your Objective-C code could then read that file and work with the data. You could also choose to write the data into a SQLite database, which is also a suitable file format because data connectors are available for a wide range of languages (C, Perl, Python, etc.).

Second, if you want to parse the text file in Objective-C, a class worth a look is NSScanner, which is designed for scanning values out of strings.

I do not see any benefit in using NSInputStream, as it returns only raw bytes.

Edit

Preprocessing with another language is not possible on iOS devices, AFAIK, so this option is only available on the Mac.

GorillaPatch
  • Thank you for responding. Unfortunately I'm not at liberty to change the format by pre-processing it. I probably won't have to write it back out. As for what I'm going to do with the data: I'm going to create instances of objects or structs (of my own devising) in memory. What happens thereafter is navigation, display, and visualization. I'll keep the tree structure in memory, although the data associated with each node might be something I can load on demand as nodes are visited by the user. – pohl Apr 05 '11 at 20:08
  • Well, the pre-processing is just to make your life easier by implementing a private, easy-to-use data format. You can also do everything in Objective-C, but for text parsing Perl is much more powerful, and you could bundle it with your application and nobody would ever know that you use it internally. – GorillaPatch Apr 06 '11 at 07:35
  • Is there a Perl interpreter in iOS? Would Perl scripts make it through App Store submission? – pohl Apr 06 '11 at 14:50
  • Ahh, you got me. Of course not. I didn't read your question carefully enough. So you have to parse your strings in C, C++, or Objective-C. Sorry. – GorillaPatch Apr 06 '11 at 15:25
1

There are a few open-source parsing-related kits, each targeting slightly different purposes. One or none of these might be useful to you, but mentioning them in response to your question seems like it might be useful to others, at least.

Gary W. Longsine