I've been poking around at Xcode's project format lately, and I'm trying to understand the best way to load it into memory. For reference, I'm using Python for this experiment.

I'm looking for an easy way to read out the contents of a project so I can locate any of the files on disk. (Once I have paths to files, I hope to be able to read from and write to them.)

Xcode's file format is the xcodeproj, which is a package containing several files. The one of interest is the project.pbxproj file, because it contains the file references and logical groups that you see in the left-hand pane in Xcode. I've found a nice reference for the format at monobjc.net.

The pbxproj file itself is an OpenStep property list. It contains a dictionary which identifies the various files and groups that make up the project. What's most interesting to me is the way the project refers to the file system.

What you need to know from that link is that every framework, file reference, and Xcode project group is represented with a dictionary that has, among other things, a property called isa that specifies the kind of object. Each object also has a unique identifier as a key. Some entries have paths. Files always have them, and groups have them sometimes. Allow me to explain...

Let's take an example file structure, for purposes of illustration:

[Image: the example folder structure on disk]

In this case, I've got a folder named Characters which contains a folder named Warlock. Inside of Xcode, it looks like this:

[Image: the corresponding group structure in Xcode]

I've got a group named Characters and a subgroup named Warlock. Essentially, my group structure mirrors my file structure.

In my pbxproj, there are actually two ways to refer to the file on disk. The dictionary representation for the group named "Characters" can have a path attribute, which would point to Characters/Warlock. The Accessories.bundle file would have a path of Accessories.bundle. This tends to be the case if I drag in the entire folder.

The second way to represent this arrangement of the filesystem is to have no path attribute in the group's dictionary, and the file reference will have a more complete path. This will happen if you drag in each file by itself.
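As a rough sketch, the two layouts might look like this once the `objects` dictionary is loaded into Python. The object IDs are invented placeholders (real ones are 24-character hex strings), and the exact way path segments are split between the groups is an assumption made for illustration. The `ids_with_path` helper is hypothetical and not part of any Xcode API.

```python
# Layout 1 (folder dragged in): the groups carry `path`, so the file
# reference only needs its own name. IDs are made-up placeholders.
folder_dragged_in = {
    "G1": {"isa": "PBXGroup", "path": "Characters",
           "sourceTree": "<group>", "children": ["G2"]},
    "G2": {"isa": "PBXGroup", "path": "Warlock",
           "sourceTree": "<group>", "children": ["F1"]},
    "F1": {"isa": "PBXFileReference", "path": "Accessories.bundle",
           "sourceTree": "<group>"},
}

# Layout 2 (files dragged in one by one): the groups are purely logical
# (only a `name`), and the file reference carries the longer path.
files_dragged_in = {
    "G1": {"isa": "PBXGroup", "name": "Characters",
           "sourceTree": "<group>", "children": ["G2"]},
    "G2": {"isa": "PBXGroup", "name": "Warlock",
           "sourceTree": "<group>", "children": ["F1"]},
    "F1": {"isa": "PBXFileReference", "name": "Accessories.bundle",
           "path": "Characters/Warlock/Accessories.bundle",
           "sourceTree": "<group>"},
}


def ids_with_path(objects):
    """Return the sorted IDs of all objects that have a `path` entry."""
    return sorted(obj_id for obj_id, obj in objects.items() if "path" in obj)
```

In the first layout every object contributes a path segment; in the second only the file reference does, which is why a traversal has to cope with groups that have no `path` at all.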

I'm trying to figure out the best data structure for traversing these files, considering the group association. I want to be able to read the files, so I need to get complete paths, even if they're relative to the Xcode project.

What's a good way to handle a dictionary in Python? What data structure does Apple use for managing Xcode projects in memory?

Charles
Moshe
  • Note that the key isn't whether the parent group has a path, but whether the `sourceTree` value for the file (or subgroup) is `<group>` or `<absolute>`. You can select this individually in the GUI for each file, and your code will have to handle that. – abarnert Oct 17 '13 at 00:29

1 Answer

What's a good way to handle a dictionary in python?

Normally, a dict.

However, in this case, if you're running on a Mac, you might want to use Cocoa's nice property-list APIs, in which case you'll get an NSDictionary instead. That's fine too; either way, the API is the same. For example:

>>> import os
>>> import AppKit
>>> path = os.path.expanduser('~/src/foo.xcodeproj/project.pbxproj')
>>> d = AppKit.NSDictionary.dictionaryWithContentsOfFile_(path)

However, this isn't just a plist; it's an NSArchiver archive, a higher-level structure, more like a Python pickle: it encodes information about Objective-C classes, etc.

What data structure does Apple use for managing Xcode projects in memory?

Most likely whatever structure that the archive decodes to. But the core parts are probably mainly an NSDictionary, with values it probably accesses via KVC key paths.

Do you need to use that yourself? Not necessarily. As you'd already determined, the format isn't that deep or that complicated, so just reading the plist as a dict, throwing away everything but the `objects` value, and building your own filesystem tree out of the result isn't going to be all that hard.
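A minimal sketch of that approach, assuming the plist has already been loaded into a plain dict with the usual `rootObject` and `objects` keys. `resolve_paths` and `project_dir` are hypothetical names, the sample data uses made-up IDs, and only the `<group>`, `SOURCE_ROOT`, and `<absolute>` values of `sourceTree` are handled; roots that depend on build settings (such as `SDKROOT`) are left unresolved as `None`.

```python
import posixpath


def resolve_paths(plist, project_dir):
    """Map each PBXFileReference ID to a path, walking down from the main group."""
    objects = plist["objects"]
    root = objects[plist["rootObject"]]
    resolved = {}

    def join(base, obj):
        path = obj.get("path")
        tree = obj.get("sourceTree", "<group>")
        if tree == "<absolute>":
            return path
        if tree == "SOURCE_ROOT":
            base = project_dir          # relative to the project, not the parent
        elif tree != "<group>":
            return None                 # build-setting roots are not handled here
        if base is None:
            return None                 # parent could not be resolved
        if not path:
            return base                 # purely logical group: inherit the parent
        return posixpath.normpath(posixpath.join(base, path))

    def walk(obj_id, base):
        obj = objects[obj_id]
        here = join(base, obj)
        if obj.get("isa") == "PBXFileReference":
            resolved[obj_id] = here
        for child in obj.get("children", []):
            walk(child, here)

    walk(root["mainGroup"], project_dir)
    return resolved


# Illustrative data only; real IDs are 24-character hex strings.
sample = {
    "rootObject": "P0",
    "objects": {
        "P0": {"isa": "PBXProject", "mainGroup": "G0"},
        "G0": {"isa": "PBXGroup", "sourceTree": "<group>", "children": ["G1", "G3"]},
        "G1": {"isa": "PBXGroup", "path": "Characters",
               "sourceTree": "<group>", "children": ["G2"]},
        "G2": {"isa": "PBXGroup", "path": "Warlock",
               "sourceTree": "<group>", "children": ["F1"]},
        "F1": {"isa": "PBXFileReference", "path": "Accessories.bundle",
               "sourceTree": "<group>"},
        "G3": {"isa": "PBXGroup", "name": "Loose",
               "sourceTree": "<group>", "children": ["F2", "F3"]},
        "F2": {"isa": "PBXFileReference", "path": "Characters/Warlock/Spells.plist",
               "sourceTree": "SOURCE_ROOT"},
        "F3": {"isa": "PBXFileReference", "path": "/usr/lib/libz.dylib",
               "sourceTree": "<absolute>"},
    },
}

paths = resolve_paths(sample, "/proj")
```

Passing the base path down the recursion means each group needs no back-pointer to its parent, and a flat dict from object ID to path is enough for reading and writing the files afterwards.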

abarnert
  • Is there a Python library for handling Cocoa APIs? I could really just do this in Objective-C, of course. – Moshe Oct 17 '13 at 00:53
  • If you're using Apple's pre-installed Python 2.7.2, pyobjc is already built in. Otherwise, `pip install pyobjc` should just work (assuming you have `setuptools`, `pip`, Xcode, and Xcode's Command Line Tools all installed)—except that if you have Xcode 5.x and almost any third-party binary Python install it will probably fail. (One of many reasons to stick with Apple's pre-installed Python, unless you need 3.x.) – abarnert Oct 17 '13 at 01:00
  • Anyway, that `import AppKit` is actually importing a `pyobjc` wrapper around `AppKit.framework`. As with ObjC, any of `Foundation`, `AppKit`, or `Cocoa` will give you all of the core types and progressively more extra stuff. – abarnert Oct 17 '13 at 01:04
  • If that's the case, I'm better off writing this in Objective-C. – Moshe Oct 17 '13 at 01:09
  • @Moshe: Well, it's up to you, but I personally find Python a much better language for most purposes than ObjC, and PyObjC is just as easy to use as native Cocoa. – abarnert Oct 17 '13 at 17:34