
I'm working on some code that retrieves a section of a Wikipedia page as an NSString. I found a URL format online that returns the raw wiki markup of a single section. For instance, to get the first section of the Wikipedia page on 'Boston', you would request: http://en.wikipedia.org/w/index.php?title=Boston&action=raw&section=0.
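For reference, a URL of that shape can be assembled and fetched with Foundation alone. This is a minimal sketch rather than a tested client: `RawSectionURL` and `FetchRawSection` are hypothetical helper names, and the synchronous `stringWithContentsOfURL:encoding:error:` call blocks the calling thread, so it is assumed to run off the main thread.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: builds the index.php URL quoted above
// for a given page title and section index.
static NSURL *RawSectionURL(NSString *title, NSUInteger section) {
    NSURLComponents *components =
        [NSURLComponents componentsWithString:@"http://en.wikipedia.org/w/index.php"];
    NSString *sectionValue =
        [NSString stringWithFormat:@"%lu", (unsigned long)section];
    // Query items are percent-encoded by NSURLComponents where needed.
    components.queryItems = @[
        [NSURLQueryItem queryItemWithName:@"title" value:title],
        [NSURLQueryItem queryItemWithName:@"action" value:@"raw"],
        [NSURLQueryItem queryItemWithName:@"section" value:sectionValue],
    ];
    return components.URL;
}

// Hypothetical helper: downloads the raw wiki markup as an NSString.
// Returns nil and fills `error` if the request or the decoding fails.
static NSString *FetchRawSection(NSString *title, NSUInteger section, NSError **error) {
    return [NSString stringWithContentsOfURL:RawSectionURL(title, section)
                                    encoding:NSUTF8StringEncoding
                                       error:error];
}
```

The string returned by `FetchRawSection(@"Boston", 0, &error)` is still wiki markup, not display text, which is where the actual problem starts.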

What I'm trying to achieve is to convert that raw data into the text you see on the normal Wikipedia page: http://en.wikipedia.org/wiki/Boston.

At first, I thought I'd use regular expressions to strip out blocks that start with {{ and end with }}. However, this proved problematic: it also deleted text I needed.
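One likely cause (an assumption, since the original expression isn't shown) is that templates nest. In `{{Infobox|area={{convert|89|sqmi}}}}`, a lazy pattern stops at the first `}}` and leaves debris behind, while a greedy one runs to the last `}}` in the text and swallows everything in between. A small scanner that tracks nesting depth avoids both failure modes. The sketch below uses a hypothetical helper name, `StripTemplates`, and only drops templates; it does not render them.

```objc
#import <Foundation/Foundation.h>
#include <stdlib.h>

// Hypothetical helper: removes every {{...}} block, including nested ones,
// by counting how many "{{" are currently open.
static NSString *StripTemplates(NSString *markup) {
    NSUInteger length = markup.length;
    if (length == 0) {
        return @"";
    }
    unichar *kept = malloc(length * sizeof(unichar));
    if (kept == NULL) {
        return nil;
    }
    NSUInteger keptLength = 0;
    NSUInteger depth = 0;  // number of unclosed "{{" seen so far
    NSUInteger i = 0;
    while (i < length) {
        unichar c = [markup characterAtIndex:i];
        unichar next = (i + 1 < length) ? [markup characterAtIndex:i + 1] : 0;
        if (c == '{' && next == '{') {
            depth++;
            i += 2;
        } else if (c == '}' && next == '}' && depth > 0) {
            depth--;
            i += 2;
        } else {
            if (depth == 0) {
                kept[keptLength++] = c;  // keep only text outside templates
            }
            i++;
        }
    }
    NSString *result = [NSString stringWithCharacters:kept length:keptLength];
    free(kept);
    return result;
}
```

This covers only the `{{ }}` case. Triple-brace parameters, tables, links, and references would each need their own handling, which is the argument for a real parser over ad hoc stripping.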

Then I looked for a wiki-markup-to-HTML converter for Objective-C (such converters are easy to find online for other languages), but I had no luck there.

There are several similar questions on SO, but none of them seem to be clearly resolved: Getting Wikipedia Article Summary using NSScanner Problem.

So, to sum up: does anyone know how to parse a wiki page into an NSString?

Thank you in advance.

elliottbolzan

1 Answer


Use a PEG WikiText parser such as kiwi: https://github.com/AboutUs/kiwi

You can find kiwi's parsing output rules here: https://github.com/AboutUs/kiwi/blob/master/src/syntax.leg

You will need to download peg/leg to compile the leg file: http://piumarta.com/software/peg/

Regexident
  • That does seem useful, but do you know if there's an Objective-C wrapper for it? I was originally leaning more towards a couple of regular expressions, but if this proves to be easy to implement, I might go for this instead. – elliottbolzan Nov 13 '11 at 01:36
  • None that I know of. But if you don't want to write (and release) one, you could always just build it as a command line tool and run it via NSTask from your app. Piece of cake. Wiki markup into STDIN, HTML from STDOUT. Done. – Regexident Nov 13 '11 at 01:39
  • Updated answer with link to the ".leg" file compiler. IIRC you'll need to install peg/leg via `./configure && make && sudo make install` (possibly without the `./configure && `) before attempting to compile kiwi. Oh and yes, I successfully compiled kiwi. Just a week ago. ;) – Regexident Nov 13 '11 at 02:29
  • Alright thanks, got it to build :) Oh and final question: how do I pass the raw wiki markup to the 'parser' found in 'bin'? – elliottbolzan Nov 13 '11 at 02:34
  • [NSTask](http://developer.apple.com/library/mac/#documentation/Cocoa/Reference/Foundation/Classes/NSTask_Class/Reference/Reference.html) is your friend. Also check [this](http://developer.apple.com/library/mac/#documentation/Cocoa/Conceptual/OperatingSystem/Tasks/pipes.html#//apple_ref/doc/uid/20000805-BAJEEBAB) and [this](http://borkware.com/quickies/one?topic=nstask). – Regexident Nov 13 '11 at 02:36
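
The NSTask approach from the comments (markup into STDIN, output from STDOUT) can be sketched as follows. `RunFilter` is a hypothetical helper name, and the sketch assumes the tool reads all of its input before writing much output. A tool that streams large output while still reading could fill the pipe and stall this simple write-then-read sequence.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: runs the executable at `toolPath`, writes `input`
// to its standard input, and returns what it prints on standard output.
// Returns nil if the output is not valid UTF-8.
static NSString *RunFilter(NSString *toolPath, NSString *input) {
    NSTask *task = [[NSTask alloc] init];
    NSPipe *inPipe = [NSPipe pipe];
    NSPipe *outPipe = [NSPipe pipe];

    task.launchPath = toolPath;
    task.standardInput = inPipe;
    task.standardOutput = outPipe;
    [task launch];  // raises an exception if toolPath cannot be launched

    // Feed the markup to the child, then close the pipe to signal end of input.
    NSFileHandle *writer = inPipe.fileHandleForWriting;
    [writer writeData:[input dataUsingEncoding:NSUTF8StringEncoding]];
    [writer closeFile];

    // Read everything the child writes until it closes its standard output.
    NSData *output = [outPipe.fileHandleForReading readDataToEndOfFile];
    [task waitUntilExit];

    return [[NSString alloc] initWithData:output encoding:NSUTF8StringEncoding];
}
```

With kiwi built, a call would look like `RunFilter(@"/path/to/kiwi/bin/parser", rawMarkup)`, where the path is a placeholder for wherever the 'parser' binary ended up. Newer SDKs mark `launchPath` and `launch` as deprecated in favor of `executableURL` and `launchAndReturnError:`.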