30

I'm looking at integrating multipart form-data parsing in a web server module so that I can relieve backend web applications (often written in dynamic languages) from parsing the multipart data themselves. The multipart grammar (RFC 2046) looks non-trivial and if I implement it by hand a lot of things can go wrong. Is there already a good, lightweight multipart/form-data parser written in C or C++? I'm looking for one with no external dependencies other than the C or C++ standard library. I don't need email attachment handling or buffered I/O classes or a portability runtime or whatever, just multipart/form-data parsing.

Things that I've considered:

  • GMime - depends on glib, so no go.
  • libapreq - too large, depends on APR, badly documented, no unit tests.

I've also looked at writing a parser with Ragel, but I can't figure out how to do it because the grammar is not static: the boundary can change arbitrarily.
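To make the difficulty concrete: the delimiter is only known once the request's Content-Type header has been read, so any parser has to take it as a runtime parameter. Below is a minimal in-memory sketch in C, not a complete parser. The names `extract_boundary`, `find_bytes`, and `count_parts` are hypothetical and invented for this illustration, and the sketch skips part-header parsing, case-insensitive parameter matching, and whitespace after the delimiter.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy the boundary parameter of a Content-Type value
 * into `out`. Returns 0 on success, -1 on failure. Simplified: the match on
 * "boundary=" is case-sensitive and does not check parameter context. */
static int extract_boundary(const char *content_type, char *out, size_t out_size)
{
    const char *p = strstr(content_type, "boundary=");
    size_t len;

    if (p == NULL)
        return -1;
    p += strlen("boundary=");
    if (*p == '"') {                       /* quoted form: boundary="a b" */
        const char *end = strchr(++p, '"');
        if (end == NULL)
            return -1;
        len = (size_t)(end - p);
    } else {                               /* token form: boundary=abc */
        len = strcspn(p, "; \t\r\n");
    }
    if (len == 0 || len > 70 || len >= out_size)  /* RFC 2046: 1 to 70 chars */
        return -1;
    memcpy(out, p, len);
    out[len] = '\0';
    return 0;
}

/* Hypothetical helper: substring search that tolerates NUL bytes in the data. */
static const char *find_bytes(const char *hay, size_t hay_len,
                              const char *needle, size_t needle_len)
{
    size_t i;

    if (needle_len == 0 || hay_len < needle_len)
        return NULL;
    for (i = 0; i + needle_len <= hay_len; i++)
        if (memcmp(hay + i, needle, needle_len) == 0)
            return hay + i;
    return NULL;
}

/* Count the parts of a fully buffered body. Returns -1 if the closing
 * delimiter "--boundary--" is never found (truncated or malformed input). */
static int count_parts(const char *body, size_t body_len, const char *boundary)
{
    char delim[2 + 70 + 1];
    size_t dlen;
    const char *p = body, *end = body + body_len, *hit;
    int parts = 0;

    if (strlen(boundary) > 70)
        return -1;
    dlen = (size_t)snprintf(delim, sizeof delim, "--%s", boundary);

    while ((hit = find_bytes(p, (size_t)(end - p), delim, dlen)) != NULL) {
        /* A delimiter must start the body or directly follow a CRLF. */
        int at_line_start = hit == body ||
            (hit - body >= 2 && hit[-2] == '\r' && hit[-1] == '\n');
        if (!at_line_start) {
            p = hit + 1;
            continue;
        }
        p = hit + dlen;
        if (end - p >= 2 && p[0] == '-' && p[1] == '-')
            return parts;                  /* closing delimiter reached */
        parts++;
    }
    return -1;
}
```

Because the delimiter string is built at run time from the header, nothing about it can be baked into a statically generated state machine, which is the obstacle described above.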

Hongli
  • "GMime - depends on glib, so no go." - care to explain why? – John Zwinck Feb 15 '10 at 16:36
  • 4
    Have you read this thread: http://stackoverflow.com/questions/218089/simple-c-mime-parser ? – Manuel Feb 15 '10 at 17:05
  • 1
    @John: Every new dependency adds installation hassle for my users, and I want to keep that to a minimum. Many servers do not have glib installed. Also, every new dependency increases resource usage. There aren't many server apps that use glib, so if I depend on glib I'll pull in all of its memory consumption just to parse some MIME data. – Hongli Feb 15 '10 at 21:42
  • 3
    Could you use GMime and link statically to avoid installation hassle? I'm not 100% sure, but I suspect you could, and that the memory footprint probably wouldn't be an issue on most servers. – John Zwinck Feb 16 '10 at 00:27
  • Just an FYI, but overhead from glib is tiny. Also, every Linux distro ships with glib by default and many include gmime by default as well. – jstedfast Feb 19 '12 at 21:15

5 Answers

10

I know this question is a couple of years old now, but I needed the same and ended up using this:

https://github.com/iafonov/multipart-parser-c

James M
6

Yes, there is one. It's no secret that it is my own, so feel free to use it. The link is: MPFDParser. It has no dependencies at all.

1

mimetic claims to support it. I think GNU cgicc may also support it.

Ken Bloom
1

cgicc supports it, but it is written quite badly and relies on holding the input buffer in memory.
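For contrast with an in-memory approach, here is a rough sketch of how delimiters can be located while the body arrives in chunks, so that only a partially matched delimiter is ever held back. All names (`scanner`, `scanner_init`, `scanner_feed`, `sink`, `sink_append`) are hypothetical and are not taken from cgicc or any other library. The sketch relies on RFC 2046 boundary characters never including CR, so after a failed partial match of CRLF `--boundary` the only possible restart point is the current byte. Part headers and the closing `--` are not interpreted; they are simply reported as data.

```c
#include <stddef.h>
#include <string.h>

#define MAX_DELIM (4 + 70)  /* CRLF + "--" + boundary of at most 70 chars */

typedef void (*data_cb)(const char *data, size_t len, void *user);

typedef struct {
    char    delim[MAX_DELIM];
    size_t  delim_len;
    size_t  matched;   /* bytes of delim matched so far */
    size_t  phantom;   /* matched bytes that were assumed, not received */
    size_t  hits;      /* complete delimiters seen */
    data_cb on_data;
    void   *user;
} scanner;

static int scanner_init(scanner *s, const char *boundary, data_cb cb, void *user)
{
    size_t blen = strlen(boundary);

    if (blen == 0 || blen > 70)
        return -1;
    memcpy(s->delim, "\r\n--", 4);
    memcpy(s->delim + 4, boundary, blen);
    s->delim_len = 4 + blen;
    /* The first delimiter has no CRLF before it, so pretend one was seen. */
    s->matched = s->phantom = 2;
    s->hits = 0;
    s->on_data = cb;
    s->user = user;
    return 0;
}

static void scanner_feed(scanner *s, const char *buf, size_t len)
{
    size_t i, start = 0;   /* buf[start..i) is plain data not yet reported */

    for (i = 0; i < len; i++) {
        char c = buf[i];

        if (s->matched > 0 && c != s->delim[s->matched]) {
            /* False alarm: the held-back prefix was ordinary data. */
            if (s->matched > s->phantom)
                s->on_data(s->delim + s->phantom,
                           s->matched - s->phantom, s->user);
            s->matched = s->phantom = 0;
            start = i;
        }
        if (c == s->delim[s->matched]) {
            if (s->matched == 0 && i > start)
                s->on_data(buf + start, i - start, s->user);
            if (++s->matched == s->delim_len) {
                s->hits++;
                s->matched = s->phantom = 0;
            }
            start = i + 1;
        }
    }
    if (len > start)
        s->on_data(buf + start, len - start, s->user);
}

/* Example callback for demonstration: append data to a fixed-size buffer. */
typedef struct { char buf[256]; size_t len; } sink;

static void sink_append(const char *data, size_t len, void *user)
{
    sink *k = (sink *)user;

    if (len > sizeof k->buf - k->len)
        len = sizeof k->buf - k->len;
    memcpy(k->buf + k->len, data, len);
    k->len += len;
}
```

A caller would invoke `scanner_feed` once per chunk read from the socket; the result does not depend on where the chunk edges fall. Bytes of a partial match still pending when input ends are not flushed in this sketch.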

Artyom
-6

This may not answer your question directly, but did you consider HipHop for PHP from Facebook?

It converts your PHP code to C++, then compiles it with g++.

It might save you the time of writing something on your own.

hasan