
I am writing a C CGI program

For GET requests, I assume all the information is somehow available through getenv(). My question is: what does the environment look like for the most basic CGI request from the web server, with two parameters, e.g. username= and password=?

For POST requests, I am unsure. I've read that the data is delivered on standard input. What do the lines that the web server passes to a CGI program on standard input look like?

Pointing me to a verbose RFC is unhelpful.

Any book? I am specifically interested in the low-level details of the protocol. I already know how to write CGI apps with helper libs... I just need to know the semantics of those helper libs.

unixman83

1 Answer


envp is not standard (well, not ISO C or C++ standard anyway; POSIX specifies extern char **environ for the same data rather than a third argument to main).

However, envp is pretty much the same format as argv except it doesn't have a controlling argc to limit it.

Each envp[x] will be of the form "key=value" where the key is the environment variable name and the value is its value, surprisingly enough :-)

You should process the elements sequentially until you get a NULL pointer, something like:

#include <stdio.h>
int main (int argc, char *argv[], char *envp[]) {
    int i = 0;
    while (envp[i] != NULL)
        printf ("[%s]\n", envp[i++]);
    return 0;
}
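If you already know which variables you want, getenv from standard C is enough and avoids envp altogether. The following is a minimal sketch, not a definitive implementation: the function name dump_cgi_vars and the choice of four variables are assumptions made for illustration. It writes a plain-text CGI response that lists a few request variables.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: write a minimal CGI response to `out` that lists
 * a few request variables. Unset variables are shown as "(unset)". */
void dump_cgi_vars(FILE *out)
{
    static const char *const names[] = {
        "REQUEST_METHOD", "QUERY_STRING", "CONTENT_TYPE", "CONTENT_LENGTH"
    };
    size_t i;

    /* A CGI response starts with header lines, then a blank line, then the body. */
    fputs("Content-Type: text/plain\n\n", out);

    for (i = 0; i < sizeof names / sizeof names[0]; i++) {
        const char *value = getenv(names[i]);
        fprintf(out, "%s=%s\n", names[i], value != NULL ? value : "(unset)");
    }
}
```

In a real CGI program you would call dump_cgi_vars(stdout) from main and let the web server relay the output to the browser.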

The Wikipedia entry for CGI gives further details, hopefully without swamping you with too much information like a verbose RFC would.

Copying the relevant stuff to make this answer self-contained:

  • Server specific variables:
    • SERVER_SOFTWARE — name/version of HTTP server.
    • SERVER_NAME — host name of the server, may be dot-decimal IP address.
    • GATEWAY_INTERFACE — CGI/version.
  • Request specific variables:
    • SERVER_PROTOCOL — HTTP/version.
    • SERVER_PORT — TCP port (decimal).
    • REQUEST_METHOD — name of HTTP method (GET, POST, and so on).
    • PATH_INFO — path suffix, if appended to URL after program name and a slash.
    • PATH_TRANSLATED — corresponding full path as supposed by server, if PATH_INFO is present.
    • SCRIPT_NAME — relative path to the program, like /cgi-bin/script.cgi.
    • QUERY_STRING — the part of URL after ? character. May be composed of name=value pairs separated with ampersands (such as var1=val1&var2=val2…) when used to submit form data transferred via GET method as defined by HTML application/x-www-form-urlencoded.
    • REMOTE_HOST — host name of the client, unset if server did not perform such lookup.
    • REMOTE_ADDR — IP address of the client (dot-decimal).
    • AUTH_TYPE — identification type, if applicable.
    • REMOTE_USER — used for certain AUTH_TYPEs.
    • REMOTE_IDENT — see ident, only if server performed such lookup.
    • CONTENT_TYPE — MIME type of input data if PUT or POST method are used, as provided via HTTP header.
    • CONTENT_LENGTH — similarly, size of input data (decimal, in octets) if provided via HTTP header.
    • Variables passed by user agent (HTTP_ACCEPT, HTTP_ACCEPT_LANGUAGE, HTTP_USER_AGENT, HTTP_COOKIE and possibly others) contain values of corresponding HTTP headers and therefore have the same sense.
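For the GET case in the question, the two parameters arrive in QUERY_STRING as one string, for example username=joe&password=secret (the values here are made up). Below is a minimal sketch of pulling one parameter out of that string. The names query_param and cgi_get_param are hypothetical, and the sketch returns the raw value without decoding %XX escapes or + signs.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: find `name` in a query string such as
 * "username=joe&password=secret" and copy its raw (still URL-encoded)
 * value into `out`. Returns 1 if the name was found, 0 otherwise. */
int query_param(const char *query, const char *name, char *out, size_t outsz)
{
    size_t namelen = strlen(name);

    if (query == NULL || outsz == 0)
        return 0;

    while (*query != '\0') {
        /* Each pair runs up to the next '&' or the end of the string. */
        size_t pairlen = strcspn(query, "&");

        if (pairlen > namelen && strncmp(query, name, namelen) == 0
                && query[namelen] == '=') {
            size_t vallen = pairlen - namelen - 1;
            if (vallen >= outsz)
                vallen = outsz - 1;   /* truncate rather than overflow */
            memcpy(out, query + namelen + 1, vallen);
            out[vallen] = '\0';
            return 1;
        }

        query += pairlen;
        if (*query == '&')
            query++;                  /* skip the separator */
    }
    return 0;
}

/* Same lookup, reading the string from the CGI environment. */
int cgi_get_param(const char *name, char *out, size_t outsz)
{
    return query_param(getenv("QUERY_STRING"), name, out, outsz);
}
```

A CGI program would call cgi_get_param("username", buf, sizeof buf) and check the return value before using buf.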

Beyond that level of detail, you're probably going to have to look into the RFCs, I'm afraid. A search for RFC3875 on Google should locate it.

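Specifically, for POST, the client sends these values as HTTP headers, before the first blank line of the request (the one that introduces the message body). They have the form:

Content-Type: application/wonderful_app_by_pax
Content-Length: 314159

where the key is case-insensitive and the value follows the colon. The web server does not pass these header lines to your program on standard input. It turns them into the CONTENT_TYPE and CONTENT_LENGTH environment variables listed above and feeds only the message body to standard input. For a simple HTML form that body is a single string such as username=joe&password=secret, with no trailing newline guaranteed, so read exactly CONTENT_LENGTH bytes rather than waiting for end-of-file.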
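On the CGI side, a POST body is read from standard input, with its size taken from the CONTENT_LENGTH environment variable. Here is a minimal sketch under a few assumptions: the names read_body and read_post_body are hypothetical, and MAX_BODY is an arbitrary cap chosen for illustration, not something the protocol defines.

```c
#include <stdio.h>
#include <stdlib.h>

/* Arbitrary upper bound on accepted body size (an assumption, not part of CGI). */
#define MAX_BODY (1024UL * 1024UL)

/* Read exactly `length` bytes from `in` into a freshly allocated,
 * NUL-terminated buffer. Returns NULL on allocation failure or a short
 * read. The caller frees the result. */
char *read_body(FILE *in, size_t length)
{
    char *buf = malloc(length + 1);

    if (buf == NULL)
        return NULL;
    if (fread(buf, 1, length, in) != length) {
        free(buf);
        return NULL;
    }
    buf[length] = '\0';
    return buf;
}

/* Read the POST body that the web server supplies on stdin. */
char *read_post_body(void)
{
    const char *len = getenv("CONTENT_LENGTH");
    char *end;
    unsigned long n;

    if (len == NULL || *len == '\0')
        return NULL;
    n = strtoul(len, &end, 10);
    if (*end != '\0' || n > MAX_BODY)
        return NULL;          /* not a plain decimal number, or too large */
    return read_body(stdin, (size_t)n);
}
```

For a form submitted as application/x-www-form-urlencoded, the returned string has the same name=value&name=value shape as QUERY_STRING, so the same parsing applies.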

paxdiablo
  • That **is** the CGI protocol. – Ken White May 21 '11 at 02:16
  • @unixman83, I've updated the answer with more specifics. _More_ specifics than that will probably turn this answer into a copy of the "verbose" RFC so you should probably reference that if you want greater detail :-) – paxdiablo May 21 '11 at 02:25
  • `extern char **environ;` is the correct, portable (POSIX) way to access the environment variables as an array. `envp` is not. BTW you can just as easily use `getenv`, which is pure C, if you know the variable names you want. – R.. GitHub STOP HELPING ICE May 21 '11 at 02:59