10

I'm writing an IRC bot in C, and have run into a snag.

In my main function, I create my socket and connect, all that happy stuff. Then I have an (almost) infinite loop that reads what the server sends back and passes it to a helper function, processLine(char *line). The problem is that the following code reads until my buffer is full. I want it to read only until a newline (\n) or carriage return (\r) occurs (thus ending that line).

   while (buffer[0] && buffer[1]) {
        for (i=0;i<BUFSIZE;i++) buffer[i]='\0';
        if (recv(sock, buffer, BUFSIZE, 0) == SOCKET_ERROR)
            processError();

        processLine(buffer);
    }

What ends up happening is that many lines get jammed all together, and I can't process the lines properly when that happens.

If you're not familiar with the IRC protocol, a brief summary: when a message is sent, it often looks like this:

:YourNickName!YourIdent@YourHostName PRIVMSG #someChannel :The rest on from here is the message sent...

and a login notice, for instance, looks something like this:

:the.hostname.of.the.server ### bla some text bla

with ### being a numeric code used for processing, e.g. 372 indicates that the following text is part of the Message Of The Day.
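To make the format concrete, here is a minimal sketch of pulling the command or numeric code out of one complete line. It assumes the line has already been separated from its neighbours; irc_command() is a hypothetical helper name, not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the command (e.g. "PRIVMSG" or "372") of one IRC line into out.
 * irc_command() is a hypothetical helper, not a library function.
 * Returns 0 on success, -1 if there is no command or out is too small. */
int irc_command(const char *line, char *out, size_t out_size) {
    assert(line != NULL && out != NULL && out_size > 0);

    /* An optional prefix starts with ':' and ends at the first space. */
    if (line[0] == ':') {
        line = strchr(line, ' ');
        if (line == NULL)
            return -1;
        while (*line == ' ')
            line++;
    }

    /* The command runs up to the next space or line terminator. */
    size_t len = strcspn(line, " \r\n");
    if (len == 0 || len >= out_size)
        return -1;

    memcpy(out, line, len);
    out[len] = '\0';
    return 0;
}
```

With the login notice above, this would copy the numeric code (372 for a Message Of The Day line) into out.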

When it's all jammed together, I can't read what number is for what line because I can't find where a line begins or ends!

I'd appreciate help with this very much!

P.S.: This is being compiled and run on Linux, but I eventually want to port it to Windows, so I am making as much of it as I can cross-platform.

P.P.S.: Here's my processLine() code:

void processLine(const char *line) {
    char *buffer, *words[MAX_WORDS], *aPtr;
    char response[100];
    int count = 0, i;
    buffer = strdup(line);

    printf("BLA %s", line);

    while((aPtr = strsep(&buffer, " ")) && count < MAX_WORDS)
        words[count++] = aPtr;
    printf("DEBUG %s\n", words[1]);
    if (strcmp(words[0], "PING") == 0) {
        strcpy(response, "PONG ");
        strcat(response, words[1]);
        sendLine(NULL, response); /* This is a custom function, basically it's a send ALL function */
    } else if (strcmp(words[1], "376") == 0) { /* We got logged in, send login responses (i.e. channel joins) */
        sendLine(NULL, "JOIN #cbot");
    }
}
FurryHead

2 Answers

11

The usual way to deal with this is to recv into a persistent buffer in your application, then pull a single line out and process it. Later you can process the remaining lines in the buffer before calling recv again. Keep in mind that the last line in the buffer may only be partially received; you have to deal with this case by re-entering recv to finish the line.

Here's an example (totally untested! also looks for a \n, not \r\n):

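#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

#define BUFFER_SIZE 1024
char inbuf[BUFFER_SIZE];
size_t inbuf_used = 0;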

/* Final \n is replaced with \0 before calling process_line */
void process_line(char *lineptr);
void input_pump(int fd) {
  size_t inbuf_remain = sizeof(inbuf) - inbuf_used;
  if (inbuf_remain == 0) {
    fprintf(stderr, "Line exceeded buffer length!\n");
    abort();
  }

  ssize_t rv = recv(fd, (void*)&inbuf[inbuf_used], inbuf_remain, MSG_DONTWAIT);
  if (rv == 0) {
    fprintf(stderr, "Connection closed.\n");
    abort();
  }
  if (rv < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
    /* no data for now, call back when the socket is readable */
    return;
  }
  if (rv < 0) {
    perror("Connection error");
    abort();
  }
  inbuf_used += rv;

  /* Scan for newlines in the line buffer; we're careful here to deal with embedded \0s
   * an evil server may send, as well as only processing lines that are complete.
   */
  char *line_start = inbuf;
  char *line_end;
  while ( (line_end = (char*)memchr((void*)line_start, '\n', inbuf_used - (line_start - inbuf))))
  {
    *line_end = 0;
    process_line(line_start);
    line_start = line_end + 1;
  }
  /* Shift buffer down so the unprocessed data is at the start */
  inbuf_used -= (line_start - inbuf);
  memmove(inbuf, line_start, inbuf_used);
}
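Since the example splits on \n only and IRC servers normally end lines with \r\n, each line handed to process_line would still carry a trailing \r. A minimal sketch of one way to drop it follows; strip_cr() is a hypothetical helper name, and it assumes the line is already NUL-terminated as in the loop above.

```c
#include <assert.h>
#include <string.h>

/* Remove one trailing '\r' from a NUL-terminated line, in place.
 * strip_cr() is a hypothetical helper. Returns the new length. */
size_t strip_cr(char *line) {
    assert(line != NULL);
    size_t len = strlen(line);
    if (len > 0 && line[len - 1] == '\r')
        line[--len] = '\0';
    return len;
}
```

It could be called at the top of process_line, or on line_start just before process_line is invoked.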
SiegeX
bdonlan
  • Seems simple enough. How would I re-enter recv(), though? Would I pass a char pointer to the end of the partially read text, i.e. if recv() only read 5 out of 10 chars, pass a pointer to the 6th position instead? – FurryHead May 22 '11 at 20:45
  • @FurryHead: Added an (untested) example – bdonlan May 22 '11 at 20:52
  • 2
    Oh wow. I gave up on this project long ago, feeling like most of this was going right over my head (which, it was). Now I'm finally coming back to an extremely similar project (irc bot again, but a little different) and I read through this without even realizing this was my thread. I've been banging my head against the desk for the past 2 days, trying to implement this (almost exactly what you were writing) but, oddly enough, I ended up with just one character being removed from random positions in the line. Odd. Anyway, just wanted to thank you again! This helped a lot! – FurryHead Jul 29 '11 at 18:10
  • 1
    works great but `(inbuf - line_start);` should be `(line_start - inbuf);` – Behlül Nov 22 '13 at 16:04
7

TCP is a byte stream and has no notion of lines or message boundaries, so recv returns whatever bytes have arrived so far. As @bdonlan already said, you should implement something like:

  • Continuously recv from the socket into a buffer
  • On each recv, check if the bytes received contain an \n
  • If there is an \n, use everything up to that point from the buffer (and remove it from the buffer)

I don't have a good feeling about this (I read somewhere that you shouldn't mix low-level I/O with stdio I/O) but you might be able to use fdopen.

All you would need to do is

  • use fdopen(3) to associate your socket with a FILE *
  • use setvbuf to tell stdio that you want it line-buffered (_IOLBF) as opposed to the default block-buffered.

At this point you should have effectively moved the work from your hands to stdio. Then you could go on using fgets and the like on the FILE *.
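A minimal sketch of that approach, assuming a POSIX system and an already-connected socket descriptor. The names sock_stream_open() and sock_stream_readline() are hypothetical, and the stream is opened read-only here so that sending can stay on the original descriptor.

```c
#define _POSIX_C_SOURCE 200809L  /* for fdopen() */
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Wrap an already-connected socket descriptor in a read-only stdio stream.
 * Hypothetical helper; returns NULL on failure.
 * fdopen() is POSIX; Windows has _fdopen instead. */
FILE *sock_stream_open(int sock) {
    FILE *fp = fdopen(sock, "r");
    if (fp == NULL)
        return NULL;
    /* Ask stdio for line buffering, as suggested above. */
    if (setvbuf(fp, NULL, _IOLBF, BUFSIZ) != 0) {
        fclose(fp);
        return NULL;
    }
    return fp;
}

/* Read one line and strip the trailing \r\n (or \n).
 * Returns 1 if a line was read, 0 on end-of-file or error. */
int sock_stream_readline(FILE *fp, char *line, size_t size) {
    assert(fp != NULL && line != NULL && size > 1);
    if (fgets(line, (int)size, fp) == NULL)
        return 0;
    line[strcspn(line, "\r\n")] = '\0';
    return 1;
}
```

Note that fclose on the stream also closes the underlying socket, and a line longer than the buffer comes back from fgets in several pieces.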

cnicutar
  • Great idea, I tried it and it works beautifully. I do have two questions about it though: How would I check for errors on Windows? Usually, I'd use WSAGetLastError(), as Windows sockets use that rather than errno... And would fdopen()/setvbuf() work on Windows, too? – FurryHead May 22 '11 at 21:36
  • (Update: when I try to process errno on Linux using this, it gives me an error code of 0. I don't yet know what that corresponds to.) – FurryHead May 22 '11 at 21:40
  • @FurryHead `setvbuf` is standard; Windows has `_fdopen`. About the `errno` part, when using `stdio` check for errors with `ferror`, `feof`. Obviously this presents fewer details than a `recv` or a `read` does. The standard says no function should set `errno` to 0, but I believe it means "success". So even when a `recv` fails, the actual fgets succeeds. – cnicutar May 22 '11 at 21:48
  • Oh, great. That just threw all my error handling code right out the window! Anyway, it was helpful regardless. Thanks! – FurryHead May 22 '11 at 22:14
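For the ferror/feof check mentioned in the comments, a minimal sketch might look like the following. The enum and the function name read_line_status() are hypothetical; only fgets, ferror and feof come from standard C.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical status codes for this sketch. */
enum read_status { READ_LINE, READ_EOF, READ_ERROR };

/* Read one line with fgets and report why reading stopped. */
enum read_status read_line_status(FILE *fp, char *line, int size) {
    assert(fp != NULL && line != NULL && size > 1);
    if (fgets(line, size, fp) != NULL)
        return READ_LINE;
    /* fgets returned NULL: ask the stream whether an error occurred. */
    if (ferror(fp))
        return READ_ERROR;
    /* No error, so the stream reached end-of-file; on a socket-backed
     * stream that would mean the peer closed the connection. */
    return READ_EOF;
}
```

This only tells you that something failed, not why, which matches the point above that stdio exposes fewer details than recv or read.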