1

I'm having an issue parsing IRC data. I have working code for it; however, when the user includes a URL in the message, it gets messed up and I can't find a way around it. This is my current code.

string message = inputStream.ReadLine();
if (message.Contains("PRIVMSG"))
{
    string[] parms = message.Split(':');
    string userMessage = parms[2];
    return userMessage;
}

Example

:*****!*****@*****.tmi.twitch.tv PRIVMSG #***** :http://www.twitch.tv/

It can't grab the full message because of the http:// part.

Manfred Radlwimmer
JnDone

3 Answers

1

According to the Internet Relay Chat Protocol (RFC 1459), the prefix (the control part at the start of the line) always starts with :, and so does the trailing parameter, if there is one (e.g. the chat message).

The easiest way to start would be to separate those two components by simply looking for the first colon that is not at the start of the line.

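string example = @":*****!*****@*****.tmi.twitch.tv PRIVMSG #***** :http://www.twitch.tv/";
// Start searching at index 1 so the colon that opens the prefix is skipped.
int indexOfColon = example.IndexOf(':', 1);
if (indexOfColon > 0)
{
    // Everything before the colon: prefix, command and target.
    string command = example.Substring(0, indexOfColon);
    // Everything after it: the chat message, with any further colons intact.
    string message = example.Substring(indexOfColon + 1);
}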

Demo: https://dotnetfiddle.net/wBoKlC

With the same concept, you can parse any part of the line. Here, for example, you could extract the command (PRIVMSG), username (!*****) and host (@*****.tmi.twitch.tv) simply by understanding the protocol structure, without any unnecessary Split and Join or even RegEx.
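As a rough illustration of that idea, here is a minimal sketch that pulls the nick, user, host and command out of a line using only IndexOf and Substring. The class name IrcPrefixSketch and the method ParsePrefixAndCommand are made up for this example, and it assumes the line actually starts with a prefix, as Twitch chat lines like the one in the question do.

```csharp
using System;

public static class IrcPrefixSketch
{
    // Hypothetical helper (not part of any library): splits
    // ":nick!user@host COMMAND ..." into its leading parts.
    public static (string Nick, string User, string Host, string Command) ParsePrefixAndCommand(string line)
    {
        if (string.IsNullOrEmpty(line) || line[0] != ':')
            throw new FormatException("This sketch assumes the line starts with a prefix.");

        // The prefix runs from just after the leading ':' to the first space.
        int prefixEnd = line.IndexOf(' ');
        if (prefixEnd < 0)
            throw new FormatException("No command found after the prefix.");
        string prefix = line.Substring(1, prefixEnd - 1);

        // The command is the next token after the prefix.
        string rest = line.Substring(prefixEnd + 1).TrimStart(' ');
        int commandEnd = rest.IndexOf(' ');
        string command = commandEnd < 0 ? rest : rest.Substring(0, commandEnd);

        // <prefix> ::= <servername> | <nick> [ '!' <user> ] [ '@' <host> ]
        int at = prefix.IndexOf('@');
        string host = at < 0 ? "" : prefix.Substring(at + 1);
        string nickAndUser = at < 0 ? prefix : prefix.Substring(0, at);

        int bang = nickAndUser.IndexOf('!');
        string user = bang < 0 ? "" : nickAndUser.Substring(bang + 1);
        string nick = bang < 0 ? nickAndUser : nickAndUser.Substring(0, bang);

        return (nick, user, host, command);
    }
}
```

Calling IrcPrefixSketch.ParsePrefixAndCommand on the example line from the question should give the three masked parts and "PRIVMSG" as the command. A bare server name in the prefix ends up in Nick with User and Host left empty, which may or may not be what you want.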

So instead of looking for PRIVMSG, you should just parse every line and decide what to do with it later. This line could be troublesome:

if (message.Contains("PRIVMSG"))

Imagine that string showing up in a line for any other command (in a username, a channel name or a regular message). It would totally break your code.
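To make the problem concrete, here is a small sketch that compares the command token itself instead of searching the whole line. GetCommand and IsPrivMsg are hypothetical names invented for this example, and the comparison is case-sensitive on the assumption that the server sends commands in upper case.

```csharp
using System;

public static class CommandCheckSketch
{
    // Hypothetical helper: returns the command token of a raw IRC line,
    // skipping the optional ":prefix" at the start.
    public static string GetCommand(string line)
    {
        int start = 0;
        if (line.StartsWith(":", StringComparison.Ordinal))
        {
            int prefixEnd = line.IndexOf(' ');
            if (prefixEnd < 0) return "";
            start = prefixEnd + 1;
        }

        // Skip any extra spaces between the prefix and the command.
        while (start < line.Length && line[start] == ' ') start++;

        int end = line.IndexOf(' ', start);
        return end < 0 ? line.Substring(start) : line.Substring(start, end - start);
    }

    // True only when the command itself is PRIVMSG.
    public static bool IsPrivMsg(string line)
    {
        return GetCommand(line) == "PRIVMSG";
    }
}
```

A line such as ":nick!user@host NOTICE #chan :PRIVMSG is just text here" passes message.Contains("PRIVMSG"), but IsPrivMsg returns false for it because the command token is NOTICE.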

Btw: The 'pseudo' BNF for IRC is:

<message>  ::= [':' <prefix> <SPACE> ] <command> <params> <crlf>
<prefix>   ::= <servername> | <nick> [ '!' <user> ] [ '@' <host> ]
<command>  ::= <letter> { <letter> } | <number> <number> <number>
<SPACE>    ::= ' ' { ' ' }
<params>   ::= <SPACE> [ ':' <trailing> | <middle> <params> ]

<middle>   ::= <Any *non-empty* sequence of octets not including SPACE
               or NUL or CR or LF, the first of which may not be ':'>
<trailing> ::= <Any, possibly *empty*, sequence of octets not including
                 NUL or CR or LF>

<crlf>     ::= CR LF
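The params rule above is the part that matters for the URL problem: once a parameter starts with ':', everything after it is the trailing parameter, colons and spaces included. Below is a minimal sketch of that rule only. ParseParams is a hypothetical helper, and it assumes the prefix, the command and the CR LF have already been stripped off.

```csharp
using System;
using System.Collections.Generic;

public static class IrcParamsSketch
{
    // Hypothetical helper: takes the text that follows the command
    // (for example " #chan :http://www.twitch.tv/") and returns the
    // parameters in order, with the trailing parameter last.
    public static List<string> ParseParams(string paramText)
    {
        var result = new List<string>();
        int i = 0;
        while (i < paramText.Length)
        {
            // <SPACE> ::= ' ' { ' ' }
            while (i < paramText.Length && paramText[i] == ' ') i++;
            if (i >= paramText.Length) break;

            if (paramText[i] == ':')
            {
                // <trailing>: the rest of the line, possibly empty.
                result.Add(paramText.Substring(i + 1));
                break;
            }

            // <middle>: a non-empty run of characters up to the next space.
            int end = paramText.IndexOf(' ', i);
            if (end < 0) end = paramText.Length;
            result.Add(paramText.Substring(i, end - i));
            i = end;
        }
        return result;
    }
}
```

For the line in the question, the text after PRIVMSG yields two parameters: the channel and the full URL. The "http://" colon is never treated as a separator, because the scan stops at the first parameter that begins with ':'.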
Manfred Radlwimmer
    +1. I rolled my own IRC handler for a bot years ago for this very reason. It implemented the whole protocol to prevent the unexpected, such as the command appearing elsewhere as you mentioned. – Chris Fannin May 18 '16 at 20:20
0

Try this:

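// Requires: using System.Linq; (for Skip)
if (message.Contains("PRIVMSG"))
{
    string[] parms = message.Split(':');
    // Rejoin everything after the second colon so colons inside the message survive.
    string userMessage = string.Join(":", parms.Skip(2));
    return userMessage;
}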
Vyrira
0

You could use Regex as per this SO answer:

:(?<nick>[^ ]+?)\!(?<user>[^ ]+?)@(?<host>[^ ]+?) PRIVMSG (?<target>[^ ]+?) :(?<message>.*)

The group named message will contain the link.

Using groups:

// Requires: using System.Text.RegularExpressions;
var match = Regex.Match(
    @":*****!*****@*****.tmi.twitch.tv PRIVMSG #***** :http://www.twitch.tv/",
    @":(?<nick>[^ ]+?)\!(?<user>[^ ]+?)@(?<host>[^ ]+?) PRIVMSG (?<target>[^ ]+?) :(?<message>.*)");
if (match.Success)
{
    // The named group holds everything after the " :" that follows the target.
    var message = match.Groups["message"].Value;
}
Bruno Garcia