
I know this question has already been answered for RFC 1459, but how would one use regular expressions to match channel names in accordance with RFCs 2811-2813?

RFC 2811 states:

Channels names are strings (beginning with a '&', '#', '+' or '!' character) of length up to fifty (50) characters. Channel names are case insensitive.

Apart from the the requirement that the first character being either '&', '#', '+' or '!' (hereafter called "channel prefix"). The only restriction on a channel name is that it SHALL NOT contain any spaces (' '), a control G (^G or ASCII 7), a comma (',' which is used as a list item separator by the protocol). Also, a colon (':') is used as a delimiter for the channel mask. The exact syntax of a channel name is defined in "IRC Server Protocol" [IRC-SERVER].

And supplementing that, RFC 2812 states:

channel    =  ( "#" / "+" / ( "!" channelid ) / "&" ) chanstring
              [ ":" chanstring ]
chanstring =  %x01-07 / %x08-09 / %x0B-0C / %x0E-1F / %x21-2B
chanstring =/ %x2D-39 / %x3B-FF
                ; any octet except NUL, BELL, CR, LF, " ", "," and ":"
channelid  = 5( %x41-5A / digit )   ; 5( A-Z / 0-9 )
Community
quentinxs

2 Answers


To show you how to create a composite regex, I'll make a simplified example.

Suppose a channel name can be up to 20 characters, with lowercase letters only. A regex matching this might be:

[#&][a-z]{1,20}

That is, a # or &, followed by 1 to 20 letters. Since the channelid doesn't follow the same pattern, a regex for that might be:

![A-Z0-9]{5}

which is a ! followed by exactly 5 uppercase letters or digits. For a complete regex that matches either of these, you combine them with (...|...), like this:

([#&][a-z]{1,20}|![A-Z0-9]{5})

You can then drop in your slightly more complex regex for the exact channel name pattern you want to match.
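
As a minimal sketch of the same idea in Python (the names `SIMPLE_CHANNEL` and `is_simple_channel` are made up for illustration, and the pattern is still the simplified one above, not the RFC grammar), `re.fullmatch` requires the whole string to match one of the two alternatives:

```python
import re

# Simplified pattern from above: NOT the full RFC 2811/2812 grammar.
SIMPLE_CHANNEL = re.compile(r"([#&][a-z]{1,20}|![A-Z0-9]{5})")


def is_simple_channel(name: str) -> bool:
    """Return True if the whole string matches one of the two alternatives."""
    return SIMPLE_CHANNEL.fullmatch(name) is not None


if __name__ == "__main__":
    # Try a few names against the simplified pattern.
    for candidate in ["#python", "&local", "!AB12C", "#Python", "!ab12c", "python"]:
        print(candidate, is_simple_channel(candidate))
```

Swapping in a stricter pattern only means changing the string passed to `re.compile`; the `(...|...)` structure stays the same.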

Greg Hewgill
  • So, for example, I'd write something along the lines of `([#&+][A-Za-z0-9]{1,49}|![A-Z0-9]{5}[A-Za-z0-9]{1,44})` to match only alphanumeric channels of up to length 50 (per the RFC)? – quentinxs Feb 24 '12 at 02:49

As defined by RFC 2812 (if I'm not mistaken), except that it does not enforce the maximum length (50 bytes). It's a Python regexp, written quickly, but it should work properly:

^((![A-Z0-9]{5}|[#+&])[^\x00\x07\r\n ,:]+(:[^\x00\x07\r\n ,:]+)?)$

You can visualize the regexp logic here.
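
To cover the length limit as well, here is a rough sketch rather than a definitive implementation. It works on `bytes`, because the grammar is defined in terms of octets up to `%xFF`. The names `CHANNEL_RE`, `MAX_CHANNEL_LENGTH` and `is_valid_channel` are hypothetical, and it assumes the 50-character limit counts every byte of the name, including the prefix, the channelid and any `:` mask part. As literally written, `chanstring` in the ABNF quoted in the question is a single octet; like the regex above, this sketch assumes one or more. It also requires a name after the `!` channelid, as the ABNF does.

```python
import re

# Any octet except NUL, BELL, CR, LF, space, comma and colon (RFC 2812 "chanstring").
_CHANSTRING = rb"[^\x00\x07\r\n ,:]+"

# ( "#" / "+" / "&" / ( "!" channelid ) ) chanstring [ ":" chanstring ]
CHANNEL_RE = re.compile(
    rb"(?:[#+&]|![A-Z0-9]{5})" + _CHANSTRING + rb"(?::" + _CHANSTRING + rb")?"
)

# Assumption: the limit applies to the whole name, prefix included.
MAX_CHANNEL_LENGTH = 50


def is_valid_channel(name: bytes) -> bool:
    """Check the length limit first, then the grammar against the whole name."""
    if len(name) > MAX_CHANNEL_LENGTH:
        return False
    return CHANNEL_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in [b"#python", b"!AB12Cname", b"#chan:mask", b"#bad name", b"!AB12C"]:
        print(candidate, is_valid_channel(candidate))
```

`fullmatch` is used instead of `^...$` because in Python `$` also matches just before a trailing newline, so a name ending in `\n` could otherwise slip through with `re.match`.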

HTH..

Community