8

What's the best way to parse messages received from an IRC server with Python according to the RFC? I simply want some kind of list/whatever, for example:

:test!~test@test.com PRIVMSG #channel :Hi!

becomes this:

{ "sender" : "test!~test@test.com", "target" : "#channel", "message" : "Hi!" }

And so on?

(Edit: I want to parse IRC messages in general, not just PRIVMSGs)

Chris Dale

5 Answers

22

Look at Twisted's implementation http://twistedmatrix.com/

Unfortunately I'm out of time, maybe someone else can paste it here for you.

Edit

Well I'm back, and strangely no one has pasted it yet so here it is:

http://twistedmatrix.com/trac/browser/trunk/twisted/words/protocols/irc.py#54

class IRCBadMessage(Exception):
    """Malformed line. Twisted defines this in the same module; repeated so the snippet runs standalone."""

def parsemsg(s):
    """Breaks a message from an IRC server into its prefix, command, and arguments.
    """
    prefix = ''
    trailing = []
    if not s:
        raise IRCBadMessage("Empty line.")
    if s[0] == ':':
        prefix, s = s[1:].split(' ', 1)
    if s.find(' :') != -1:
        s, trailing = s.split(' :', 1)
        args = s.split()
        args.append(trailing)
    else:
        args = s.split()
    command = args.pop(0)
    return prefix, command, args

parsemsg(":test!~test@test.com PRIVMSG #channel :Hi!")
# ('test!~test@test.com', 'PRIVMSG', ['#channel', 'Hi!']) 

This function closely follows the message grammar (BNF) given in section 2.3.1 of the IRC RFC (RFC 1459).
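
To get the dictionary shape asked for in the question, the (prefix, command, args) tuple can be mapped onto named keys. What follows is a minimal sketch, not part of Twisted: `parse_irc_line` and `to_message_dict` are hypothetical names, and treating the first two parameters of PRIVMSG/NOTICE as "target" and "message" is an assumption about what the asker wants.

```python
def parse_irc_line(line):
    """Split one raw IRC line into (prefix, command, params)."""
    line = line.rstrip("\r\n")
    prefix = ""
    # Optional prefix: a leading ':' up to the first space.
    if line.startswith(":"):
        prefix, _, line = line[1:].partition(" ")
    # Optional trailing parameter: everything after the first ' :'.
    if " :" in line:
        middle, trailing = line.split(" :", 1)
        params = middle.split() + [trailing]
    else:
        params = line.split()
    if not params:
        raise ValueError("no command in line")
    return prefix, params[0], params[1:]


def to_message_dict(line):
    """Build a dict for any command; add target/message for PRIVMSG and NOTICE."""
    prefix, command, params = parse_irc_line(line)
    msg = {"sender": prefix, "command": command, "params": params}
    # Assumption: only these two commands get the friendlier keys.
    if command in ("PRIVMSG", "NOTICE") and len(params) >= 2:
        msg["target"] = params[0]
        msg["message"] = params[1]
    return msg


print(to_message_dict(":test!~test@test.com PRIVMSG #channel :Hi!"))
print(to_message_dict("PING :irc.example.net"))
```

Other commands keep their parameters in the generic "params" list, so the same function works for messages that are not PRIVMSGs.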

Unknown
  • @earthmeLon, which is updated by [RFC2812](https://tools.ietf.org/html/rfc2812#section-2.3). – SunSparc Apr 14 '17 at 16:37
  • Note that this works conveniently for valid input (nice!), but also works for invalid input (e.g. incorrect prefix values, such as ':@server CMD params'), and makes no effort to validate full formatting scope per https://tools.ietf.org/html/rfc2812#section-2.3.1 – Kaa Jul 21 '19 at 04:46
1

You can do it with a simple generator expression if the format is always like this.

keys = ['sender', 'type', 'target', 'message']
s = ":test!~test@test.com PRIVMSG #channel :Hi!"
dict((key, value.lstrip(':')) for key, value in zip(keys, s.split()))

Result:

{'message': 'Hi!', 'type': 'PRIVMSG', 'sender': 'test!~test@test.com', 'target': '#channel'}
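
One caveat worth illustrating: splitting on every space truncates a message that contains spaces. The sketch below (function names `naive_parse` and `split_trailing` are hypothetical) compares the plain split with a split limited to three cuts. It still assumes the four-field "sender, type, target, message" shape, so it is not a general IRC parser.

```python
keys = ["sender", "type", "target", "message"]


def strip_colon(value):
    # Remove a single leading ':' only, so colons inside the text survive.
    return value[1:] if value.startswith(":") else value


def naive_parse(line):
    # Splits on every run of whitespace; zip() silently drops the extra words.
    return dict((k, strip_colon(v)) for k, v in zip(keys, line.split()))


def split_trailing(line):
    # maxsplit=3 keeps everything after the third space together.
    return dict((k, strip_colon(v)) for k, v in zip(keys, line.split(" ", 3)))


line = ":test!~test@test.com PRIVMSG #channel :Hello there world"
print(naive_parse(line)["message"])
print(split_trailing(line)["message"])
```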
Nadia Alramli
  • Messages do not always follow this format, and can have greater or fewer parts. [RFC1459 2.3](https://tools.ietf.org/html/rfc1459#section-2.3) – earthmeLon Jul 19 '15 at 23:29
0

I know it's not Python, but for a regular-expression-based approach to this problem, take a look at POE::Filter::IRCD, which parses the IRC server protocol for Perl's POE::Component::IRC framework (see POE::Filter::IRC::Compat for the client protocol additions).
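
For comparison, a regular-expression-based parser can also be sketched in Python. This is not a port of POE::Filter::IRCD; the pattern `IRC_LINE` and the function `parse_with_regex` are hypothetical and only approximate the message grammar in the RFC (for example, the prefix is accepted as any run of non-space characters and is not validated further).

```python
import re

IRC_LINE = re.compile(
    r"^(?::(?P<prefix>\S+) +)?"        # optional ':prefix ' at the start
    r"(?P<command>[A-Za-z]+|\d{3})"    # command word or three-digit numeric
    r"(?P<middle>(?: +[^: ][^ ]*)*)"   # zero or more middle parameters
    r"(?: +:(?P<trailing>.*))? *$"     # optional trailing parameter, may contain spaces
)


def parse_with_regex(line):
    """Return (prefix, command, params) or raise ValueError on a malformed line."""
    match = IRC_LINE.match(line.rstrip("\r\n"))
    if match is None:
        raise ValueError("not a well-formed IRC line: %r" % line)
    params = match.group("middle").split()
    # An empty trailing parameter ('' after ' :') is still a parameter.
    if match.group("trailing") is not None:
        params.append(match.group("trailing"))
    return match.group("prefix") or "", match.group("command"), params


print(parse_with_regex(":test!~test@test.com PRIVMSG #channel :Hi!"))
print(parse_with_regex(":irc.example.net 001 nick :Welcome to IRC"))
```

Unlike the split-based approaches, the pattern rejects lines that have no recognizable command, which can be useful when the input is untrusted.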

Hinrik
0

Do you want to parse IRC messages in general, or just PRIVMSGs? Either way, I have an implementation for that.

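def parse_message(s):
    """Return (prefix, command, args, trailing) for one raw IRC line."""
    prefix = ''
    trailing = ''
    # Optional prefix: a leading ':' up to the first space.
    if s.startswith(':'):
        prefix, s = s[1:].split(' ', 1)
    # Optional trailing parameter: everything after the first ' :'; may contain spaces.
    if ' :' in s:
        s, trailing = s.split(' :', 1)
    args = s.split()
    # The first remaining token is the command; the rest are middle parameters.
    return prefix, args.pop(0), args, trailing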
DasIch
  • Why not use list comprehension? map(lambda a:a.lstrip(':'), s.split()) – Nadia Alramli May 30 '09 at 23:02
  • The prefix and the trailing part are optional. The trailing part may contain spaces, and a message can have several parameters. By the way, map has nothing to do with a list comprehension, which would look like `[part.lstrip(':') for part in s.split()]` – DasIch May 31 '09 at 20:56
0

If you want to stick with low-level hacking, I second the Twisted answer by Unknown, but first I think you should take a look at the very recently announced Yardbird, a nice request-parsing layer on top of Twisted. It lets you use something similar to Django URL dispatching to handle IRC messages, with the side benefit of having the Django ORM available for generating responses, etc.

Van Gale