
Essentially I started using sockets because of how fast and efficient they are. I have a Python program that parses the lines coming from the socket, but right now it works on a .txt file. I'm trying to figure out a way to use that program on the socket data instead. The code is below.

#!/bin/python

import socket
import os, os.path

if os.path.exists("/home/log/socket"):
  os.remove("/home/log/socket")

log = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
log.bind("/home/log/socket")

while True:
  print(log.recv(4096))

This is what I'm using to receive data from the socket. My main concern is that I have to parse the data I'm getting from the socket before I upload it to my DB.

Here is my Python parse program.

import threading

def text_org():   # function; if you want to run this, make sure you call the function!
    threading.Timer(300.0, text_org).start()  # re-run this function every 300 seconds

    infile = open('/home/log/radius.log', 'r')          # file to operate on
    outfile = open('current_logs_sql_ready.txt', 'w')   # ending file, change name here to change the final file name
    error_count = 0
    time_count = 0

    for l in infile:              # loops through each line in the file
        tokens = l.split()        # splits the line into whitespace-separated tokens

        if len(tokens) > 19:      # checks to make sure the line is valid in the .log file
            outfile.write(tokens[0] + '-')    # Gets Day
            outfile.write(tokens[1] + '-')    # Gets Month
            outfile.write(tokens[3] + ',')    # Gets Year
            outfile.write(tokens[2] + ',')    # Gets Time
            outfile.write(tokens[-2] + ',')   # Gets Device
            outfile.write(tokens[9] + ',')    # Gets ID
            outfile.write(tokens[18] + ',')   # Gets AP
            outfile.write(tokens[19] + ',')   # Gets AP Group
            outfile.write(tokens[16] + '\n')  # Gets MAC
            time_count += 1

        else:                     # this happens when a line in the file is invalid
            print("Invalid Line \n")
            error_count += 1

    #print(error_count)
    #print('times ran =', time_count)

    infile.close()
    outfile.close()

text_org()    #runs the program

Essentially I'd like to use my parse program on the socket instead of a .txt file. Thanks for the help!


1 Answer


You have several options here.

The easiest is to simply take your existing text_org function and break out the "parse-a-single-line" part and put it in a separate function. Refactored, your code would look like:

def parse_line(outfile, line):
    tokens = line.split()
    outfile.write(...)
    ...

def text_org():
    ...
    with open('/home/log/radius.log', 'r') as infile:
        with open('current_logs_sql_ready.txt', 'w') as outfile:
            for l in infile:
                parse_line(outfile, l)

Then, from your socket handler:

log = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
log.bind("/home/log/socket")

with open('current_logs_sql_ready.txt', 'w') as outfile:
    while True:
        line = log.recv(4096).decode()
        parse_line(outfile, line)

If the socket sender already delivers lines terminated by newlines, you can convert the socket directly into a Python file-like object with `makefile()` as Daniel Pryden's answer explains. (But the standard file object will expect to find "lines" and will keep attempting to read as long as it doesn't find the end-of-line character.)
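
For example, a minimal sketch of that `makefile()` approach (reusing the paths from your question, and assuming each datagram sent to the socket ends in a newline) might look like this:

log = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
log.bind("/home/log/socket")

# makefile() wraps the socket in a file-like object, so you can iterate
# over it line by line exactly as you iterate over radius.log
with log.makefile('r') as sockfile:
    with open('current_logs_sql_ready.txt', 'w') as outfile:
        for line in sockfile:
            parse_line(outfile, line)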

If it does not provide newlines (as with, say, a standard syslog sender), you could create your own subclass of `socket.socket` that provides similar behavior with record-at-a-time iterator semantics:

class mysock(socket.socket):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
    def __iter__(self):
        return self
    def __next__(self):
        return self.recv(4096).decode()

log = mysock(socket.AF_UNIX, socket.SOCK_DGRAM)
log.bind("/home/log/socket")   

with open('current_logs_sql_ready.txt', 'w') as outfile:
    for line in log:
        parse_line(outfile, line)
  • The goal here was to prevent making a new text file, as soon as I got the lined parsed from the socket, I was going to immediately insert that into a db table. Is there anyway to do what I want just in a loop instead of just writing it to a new text file? Could I get the line in the socket, parse it without having to deal with an outfile? – Tanner A Nov 29 '18 at 19:06
  • Sure. Following the `tokens = line.split()` call, `tokens` is a list of "words". Do whatever you like with them in lieu of writing them to the file. In other words, you can simply change what `parse_line` does (pass it a DB connection instead of an output file, for example, and have it do an INSERT). – Gil Hamilton Nov 29 '18 at 19:22
  • Thanks I think I understand. Last question is once I call the parse_line function, in my current program I do outfile.write(line.split()[+5], what can I replace outfile with? Would it be a good idea to initialize an array and have the function spit the splits out to the array and then insert the array into the DB? So for example could I do array = [] : then do array.write(line.split........) instead out outfile.write? – Tanner A Nov 29 '18 at 19:40
  • Yes. You need to learn your DB interface as a first step (and set up the DB). That will determine exactly how you do it. With `psycopg2` (a PostgreSQL interface library) for example, I'd do something like `db_cursor.execute('INSERT INTO mytable (x, y, z ...) VALUES (%s, %s, %s ...)', tokens)`. But you'll need to connect to the database and create the cursor first. – Gil Hamilton Nov 29 '18 at 20:09
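
Putting the suggestion from the comments above together, here is a rough sketch of a `parse_line` that inserts straight into the database instead of writing to a file. psycopg2 is assumed as the driver, and the table and column names are only placeholders; adjust them to your schema.

import socket
import psycopg2

def parse_line(cursor, line):
    tokens = line.split()
    if len(tokens) > 19:
        # same field positions as the original text_org()
        cursor.execute(
            'INSERT INTO radius_log (log_day, log_month, log_year, log_time, '
            'device, radius_id, ap, ap_group, mac) '
            'VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s)',
            (tokens[0], tokens[1], tokens[3], tokens[2], tokens[-2],
             tokens[9], tokens[18], tokens[19], tokens[16]))

conn = psycopg2.connect(dbname='logs')   # hypothetical connection details
cur = conn.cursor()

# mysock is the iterator subclass from the answer above
log = mysock(socket.AF_UNIX, socket.SOCK_DGRAM)
log.bind("/home/log/socket")

for line in log:
    parse_line(cur, line)
    conn.commit()                        # commit each parsed record

Committing after every record keeps the sketch simple; if the log volume is high, batching several inserts per commit will be faster.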