I have set up Redis locally and I want to connect to a remote server that provides a synthetic data stream in the form < ID, value >. So far I have managed to connect to the server using sockets, read the data stream and just print it. Instead, I want to store the pairs in a hash data structure (I'm going to store more information about each ID later). The problem is that I don't know how to parse the data stream in order to use hget, or how to do it continuously. At a higher level, I would like to be able to pass Name and Value from the incoming data stream as arguments to hget. Forgot to mention: I'm using the Python API. So far:

import socket
client_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client_socket.connect(('xx.xx.xx.xxx', 1337))
while 1:
        data = client_socket.recv(512)
        print data

A sample of the data stream:

'AMZN,780.6758\n'
'TSLA,197.1802\n'
'CSCO,29.7491\n'
'GOOG,761.3758\n'
'AAPL,112.4122\n'
'GRPN,4.5848\n'
'FB,121.1232\n'
'MSFT,60.3529\n'
'INTC,35.9056\n'
'NVDA,94.473\n'
'QCOM,68.7389\n'
'AMZN,780.6761\n'
'TSLA,197.1798\n'
'CSCO,29.7486\n'
'GOOG,761.3755\n'
'AAPL,112.4122\n'
'GRPN,4.5848\n'
'FB,121.1237\n'
'MSFT,60.353\n'
'INTC,35.9054\n'
'NVDA,94.473\n'
'QCOM,68.7391\n'

I'm not sure if there is a guarantee that all lines are fully formed, but let's assume that they are.

Mewtwo
  • what's wrong with `hset` ? https://redis-py.readthedocs.io/en/latest/#redis.StrictRedis.hset - https://redis.io/commands/hset – bruno desthuilliers Jan 02 '17 at 12:25
  • @brunodesthuilliers The question may be too basic but I can't figure out how to refer to the ID and value of the data stream while using `hset`. hset(hash1, user,1) seems of course easy but how to parse data that change continuously? – Mewtwo Jan 02 '17 at 12:42
  • First please edit your question to explain what your real problem is - it's obviously not about redis at all but about how to parse your incoming data. Then add examples of your incoming data _and_ what they should look like once parsed. – bruno desthuilliers Jan 02 '17 at 12:48
  • @brunodesthuilliers edited! I hope it is more clear now – Mewtwo Jan 02 '17 at 13:00
  • Could you post a __text__ representation of your data (not a picture), i.e. "print repr(data)" and copy/paste? (Or write to a text file and copy-paste, etc.) Also, does your server guarantee that you'll always get full lines, or can `data` contain incomplete data? Is there a documented marker that signals the end of a "fully formed" pair? Well, is there any kind of usable doc? – bruno desthuilliers Jan 02 '17 at 13:20
  • Is your sample a single call to `client_socket.recv()` ? Or is each line the result of a distinct call ? – bruno desthuilliers Jan 02 '17 at 13:40
  • A single call. It is a data stream. Data is generated continuously so it prints non stop. – Mewtwo Jan 02 '17 at 13:54

1 Answer

Parsing a single non-empty line into a key/value pair is as simple as:

key, value = line.strip().split(",", 1)
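
If the input cannot be fully trusted, a slightly more defensive variant might look like the sketch below. It assumes every value should be numeric and silently skips anything that does not fit; the name `parse_line` is made up for illustration.

```python
def parse_line(line):
    """Return (key, value) for a 'KEY,VALUE' line, or None if malformed."""
    line = line.strip()
    if not line:
        return None  # blank line, nothing to parse
    # partition() splits on the first comma only, like split(",", 1)
    key, sep, raw_value = line.partition(",")
    if not sep or not key:
        return None  # no comma at all, or an empty key
    try:
        value = float(raw_value)
    except ValueError:
        return None  # value is missing or not numeric
    return key, value


print(parse_line("AMZN,780.6758\n"))
print(parse_line("garbage"))
```

Whether to drop, log or raise on a malformed line depends on what the stream actually guarantees.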

Assuming your data may be incomplete (unterminated record) and that it's the linefeed that marks the end of a record, you can store incomplete records in a buffer and add them back before parsing, so your function might look something like this:

def run(client_socket):
    buffer = ""
    while True:
        data = client_socket.recv(512)
        # recv() returns an empty string once the peer has
        # closed the connection; you may also want to handle
        # socket errors here
        if not data:
            break

        # add the buffer back 
        data = buffer + data
        # split on newlines
        lines = data.splitlines()
        # check if we have an incomplete record
        # (if it doesn't end with a newline)
        if data[-1] !=  '\n':
            # incomplete record, store it back so
            # we process it next time
            buffer = lines.pop()
        else:
            # all records complete for this call, 
            # empty the buffer for next turn
            buffer = "" 

        # now handle our records:    
        for line in filter(None, lines):
            k, v = line.split(",", 1)
            do_something_with(k, v)

The implementation of do_something_with(k, v) is left as an exercise to the reader.
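
For what it's worth, here is one way the exercise could go, as a sketch rather than the only answer. It relies on the redis-py client method `hset(name, key, value)`, which sets one field of a hash. The hash name `"stocks"`, the helper names `iter_records` and `store_stream`, and the `FakeRedis` stand-in (used only so the example runs without a server) are all made up for illustration.

```python
class FakeRedis(object):
    """Stand-in exposing only hset/hget, so the sketch runs without a server."""

    def __init__(self):
        self.hashes = {}

    def hset(self, name, key, value):
        self.hashes.setdefault(name, {})[key] = value

    def hget(self, name, key):
        return self.hashes.get(name, {}).get(key)


def iter_records(chunks):
    """Yield (key, value) pairs from text chunks that may cut a record in two."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        lines = buffer.split("\n")
        # the last piece is an unterminated record, or "" if the
        # chunk ended on a newline; keep it for the next round
        buffer = lines.pop()
        for line in lines:
            line = line.strip()
            if line:
                key, value = line.split(",", 1)
                yield key, value


def store_stream(client, chunks, hash_name="stocks"):
    """Write every complete record into one hash: field = ID, value = price."""
    for key, value in iter_records(chunks):
        # a later value for the same ID overwrites the earlier one
        client.hset(hash_name, key, value)


# Chunks as recv() might return them, with one record split across two calls.
chunks = ["AMZN,780.6758\nTSLA,197", ".1802\nAMZN,780.6761\n"]
client = FakeRedis()
store_stream(client, chunks)
print(client.hget("stocks", "AMZN"))
print(client.hget("stocks", "TSLA"))
```

With a real server you would pass a `redis.Redis()` client instead of `FakeRedis()`, and feed it chunks read from the socket. On Python 3, `recv()` returns bytes, so for ASCII data like this sample each chunk would need a `.decode()` first.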

bruno desthuilliers