I am trying to read sensor measurements (published from another device) over MQTT and store a week's worth of readings in a pandas DataFrame. Once that DataFrame is full, I would like to save it to a .csv file and start filling a new, empty one. An example of such a DataFrame is as follows:
                        sensor1  ...  sensorxx
timestamp                        ...
2018-11-21 15:15:00-06    0.276  ...         0
2018-11-21 15:30:00-06    0.167  ...         0
2018-11-21 15:45:00-06    0.179  ...       0.1
2018-11-21 16:00:00-06    0.076  ...       0.2
2018-11-21 16:15:00-06    0.064  ...         0
My code works exactly as I intend it to, only to "fail" after a while (hundreds of messages): it doesn't really fail, it keeps running without any error message, but behaves as if the messages were not flowing in anymore (which they are).
All of this happens within a class; here is a simplified version of my code:
import os
import pandas as pd
import json
import paho.mqtt.client as mqtt

bufferDF = None   # the "global" keyword is only valid inside functions
counter = 1

class DataSaver():
    def __init__(self, filesfolderpath, sensorslist):
        self.filesfolderpath = filesfolderpath
        self.sensorslist = sensorslist
        self.client = None

    def SaveSensorRead(self, client, userdata, message):
        global bufferDF
        global counter
        message_dict = json.loads(message.payload)  # the JSON is in the payload
        timestamp = pd.to_datetime(message_dict["timestamp"])  # timestamp from message payload
        sensorname = message_dict["sensorname"]
        read = message_dict["read"]
        # creates an empty dataframe over a weekly daterange containing the
        # current timestamp (only on the first call, when bufferDF has never
        # been initialized); InitDateRange is a helper defined elsewhere
        if bufferDF is None:
            daterange = InitDateRange(timestamp)
            bufferDF = pd.DataFrame(index=daterange, columns=self.sensorslist)
        # checks whether bufferDF is full; if so, saves it to disk and
        # initializes a new one
        if timestamp > max(bufferDF.index):
            filename = "week" + str(counter) + ".csv"
            bufferDF.to_csv(os.path.join(self.filesfolderpath, filename))
            daterange = InitDateRange(timestamp)
            bufferDF = pd.DataFrame(index=daterange, columns=self.sensorslist)
            counter += 1
        bufferDF.loc[timestamp, sensorname] = read

    def InitComm(self, brokerip, channelname):
        self.client = mqtt.Client("client")
        self.client.on_message = self.SaveSensorRead
        self.client.connect(brokerip, 1883)
        self.client.loop_start()
        self.client.subscribe(channelname)

saver = DataSaver(filesfolderpath, sensorslist)
saver.InitComm(brokerip, channelname)
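For completeness, InitDateRange is omitted from the simplified code above; it just builds the weekly date range containing the given timestamp. Roughly along these lines (the 15-minute frequency and Monday anchoring are assumptions based on the example output, not necessarily my exact code):

def InitDateRange(timestamp):
    # week start = midnight of the Monday of the week containing `timestamp`
    weekstart = (timestamp - pd.Timedelta(days=timestamp.dayofweek)).normalize()
    # 7 days of 15-minute slots = 672 entries
    return pd.date_range(start=weekstart, periods=7 * 24 * 4, freq="15min")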
Tried several things. Saving the DataFrame at every iteration, I could see that it gets initialized with the proper structure and filled properly. I tried reducing the publisher's frequency to one message every several seconds so the subscriber could keep up, as suggested here, and increasing the quality-of-service parameter, but neither worked.
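Concretely, the QoS change was just on the subscription (paho's subscribe takes an optional qos argument):

self.client.subscribe(channelname, qos=2)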
It's as if some memory fills up and my client can't process any more messages after a while. One of the weekly files I am trying to save is about 1.5 MB, so it's not really a RAM problem. I've been looking through the Paho documentation for some "cache" parameter to tune, but can't seem to find one.
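One thing I'm considering, though I'm not sure it addresses the root cause: paho runs on_message on its network-loop thread, so keeping the callback as thin as possible and doing the pandas work in the main thread via a queue might help. A sketch of what I mean (the queue name and structure are mine, just to illustrate):

import json
import queue

msg_queue = queue.Queue()

def on_message(client, userdata, message):
    # only enqueue the raw payload; no pandas work on the network-loop thread
    msg_queue.put(message.payload)

# after client.loop_start() and client.subscribe(), in the main thread:
while True:
    payload = msg_queue.get()            # blocks until a message arrives
    message_dict = json.loads(payload)
    # ... same bufferDF bookkeeping as in SaveSensorRead above ...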
I could of course reduce the size of the DataFrame so that it gets filled by fewer messages, but that isn't a workable solution going forward.
Any help is much appreciated!