With my current setup, I'm running a server with Django and I'm trying to automate backing up to the cloud whenever a POST/PUT action is made. To circumvent the delay (Ping to server hovers around 100ms and an action can reach upwards of 10 items posted at once), I decided to create a separate entity with a requests client and simply have this handle all backing up functions.

To do this, I have that entity listen on a Unix domain socket (UNX) using Twisted, and I send it a string over that socket whenever I hit an endpoint. The problem, however, is that if too many endpoints get called at once or in rapid succession, the data sent over the socket no longer arrives in order. Is there any way to prevent this? Code below:

UNX Server:

class BaseUNXServerProtocol(LineOnlyReceiver):

    rest_client = RestClient()

    def connectionMade(self):
        print("UNIX Client connected!")

    def lineReceived(self, line):
        print("Line Received!")

    def dataReceived(self, data):
        string = data.decode("utf-8")
        jstring = json.loads(data)
        if jstring['command'] == "upload_object":
            self.rest_client.upload(jstring['model_name'], jstring['model_id'])

Unix Client:

class BaseUnixClient(object):

    path = BRANCH_UNX_PATH
    connected = False

    def __init__(self):
        self.init_vars()
        self.connect()

    def connect(self):
        if os.path.exists(self.path):
            self.client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
            self.client.connect(self.path)
            self.connected = True
        else:
            print("Could not connect to path: {}".format(self.path))

    def call_to_upload(self, model_class, model_id, upload_type):
        self.send_string(_messages.branch_upload_message(model_class, model_id, upload_type))

Endpoint perform_create: (Essentially a hook that gets called whenever a new object is POSTed)

def perform_create(self, serializer):
    instance = serializer.save()

    # Call for upload/notify
    UnixClient().call_to_upload(model_class=type(instance).__name__, model_id=instance.id, upload_type="create")
Lorenzo

1 Answer

SOCK_STREAM connections are always ordered. Data on one connection comes out in the same order it went in (or the connection breaks).

The only obvious problem with the code you shared is that you shouldn't override dataReceived on a LineOnlyReceiver subclass. All your logic belongs in lineReceived.

That wouldn't cause out-of-order data problems but it could lead to framing issues (like partial JSON messages being processed, or multiple messages being combined) which would probably cause json.loads to raise an exception.
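To make the framing issue concrete, here is a minimal standard-library sketch of the buffering that a line-based protocol has to do before handing complete messages to something like lineReceived. It is not Twisted's implementation. The LineFramer class, the decode_messages helper, and the b"\n" delimiter are assumptions chosen for illustration (LineOnlyReceiver itself defaults to b"\r\n"), and the JSON keys are only modelled on the ones the server code reads.

```python
import json


class LineFramer:
    """Hypothetical helper: buffer stream chunks, return complete lines."""

    def __init__(self, delimiter=b"\n"):
        self.delimiter = delimiter
        self._buffer = b""

    def feed(self, chunk):
        """Add a chunk and return every complete line now available."""
        self._buffer += chunk
        # Everything before the last delimiter is complete; the tail is kept
        # in the buffer until more bytes arrive.
        *lines, self._buffer = self._buffer.split(self.delimiter)
        return lines


def decode_messages(chunks):
    """Parse JSON only from complete lines, never from raw chunks."""
    framer = LineFramer()
    messages = []
    for chunk in chunks:
        for line in framer.feed(chunk):
            messages.append(json.loads(line.decode("utf-8")))
    return messages


# Two messages arriving as three chunks that ignore message boundaries.
chunks = [
    b'{"command": "upload_object", "model_id": 1}\n{"comm',
    b'and": "upload_object", ',
    b'"model_id": 2}\n',
]
messages = decode_messages(chunks)
```

Calling json.loads on each raw chunk, as an overridden dataReceived effectively does, would fail on the first chunk here because it contains one full message plus the start of the next.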

So, to answer your question: data is delivered in order. If you are seeing out-of-order operation, it's because the data is being sent in a different order than you expect or because there is a divergence between the order of data delivery and the order of observable side-effects. I don't see any way to provide a further diagnosis without seeing more of your code.

Seeing your sending code, the problem is that you're using a new connection for every perform_create operation. There is no guarantee about delivery order across different connections. Even if your program does:

  • establish connection a
  • send data on connection a
  • establish connection b
  • send data on connection b
  • close connection a
  • close connection b

The receiver may decide to process data on connection b before data on connection a. This is because the underlying event notification system (select, epoll_wait, etc) doesn't (ever, as far as I know) preserve information about the ordering of the events it is reporting on. Instead, results come out in a pseudo-random order or a boring deterministic order (such as ascending by file descriptor number).
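A small sketch of that effect, using socket.socketpair() from the standard library to stand in for two separate client connections (the variable names are made up for the example). Data is sent on connection a first and on connection b second, yet select only reports which sockets are readable, not which one became readable first.

```python
import select
import socket

# Two independent connections, each modelled by a connected socket pair.
a_send, a_recv = socket.socketpair()
b_send, b_recv = socket.socketpair()

# Send on connection a first, then on connection b.
a_send.sendall(b"first\n")
b_send.sendall(b"second\n")

# Ask which receiving sockets are readable. The order of `readable`
# carries no information about which data arrived first.
readable, _, _ = select.select([b_recv, a_recv], [], [], 1.0)

# A receiver looping over `readable` may therefore handle "second"
# before "first".
processed = [sock.recv(1024) for sock in readable]

for sock in (a_send, a_recv, b_send, b_recv):
    sock.close()
```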

To fix your ordering problem, make one UnixClient and use it for all of your perform_create calls.
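A rough sketch of that idea, assuming a hypothetical SharedUploadClient that wraps one long-lived socket and a JSON message layout inferred from the keys the server reads (command, model_name, model_id). The real message format comes from _messages.branch_upload_message, which isn't shown, so treat the payload here as a placeholder. socket.socketpair() stands in for the connection to the Unix socket path so the example runs on its own.

```python
import json
import socket
import threading


class SharedUploadClient:
    """Hypothetical client: one connection reused for every upload call."""

    def __init__(self, sock):
        self._sock = sock
        # The lock keeps each message contiguous if several threads send.
        self._lock = threading.Lock()

    def call_to_upload(self, model_name, model_id):
        payload = json.dumps({
            "command": "upload_object",
            "model_name": model_name,
            "model_id": model_id,
        })
        # Terminate each message with LineOnlyReceiver's default delimiter.
        data = payload.encode("utf-8") + b"\r\n"
        with self._lock:
            self._sock.sendall(data)


# Stand-in for a single connection to the backup process.
client_end, server_end = socket.socketpair()
client = SharedUploadClient(client_end)

# Every call goes through the same client, hence the same connection.
for model_id in range(1, 6):
    client.call_to_upload("Item", model_id)
client_end.close()

# Read everything the "server" side received, in arrival order.
received = b""
while True:
    chunk = server_end.recv(4096)
    if not chunk:
        break
    received += chunk
server_end.close()

received_ids = [
    json.loads(line.decode("utf-8"))["model_id"]
    for line in received.split(b"\r\n")
    if line
]
```

In the Django code this would mean creating the client once, for example at module level, rather than calling UnixClient() inside perform_create. The ordering guarantee still only holds per connection, so several worker processes would each have their own ordered stream.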

Jean-Paul Calderone
  • Hey Jean! Thanks for the tip! My problem however has to do with data being sent in the right order but received in a jumbled order. I can assure you that it is being sent right in that `perform_create` gets called every time an endpoint is hit. Calling the same endpoint multiple times should result in sending the IDs in order but alas, such is not the case. On a side note, having messages combined has been a minor issue for a while now. One that I had to circumvent via a little parser so that helped. Thanks! – Lorenzo Feb 23 '17 at 03:27