4

i am just learning Python and dont have much expierence with Multithreading. I am trying to send some json via the Requests session.post Method. This is called in the function at the bottem of the many for loops i need to run through the dictionary.

Is there a way to let this run in paralell?

I also have to limit my numbers of Threads, otherwise the post calls get blocked because they are to fast after each other. Help would be much appreciated.

def doWork(session, List, RefHashList):
    for itemRefHash in RefHashList:
        for equipment in res['Response']['data']['items']:
            if equipment['itemHash'] == itemRefHash:
                if equipment['characterIndex'] != 0:
                    SendJsonViaSession(session, getCharacterIdFromIndex(res, equipment['characterIndex']), itemRefHash, equipment['quantity'])
Le olde Wire
  • 79
  • 1
  • 8

1 Answers1

3

First, structuring your code differently might improve the speed without the added complexity of threading.

def doWork(session, res, RefHashList):
    for equipment in res['Response']['data']['items']:
        i = equipment['itemHash']
        k = equipment['characterIndex']
        if i in RefHashList and k != 0:
            SendJsonViaSession(session, getCharacterIdFromIndex(res, k), i, equipment['quantity'])

To start with, we will look up equipment['itemHash'] and equipment['characterIndex'] only once.

Instead of explicitly looping over RefHashList, you could use the in operator. This moves the loop into the Python virtual machine, which is faster.

And instead of a nested if-conditional, you could use a single conditional using and.

Note: I have removed the unused parameter List, and replaced it with res. It is generally good practice to write functions that only act on parameters that they are given, not global variables.

Second, how much extra performance do you need? How much time is there on average between the SendJsonViaSession calls, and how small can this this time become before calls get blocked? If the difference between those numbers is small, it is probably not worth to implement a threaded sender.

Third, a design feature of the standard Python implementation is that only one thread at a time can be executing Python bytecode. So it is not certain that threading will improve performance.

Edit:

There are several ways to run stuff in parallel in Python. There is multiprocessing.Pool which uses processes, and multiprocessing.dummy.ThreadPool which uses threads. And from Python 3.2 onwards there is concurrent.futures, which can use processes or threads.

The thing is, neither of them has rate limiting. So you could get blocked for making too many calls. Every time you call SendJsonViaSession you'd have to save the current time somehow so that all processes or threads can use it. And before every call, you would have to read that time and wait if it is too close to the last call.

Edit2:

If a call to SendJsonViaSession only takes 0.3 seconds, you should be able to do 3 calls/second sequentially. But your code only does 1 call/second. This implies that the speed restriction is somewhere else. You'd have to profile your code to see where the problem lies.

Roland Smith
  • 42,427
  • 3
  • 64
  • 94
  • Awesome. Thank you for the quick and informative help. I can make 250 calls in 10 seconds, otherwise the throttle starts and i get blocked. Right now i am sending around 10 calls in 10 seconds. So there maybe would be potential for improvement. – Le olde Wire Jan 27 '17 at 20:56
  • @LeoldeWire And how long does a call to `SendJsonViaSession` actually take? – Roland Smith Jan 27 '17 at 21:37
  • ~0.3 seconds i would say – Le olde Wire Jan 27 '17 at 23:41
  • @LeoldeWire Run your code through a [line profiler](https://pypi.python.org/pypi/line_profiler/). That way you can see which lines actually take the longest time to run. Then you know *what* to optimize – Roland Smith Jan 28 '17 at 09:22