
I’m working on a project where each time you double the number of threads, you add between 40% and 60% overhead. Since hyperthreading increases performance by at most 30%, this means the program runs slower than in single-threaded mode on hyperthreaded systems.

The first steps seem simple (a rough sketch follows the list):

  • Get the number of threads on the system through `len(os.sched_getaffinity(0))`.

  • Restrict the number of threads through z3 parameters.

  • Bind the threads to physical cores using `os.sched_setaffinity(0, mask)`.

  • Leave SMT enabled on systems where neither Intel nor AMD appears in `platform.machine()`.
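For concreteness, here is a minimal sketch of those steps. It assumes z3’s `parallel.enable` / `parallel.threads.max` global parameters (check them against the z3 version you pin) and uses a hard-coded placeholder core mask, since computing the real mask is exactly the open question; note that `os.sched_getaffinity` / `os.sched_setaffinity` only exist on Linux:

```python
import os
import platform

import z3  # the project already depends on z3

# 1. How many hardware threads may this process use?
#    os.sched_getaffinity() only exists on Linux, so fall back elsewhere.
try:
    n_threads = len(os.sched_getaffinity(0))
except AttributeError:
    n_threads = os.cpu_count() or 1

# 2. Cap z3's worker threads (parameter names assumed from z3's
#    parallel-mode settings).
z3.set_param("parallel.enable", True)
z3.set_param("parallel.threads.max", n_threads)

# 3./4. Only pin to physical cores on Intel/AMD machines, as proposed above.
machine = platform.machine().lower()
if any(vendor in machine for vendor in ("intel", "amd", "x86")):
    physical_core_mask = {0, 2, 4, 6}     # placeholder, NOT actually detected
    if hasattr(os, "sched_setaffinity"):  # Linux-only, again
        os.sched_setaffinity(0, physical_core_mask)
```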

However, several problems arise when doing this.

  • How can I know whether the system has hyperthreading enabled?

  • Before calling `os.sched_setaffinity(0, mask)`, how can I know which CPU numbers correspond to physical cores and which to logical (hyperthread) siblings? (See the Linux-only sketch after this list.)
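To illustrate why the second point is hard to answer portably: on Linux the kernel exposes core topology under sysfs, so a standard-library-only check is possible there, but nothing comparable exists on Windows, macOS, or OpenVMS. A Linux-only sketch (paths and parsing assume the usual sysfs layout):

```python
import glob

def linux_physical_cpus():
    """Return one logical CPU id per physical core (Linux sysfs only).

    Each /sys/devices/system/cpu/cpuN/topology/thread_siblings_list file
    lists the logical CPUs sharing one core, e.g. "0,4" or "0-1".
    Keeping the first id of each distinct sibling group yields one CPU
    per physical core.
    """
    chosen, seen_groups = set(), set()
    for path in glob.glob(
        "/sys/devices/system/cpu/cpu[0-9]*/topology/thread_siblings_list"
    ):
        with open(path) as fh:
            siblings = fh.read().strip()
        if siblings not in seen_groups:
            seen_groups.add(siblings)
            chosen.add(int(siblings.replace("-", ",").split(",")[0]))
    return chosen

# Hyperthreading is effectively enabled when this returns fewer entries
# than the number of logical CPUs reported by os.cpu_count().
```

On the other supported platforms there is no equivalent pseudo-filesystem reachable from the standard library, which is what keeps the question open.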

The problem is that the program currently supports a wide range of platforms through Python 3: all Unixes, as well as Windows, macOS, and OpenVMS, not forgetting PyPy.

Any patch fixing the problem shouldn’t spawn a new process, add a dependency that isn’t already included, or drop support for any of the platforms above.

What would be a clean way to fix this?

user2284570
  • See [QtCore.QThread.idealThreadCount()](https://doc.qt.io/qtforpython/PySide2/QtCore/QThread.html#PySide2.QtCore.PySide2.QtCore.QThread.idealThreadCount) – Vladimir Bershov Apr 28 '20 at 15:48
  • @VladimirBershov which isn’t supported on OpenVMS, so it’s not cross-platform. I’m thinking the solution should be pure Python in this case. – user2284570 Apr 28 '20 at 16:25
  • @PatrickTrentin but this is a Python question! Not C++. – user2284570 Apr 30 '20 at 12:43
  • @PatrickTrentin the project itself isn’t using binary code currently; I doubt they would accept such a patch. Also, performance-wise, this would be a problem for PyPy. – user2284570 Apr 30 '20 at 13:54
  • @user2284570 have you considered `psutil.cpu_count(logical=False)`, or is it not supported by some of your platforms? – Patrick Trentin May 01 '20 at 20:15
  • @PatrickTrentin I don’t think they would accept an additional dependency just for that, as they already tend to favor single-thread mode in general (until a huge, proper rewrite without overhead is done using https://www.python.org/dev/peps/pep-0554/, which would still require serializing C objects that can’t be serialized). And OpenVMS isn’t supported. As a general rule, every supported platform can be considered listed in the question. – user2284570 May 01 '20 at 20:37
  • @PatrickTrentin I’m not aware of any process-control-related dependencies. However, if you try `pip install mythril` you’ll see many dependencies, most of them dependencies of dependencies of dependencies. This project really requires a lot of packages. – user2284570 May 01 '20 at 21:43
  • @PatrickTrentin that’s dependency hell. I have no idea how to generate the full list. That’s why I stated in an earlier comment that the solution should only use the basic set of libraries that ships with all versions of Python (though I didn’t mention it’s Python 3.6). – user2284570 May 01 '20 at 22:18
  • I cleaned up some of my comments that were not useful for clarifying the question. I don’t think I’m able to help at this point. Best of luck! – Patrick Trentin May 01 '20 at 22:58

1 Answer


The loky library contains a fairly portable solution to this. It does spawn a process, but it then caches the result, so you never spawn more than one. Given that this is the solution backing popular libraries like sklearn, I would guess it’s about as good as it gets.
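For reference, if pulling in loky were acceptable, the call would presumably look something like this (the `only_physical_cores` flag is taken from recent loky versions; verify it exists in whichever version you pin):

```python
# Sketch assuming loky >= 2.9, where cpu_count() accepts only_physical_cores.
# loky shells out to a helper (e.g. lscpu) once and caches the result.
from loky import cpu_count

n_physical = cpu_count(only_physical_cores=True)
n_logical = cpu_count()
print(f"{n_physical} physical cores, {n_logical} logical CPUs")
```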

  • It uses `lscpu` under the hood, a utility which doesn’t exist on OpenVMS and Windows. – user2284570 Dec 26 '20 at 13:09
  • It has an alternative for Windows. I’m sure you could add another `elif` to deal with OpenVMS. If you find a better solution, you should definitely look at contributing it back to loky. – Frankie Robertson Dec 26 '20 at 14:32