25

Does Python have the equivalent of the Java volatile concept?

In Java there is a keyword volatile. As far as I know, when we use volatile while declaring a variable, any change to the value of that variable will be visible to all threads running at the same time.

I wanted to know if there is something similar in Python, so that when the value of a variable is changed in a function, its value will be visible to all threads running at the same time.

Martijn Pieters
Pablo

2 Answers

46

> As far as I know, when we use volatile while declaring a variable, any change to the value of that variable will be visible to all threads running at the same time.

volatile is a little more nuanced than that. volatile ensures that Java stores and updates the variable value in main memory. Without volatile, the JVM is free to store the value in the CPU cache instead, which has the side effect of updates to the value being invisible to different threads running on different CPU cores (threads that are being run concurrently on the same core would see the value).

Python doesn't ever do this. Python stores all objects on a heap, in main memory. Moreover, due to how the Python interpreter loop uses locking (the GIL), only one thread at a time will be actively running Python code. There is never a chance that different threads are running a Python interpreter loop on a different CPU.

So you don't need volatile in Python: there is no such concept, and you don't need to worry about it.
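As a rough illustration of the point above, here is a minimal sketch in which one thread writes an ordinary module-level variable and another thread observes the change, with no special declaration. The names `stop_requested` and `worker` are invented for this example, and the behaviour shown assumes a standard CPython build.

```python
import threading
import time

stop_requested = False  # ordinary module-level variable, no special marker


def worker():
    """Loop until the main thread flips the flag."""
    while not stop_requested:  # the global is looked up again on every pass
        time.sleep(0.001)      # give other threads a chance to run


t = threading.Thread(target=worker)
t.start()

time.sleep(0.05)
stop_requested = True          # plain assignment made by the main thread
t.join(timeout=5)

worker_stopped = not t.is_alive()
print("worker stopped:", worker_stopped)
```

In real code a `threading.Event` is usually a clearer way to signal between threads; the bare flag is used here only to mirror the question.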

Martijn Pieters
  • Interesting. Does it make Python slower for some multi-threaded applications? – Dici Dec 14 '18 at 13:23
  • Isn't the GIL just an implementation artifact of CPython? – Mark Rotteveel Dec 14 '18 at 13:26
  • @Dici: yes, CPython can't run Python code in parallel, only concurrently. (Native code spawned from Python is not limited that way, so numpy and other extensions are not bound by this restriction.) – Martijn Pieters Dec 14 '18 at 13:30
  • @MarkRotteveel: yes, but IronPython and Jython and PyPy also don't have `volatile` because they too just use objects on a heap. – Martijn Pieters Dec 14 '18 at 13:31
  • I'm annoying. What difference do you make between running code in parallel and running it concurrently? Do you mean it's parallel as long as no shared data is modified? – Dici Dec 14 '18 at 13:34
  • @Dici: switching rapidly between threads on a single CPU means the threads run *concurrently*. Running two threads at the same time on two different CPU cores means they run in parallel. Python can only do the former. – Martijn Pieters Dec 14 '18 at 14:11
  • @Dici: Concurrency is a broader concept that includes parallel execution, but also covers other forms of concurrency. – Martijn Pieters Dec 14 '18 at 14:12
  • @MartijnPieters what about read-modify-write operations such as `a = a + 1` in python? As far as I know they can cause a race condition if we don't use a lock. Thread A reads the value of `a` and it is `5`, but doesn't get to modify it and write it back to `a` for example, then python can give control to Thread B, it also reads `5`, since Thread A didn't modify it, now both threads have read `5`, then Thread B continues, it adds `1` to `5`, and writes `6` to `a`. At the point Thread A continues to work, it previously read `5`, now it adds `1` to it and writes `6` back to `a`. – pavel_orekhov Oct 20 '22 at 00:19
  • @MartijnPieters does this mean that locking a code block that does `a = a + 1` somehow guarantees that Thread B's update is visible when we continue with Thread A? How does it work? – pavel_orekhov Oct 20 '22 at 00:21
  • @pavel_orekhov I think you're correct, i.e. updating variable `a` in multiple threads without locking might (will) lead to incorrect behavior, but that is due to a lack of atomicity rather than to a CPU cache / memory visibility issue. As a comparison, Java's `volatile` does not guarantee atomicity either, so if you run `volatile int a; a = a + 1` in multiple threads, a similar problem will occur. – t3h_b0t Nov 15 '22 at 08:57
  • @pavel_orekhov: anything that requires multiple _bytecode operations_, or *opcodes*, is subject to race conditions. `a = a + 1` requires several opcodes (load `a`, load the constant, add, store `a`), and the `+` operator can be overloaded in Python code, so it can trigger code paths with many more opcodes. If in doubt, check with the [Python bytecode disassembler](https://docs.python.org/3/library/dis.html) how many opcodes an expression uses, but take into account any [hooks that could be invoked](https://docs.python.org/3/reference/datamodel.html#special-method-names). – Martijn Pieters Nov 25 '22 at 12:41
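The comments above can be tried out with a short sketch. It lists the opcodes behind `counter = counter + 1` using `dis`, then guards the same statement with a lock so that concurrent increments are not lost. The names `counter`, `counter_lock`, `unsafe_increment` and `safe_increment` are hypothetical, and the exact opcode names printed differ between CPython versions.

```python
import dis
import threading

counter = 0
counter_lock = threading.Lock()


def unsafe_increment():
    """Read-modify-write with no lock; a thread switch can occur mid-way."""
    global counter
    counter = counter + 1


def safe_increment(times):
    """Same update, but the whole read-modify-write runs under the lock."""
    global counter
    for _ in range(times):
        with counter_lock:
            counter = counter + 1


# Collect the opcode names for the unlocked version (more than one opcode).
opnames = [ins.opname for ins in dis.get_instructions(unsafe_increment)]
print(opnames)

threads = [threading.Thread(target=safe_increment, args=(10_000,))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 4 threads x 10,000 locked increments = 40000
```

The lock does not change what other threads can see; it only ensures that no other thread holding the same lock runs between the read and the write.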
-5

The keyword "global" is what you are looking for:

import threading

queue = []
l = threading.Lock()

def f():
    global queue
    with l:  # acquires the lock and releases it even if append() raises
        queue.append(1)

def g():
    print(queue)

threads = [
    threading.Thread(target=f),
    threading.Thread(target=g)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
Oyono
  • Thanks, how about `queue = []` in your code? Since it's declared outside all functions, isn't it also considered a global variable? – Pablo Dec 14 '18 at 13:16
  • The keyword `global` is there to allow you to modify a variable bound elsewhere in the program. If you don't do this, you will only have `read` access to it – Oyono Dec 14 '18 at 13:29
  • This has nothing to do with `global`. The issue the OP is talking about would apply equally to attributes of instances, which are visible in Python just fine, and can be made `public` in Java. – Martijn Pieters Dec 14 '18 at 13:32
  • Moreover, because you *never attempt to bind to the `queue` name in your code*, the `global queue` statement is entirely redundant. It can be removed from your example with no difference in behaviour. – Martijn Pieters Dec 14 '18 at 13:33