Here is a link, http://semanchuk.com/philip/, to libraries implementing POSIX and System V semaphores. You can use one of those. Beware, though, that if a process holding a semaphore dies without releasing it, all the others get stuck. If you are afraid of this, you can use System V semaphores with UNDO, but they are a little slower. Also, if you happen to use System V shared memory primitives, remember that they live in the kernel and keep living after process termination; you have to remove them from the system explicitly.
If you are not afraid of a dying process deadlocking the whole system, and the processes are related, you could use Python's multiprocessing semaphores (they are POSIX named semaphores).
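As a rough illustration of the related-processes case, here is a minimal sketch using `multiprocessing.Semaphore`. The names `worker` and `run_counter_demo` are made up for this example, and it assumes a Unix system where the "fork" start method is available, so the children simply inherit the semaphore from the parent.

```python
import multiprocessing as mp

# Assumption: Unix with the "fork" start method, so child processes
# inherit the semaphore and the shared counter from the parent.
ctx = mp.get_context("fork")


def worker(sem, counter, rounds):
    """Increment the shared counter, holding the semaphore each time."""
    for _ in range(rounds):
        with sem:                  # acquire; released when the block exits
            counter.value += 1     # read-modify-write, protected by sem


def run_counter_demo(n_procs=4, rounds=1000):
    sem = ctx.Semaphore(1)                    # binary semaphore
    counter = ctx.Value("i", 0, lock=False)   # raw shared int, no lock of its own
    procs = [ctx.Process(target=worker, args=(sem, counter, rounds))
             for _ in range(n_procs)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return counter.value
```

Note that nothing here protects you if a worker is killed inside the `with` block: the semaphore stays taken and the remaining workers wait forever, which is exactly the risk described above.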
The page you linked as a related question (fcntl) does not say that fcntl is unsuitable for inter-thread locking. It says that fcntl cares about fds. So you can use fcntl for both inter-process and inter-thread locking, as long as you open the lock file and get a new fd for each lock instance.
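A minimal sketch of the "new fd per lock instance" idea, assuming `fcntl.flock()` on Linux or a BSD, where the lock belongs to the open file description. The `FileLock` class and `demo` function are hypothetical helpers written for this answer, not part of any library. Be aware that POSIX record locks (`fcntl.lockf()`) are owned by the process instead, so they would not exclude threads of the same process this way.

```python
import fcntl
import os
import tempfile


class FileLock:
    """Hypothetical helper: each acquire opens its own descriptor."""

    def __init__(self, path):
        self.path = path
        self.fd = None

    def acquire(self, blocking=True):
        # A fresh open() gives a new open file description, so flock()
        # treats this instance as a separate lock owner.
        fd = os.open(self.path, os.O_RDWR | os.O_CREAT, 0o600)
        flags = fcntl.LOCK_EX | (0 if blocking else fcntl.LOCK_NB)
        try:
            fcntl.flock(fd, flags)
        except BlockingIOError:
            os.close(fd)           # somebody else holds the lock
            return False
        self.fd = fd
        return True

    def release(self):
        fcntl.flock(self.fd, fcntl.LOCK_UN)
        os.close(self.fd)
        self.fd = None


def demo():
    path = os.path.join(tempfile.mkdtemp(), "demo.lock")
    first, second = FileLock(path), FileLock(path)
    got_first = first.acquire(blocking=False)
    got_second = second.acquire(blocking=False)   # same process, different fd
    first.release()
    got_retry = second.acquire(blocking=False)    # free now
    second.release()
    return got_first, got_second, got_retry
```

A nice property of this approach is that the kernel drops the lock when the descriptor is closed, including when the holding process dies.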
You could also use a combination of fcntl for inter-process locking and Python's semaphore for inter-thread locking.
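One way the combination could look, as a sketch only: a `threading.Semaphore` serializes the threads inside one process, and a single shared fd locked with `fcntl.flock()` serializes the processes. `CombinedLock` and `run_threads` are names invented for this example.

```python
import fcntl
import os
import tempfile
import threading


class CombinedLock:
    """Hypothetical helper: thread semaphore first, then the file lock."""

    def __init__(self, path):
        self._thread_sem = threading.Semaphore(1)   # inter-thread part
        self._fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)

    def __enter__(self):
        self._thread_sem.acquire()
        try:
            fcntl.flock(self._fd, fcntl.LOCK_EX)    # inter-process part
        except BaseException:
            self._thread_sem.release()
            raise
        return self

    def __exit__(self, *exc_info):
        try:
            fcntl.flock(self._fd, fcntl.LOCK_UN)
        finally:
            self._thread_sem.release()
        return False

    def close(self):
        os.close(self._fd)


def run_threads(path, n_threads=8, rounds=500):
    lock = CombinedLock(path)
    total = [0]

    def work():
        for _ in range(rounds):
            with lock:
                total[0] += 1      # protected read-modify-write

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    lock.close()
    return total[0]
```

The order matters: taking the thread semaphore before the file lock means only one thread per process ever touches the shared fd.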
And finally: rethink your architecture. Locking is generally bad. Delegate the resource to a single process that takes care of it without locking. It will be much simpler to maintain. Believe me.
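To give an idea of what delegating could look like, here is a small sketch where one process owns the resource (just a running total here) and everybody else sends it requests through a `multiprocessing.Queue`. The function names and the `None` stop marker are assumptions made for this example, and it again assumes the "fork" start method on Unix.

```python
import multiprocessing as mp

# Assumption: Unix with the "fork" start method.
ctx = mp.get_context("fork")


def owner(requests, result):
    """The only process that ever touches the resource."""
    total = 0
    # None is a hypothetical stop marker chosen for this sketch.
    for amount in iter(requests.get, None):
        total += amount
    result.put(total)


def client(requests, rounds):
    for _ in range(rounds):
        requests.put(1)            # ask the owner to do the work


def run_owner_demo(n_clients=4, rounds=250):
    requests, result = ctx.Queue(), ctx.Queue()
    keeper = ctx.Process(target=owner, args=(requests, result))
    keeper.start()
    clients = [ctx.Process(target=client, args=(requests, rounds))
               for _ in range(n_clients)]
    for c in clients:
        c.start()
    for c in clients:
        c.join()
    requests.put(None)             # all clients are done, stop the owner
    total = result.get()           # read the result before joining
    keeper.join()
    return total
```

There is no lock anywhere in the client code: the queue does the serialization, and a client that dies cannot leave the resource locked.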