
I have been reading the spinlock code in the Linux kernel and found two versions of the same function, __ticket_spin_lock. See the code below:

static __always_inline void __ticket_spin_lock(raw_spinlock_t *lock)
{
    short inc = 0x0100;

    asm volatile (
        LOCK_PREFIX "xaddw %w0, %1\n"
        "1:\t"
        "cmpb %h0, %b0\n\t"
        "je 2f\n\t"
        "rep ; nop\n\t"
        "movb %1, %b0\n\t"
        /* don't need lfence here, because loads are in-order */
        "jmp 1b\n"
        "2:"
        : "+Q" (inc), "+m" (lock->slock)
        :
        : "memory", "cc");
}

static __always_inline void __ticket_spin_lock(raw_spinlock_t *lock)
{
    int inc = 0x00010000;
    int tmp;

    asm volatile(LOCK_PREFIX "xaddl %0, %1\n"
             "movzwl %w0, %2\n\t"
             "shrl $16, %0\n\t"
             "1:\t"
             "cmpl %0, %2\n\t"
             "je 2f\n\t"
             "rep ; nop\n\t"
             "movzwl %1, %2\n\t"
             /* don't need lfence here, because loads are in-order */
             "jmp 1b\n"
             "2:"
             : "+r" (inc), "+m" (lock->slock), "=&r" (tmp)
             :
             : "memory", "cc");
}

I have two questions:

1. What is the difference between the two functions above?

2. What can I do to monitor the spinlock waiting time (the time between the first attempt to take the lock and finally acquiring it)? Does the variable inc represent the spinlock waiting time?

Charles0429
  • 1) Not much. Looks like they're just using different values to indicate the "locked" state. 2) You could add a counter to the code. As it stands, there isn't anything that counts the number of repetitions. However, you could also surround your locking code with `rdtsc` cycle counts for a rough measure. – Kerrek SB Jul 01 '13 at 08:43
  • @KerrekSB Thank you. Does the line "rep ; nop\n\t" mean busy-wait spinning? – Charles0429 Jul 01 '13 at 08:50
  • No - that's just a "yield". Modern processors have a dedicated `pause` instruction. The spinning is the `jmp 1b`. (Maybe check out [this article](http://wiki.osdev.org/Spinlock).) – Kerrek SB Jul 01 '13 at 09:02
  • @KerrekSB, could you help me add an unsigned long counter (to monitor the spinning time) to the asm code? I'm not familiar with asm. Thank you very much. – Charles0429 Jul 01 '13 at 09:33

1 Answer


Let me first explain how the spinlock code works. We have variables

uint16_t inc = 0x0100,
         lock->slock;     // I'll just call this "slock"

In the assembler code, inc is referred to as %0 and slock as %1. Moreover, %b0 denotes the lower 8 bits, i.e. inc % 0x100, and %h0 denotes the upper 8 bits, i.e. inc / 0x100.

Now:

lock xaddw %w0, %1    ;; "inc := slock"  and  "slock := inc + slock"
                      ;; simultaneously (atomic exchange and increment)
1:
    cmpb %h0, %b0     ;; "if (inc / 256 == inc % 256)"
    je 2f             ;; "    goto 2;"
    rep ; nop         ;; "yield();"
    movb %1, %b0      ;; "low byte of inc = low byte of slock;"
    jmp 1b            ;; "goto 1;"
2:

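Comparing the upper and lower byte of inc succeeds whenever the two bytes are equal, not only when inc is zero. This is a ticket lock: the high byte of slock is the next free ticket, and the low byte is the ticket currently being served. The atomic exchange-and-add hands us the old value of slock in inc, so the high byte of inc is our ticket, and at the same time it bumps the next free ticket for whoever comes after us. If our ticket equals the ticket being served, the lock was free and we now hold it.

Otherwise, i.e. if the lock is held or other CPUs are queued ahead of us, we pause a little, reload the low byte of inc from the current value of the lock, and compare again. Unlocking increments the low byte of slock, which eventually makes it equal to our ticket.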
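
To make the loop above easier to follow, here is a minimal userspace sketch of the same 8-bit scheme written with C11 atomics. It is not the kernel's code: the names ticket_lock_t, ticket_lock and ticket_unlock are made up for illustration, and the unlock path uses a compare-and-swap loop as a stand-in for the single byte increment the kernel performs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical userspace analogue of the 8-bit ticket lock (not kernel code).
 * Low byte of slock: ticket currently being served.
 * High byte of slock: next free ticket. */
typedef struct {
    _Atomic uint16_t slock;
} ticket_lock_t;

static void ticket_lock(ticket_lock_t *lock)
{
    assert(lock);

    /* Counterpart of "lock xaddw": add 0x0100 and keep the old value. */
    uint16_t old = atomic_fetch_add(&lock->slock, 0x0100);
    uint8_t my_ticket = (uint8_t)(old >> 8);    /* %h0 */
    uint8_t serving   = (uint8_t)(old & 0xFF);  /* %b0 */

    while (serving != my_ticket) {              /* cmpb %h0, %b0 ; je 2f */
        /* The kernel executes "rep ; nop" (pause) here. */
        serving = (uint8_t)(atomic_load(&lock->slock) & 0xFF); /* movb %1, %b0 */
    }
}

static void ticket_unlock(ticket_lock_t *lock)
{
    assert(lock);

    /* Advance only the low byte, without carrying into the high byte. */
    uint16_t old = atomic_load(&lock->slock);
    uint16_t next;
    do {
        next = (uint16_t)((old & 0xFF00) | ((old + 1) & 0x00FF));
    } while (!atomic_compare_exchange_weak(&lock->slock, &old, next));
}
```

Starting from a zero-initialised lock, one lock/unlock pair leaves slock at 0x0101: both bytes are equal again, which is the "free" state, even though the value is no longer zero.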

(I believe there's actually a possibility of an overflow if 2^8 threads simultaneously attempt to get the spinlock. In that case, slock is updated to 0x0100, 0x0200, ... 0xFF00, 0x0000, and would then appear to be unlocked. Maybe that's why the second version of the code uses a 16-bit wide counter, which would require 2^16 simultaneous attempts.)
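
The wrap-around can be checked with plain integer arithmetic. The following sketch only models the additions that each xadd performs; the helper names are hypothetical and nothing here touches a real lock.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: value of a 16-bit slock after `waiters` CPUs
 * have each added 0x0100 (the 8-bit ticket variant). */
static uint16_t slock_after_8bit(uint16_t slock, unsigned waiters)
{
    for (unsigned i = 0; i < waiters; i++)
        slock = (uint16_t)(slock + 0x0100);
    return slock;
}

/* Hypothetical helper: value of a 32-bit slock after `waiters` CPUs
 * have each added 0x00010000 (the 16-bit ticket variant). */
static uint32_t slock_after_16bit(uint32_t slock, unsigned waiters)
{
    for (unsigned i = 0; i < waiters; i++)
        slock += 0x00010000u;
    return slock;
}

/* The asserts document the values the arithmetic produces. */
static void demo_wraparound(void)
{
    assert(slock_after_8bit(0, 255) == 0xFF00);       /* still distinguishable */
    assert(slock_after_8bit(0, 256) == 0x0000);       /* wrapped: looks free   */
    assert(slock_after_16bit(0, 256) == 0x01000000u); /* no wrap yet           */
    assert(slock_after_16bit(0, 65536) == 0u);        /* wraps only at 2^16    */
}
```

Calling demo_wraparound() passes silently, which shows that the 8-bit ticket field wraps after 256 additions while the 16-bit field needs 65536.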

Now let's insert a counter:

uint32_t spincounter = 0;

asm volatile( /* code below */
    : "+Q" (inc), "+m" (lock->slock), "+r" (spincounter)
    :
    : "memory", "cc");

Now spincounter may be referred to as %2. We just need to increment the counter each time:

1:
    incl %2           ;; "spincounter++;"
    cmpb %h0, %b0
    ;; etc etc

I haven't tested this, but that's the general idea.
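
For trying the idea outside the kernel, here is a rough userspace sketch of the same counter using C11 atomics instead of inline assembly. The names counted_lock_t and counted_lock are hypothetical, and the count is the number of passes through the comparison, not a time; converting it to time would need a separate clock or cycle counter.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical userspace analogue of the 8-bit ticket lock (not kernel code). */
typedef struct {
    _Atomic uint16_t slock;   /* low byte: now serving, high byte: next ticket */
} counted_lock_t;

/* Takes the lock and returns how many times the comparison at label 1
 * was executed. An uncontended acquisition returns 1. */
static uint32_t counted_lock(counted_lock_t *lock)
{
    assert(lock);

    uint32_t spincounter = 0;
    uint16_t old = atomic_fetch_add(&lock->slock, 0x0100);  /* lock xaddw */
    uint8_t my_ticket = (uint8_t)(old >> 8);
    uint8_t serving   = (uint8_t)(old & 0xFF);

    for (;;) {
        spincounter++;                    /* the increment at label 1 */
        if (serving == my_ticket)         /* cmpb %h0, %b0 ; je 2f */
            break;
        serving = (uint8_t)(atomic_load(&lock->slock) & 0xFF); /* movb %1, %b0 */
    }
    return spincounter;
}
```

Because the increment sits at the top of the loop, the first successful comparison is counted as well, so a value of 1 means the lock was acquired without waiting.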

Kerrek SB
  • Yes, the first code only works with <= 255 CPUs. That's why there are different versions of the low-level spinlock asm code, one of them using 16-bit counters, and there is a CPP test "#if (CONFIG_NR_CPUS < 256)" in the source to decide which to use at compile time. – Roland Jul 01 '13 at 17:16
  • @Roland: Thanks! That's great to know. (I forgot that "threads" isn't as useful a concept in kernel space as "CPUs" is.) – Kerrek SB Jul 01 '13 at 18:10
  • @Kerrek SB: Thank you very much. I tried your code in the Linux kernel, but it didn't produce any output about the spinlock waiting time. Could you tell me where I went wrong? My machine's architecture is x86_64 and my OS is linux-2.6.31.10. – Charles0429 Jul 04 '13 at 02:44
  • @Charles0429: Did you log the value of the `spincounter` variable? What does it say? – Kerrek SB Jul 04 '13 at 08:41
  • @KerrekSB: Yes, I tried to log the spincounter variable from the virtual machine to the Xen hypervisor, but it didn't generate any output. I also tried writing an illegal statement in the file that contains the ticket spinlock code and, to my surprise, no errors were reported. So I wonder whether the ticket spinlock is actually used in Linux kernel 2.6.31.10, although what I found on Google says "ticket spinlock is used by x86 architecture since linux kernel 2.6.25". – Charles0429 Jul 04 '13 at 14:08
  • @KerrekSB: I also tried to use the printk function to log the spincounter variable, but it didn't output anything either. Could you give me some advice on what I can do next? – Charles0429 Jul 04 '13 at 14:29
  • @Charles0429: Hm, sorry, I'm not very familiar with kernel development. You can always test the spinlock code separately in userspace, though! – Kerrek SB Jul 04 '13 at 15:24
  • @KerrekSB: What's the difference between spinlock in userspace and spinlock in kernel space? – Charles0429 Jul 05 '13 at 01:16