0

I'm writing a simple USB driver to drive a stepper motor based on USB Skeleton 2.2 Driver, kernel 3.8. The basic version is running properly. As a advancement, I introduced KTHREAD to call the DEVICE_WRITE (skel_write) (), so that the driver will be available for other tasks & requests.

Calling procedure : USER (request) -> DEVICE_IOCTL -> KTHREAD -> DEVICE_WRITE.

In this scenario, when I call the DEVICE_WRITE multiple times from KTHREAD through a loop, everything works fine. Then after some iterations, kernel gets messed up, Otherwise if called directly works fine. Upon seeing the log file, the error is :

Dec 30 01:15:14 mit kernel: [  962.316843] device_write(efed1180,2,10),ioused : 1
Dec 30 01:15:14 mit kernel: [  962.316900] data : 0, motor_cnt : 2, master_counter : 20
Dec 30 01:15:14 mit kernel: [  962.366498] data : 1, motor_cnt : 2, master_counter : 21
Dec 30 01:15:14 mit kernel: [  962.416116] Write over, going for sleep
Dec 30 01:15:14 mit kernel: [  962.416125] file : efed1180,data : 2,i : 11
Dec 30 01:15:14 mit kernel: [  962.416128] device_write(efed1180,2,10),ioused : 1
Dec 30 01:15:14 mit kernel: [  962.416166] BUG: unable to handle kernel NULL pointer dereference at   (null)
Dec 30 01:15:14 mit kernel: [  962.416254] IP: [] skel_write+0xd7/0x360 [usbstep]
Dec 30 01:15:14 mit kernel: [  962.416294] *pdpt = 0000000000000000* pde= f0002accf0002acc
Dec 30 01:15:14 mit kernel: [  962.416332] Oops: 0000 [#1] SMP
Dec 30 01:15:14 mit kernel: [  962.416363] Modules linked in: usbstep(OF) parport_pc(F) ppdev(F)   bnep rfcomm bluetooth snd_hda_codec_hdmi uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core videodev snd_hda_codec_idt coretemp snd_hda_intel kvm snd_hda_codec snd_hwdep(F) snd_pcm(F) snd_page_alloc(F) joydev(F) snd_seq_midi(F) snd_seq_midi_event(F) snd_rawmidi(F) hp_wmi lib80211_crypt_tkip snd_seq(F) snd_seq_device(F) snd_timer(F) sparse_keymap radeon wl(POF) lib80211 ttm drm_kms_helper cfg80211 drm hp_accel lis3lv02d mei input_polldev wmi i2c_algo_bit video(F) intel_ips mac_hid snd(F) lpc_ich soundcore(F) microcode(F) lp(F) parport(F) psmouse(F) serio_raw(F) r8169 ahci(F) libahci(F) [last unloaded: usbstep]
Dec 30 01:15:14 mit kernel: [  962.416866] Pid: 2997, comm: mitesh Tainted: PF          O 3.8.0-26-generic #38-Ubuntu Hewlett-Packard HP ProBook 4520s/1411
Dec 30 01:15:14 mit kernel: [  962.416928] EIP: 0060:[] EFLAGS: 00010287 CPU: 2
Dec 30 01:15:14 mit kernel: [  962.416960] EIP is at skel_write+0xd7/0x360 [usbstep]
Dec 30 01:15:14 mit kernel: [  962.416989] EAX: f0665b84 EBX: 00000014 ECX: 000000d0 EDX: 00000014
Dec 30 01:15:14 mit kernel: [  962.417024] ESI: f0665b40 EDI: 00000000 EBP: efddbf40 ESP: efddbf04
Dec 30 01:15:14 mit kernel: [  962.417059]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Dec 30 01:15:14 mit kernel: [  962.417089] CR0: 8005003b CR2: 00000000 CR3: 019d1000 CR4: 000007f0
Dec 30 01:15:14 mit kernel: [  962.417124] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
Dec 30 01:15:14 mit kernel: [  962.417158] DR6: ffff0ff0 DR7: 00000400
Dec 30 01:15:14 mit kernel: [  962.417181] Process mitesh (pid: 2997, ti=efdda000 task=f0bed9b0 task.ti=efdda000)
Dec 30 01:15:14 mit kernel: [  962.417223] Stack:
Dec 30 01:15:14 mit kernel: [  962.417236]  f0665b84 efddbf58 efddbf58 0000000a 00000001 efddbf58 00000000 efddbf40
Dec 30 01:15:14 mit kernel: [  962.417301]  c1609d81 00000002 f06c5d40 00000014 0000000c efddbf58 f1487408 efddbf6c
Dec 30 01:15:14 mit kernel: [  962.417398]  f8585546 00000000 efed1180 efddbf58 0000000b f6eb0032 aa092dff f6eb7ebc
Dec 30 01:15:14 mit kernel: [  962.417475] Call Trace:
Dec 30 01:15:14 mit kernel: [  962.417504]  [] ? printk+0x4d/0x4f
Dec 30 01:15:14 mit kernel: [  962.417559]  [] tele+0x86/0xc0 [usbstep]
Dec 30 01:15:14 mit kernel: [  962.417618]  [] ? skel_write+0x360/0x360 [usbstep]
Dec 30 01:15:14 mit kernel: [  962.417691]  [] kthread+0x94/0xa0
Dec 30 01:15:14 mit kernel: [  962.417744]  [] ? __hrtimer_start_range_ns+0x2e0/0x460
Dec 30 01:15:14 mit kernel: [  962.417819]  [] ret_from_kernel_thread+0x1b/0x28
Dec 30 01:15:14 mit kernel: [  962.417886]  [] ? kthread_create_on_node+0xc0/0xc0
Dec 30 01:15:14 mit kernel: [  962.417951] Code: c0 89 c6 0f 84 83 01 00 00 83 c3 0a b8 00 0e 00 00 81 fb 00 0e 00 00 b9 d0 00 00 00 0f 46 c3 89 45 f0 8d 46 44 8b 55 f0 89 04 24 <8b> 07 e8 52 9f ee c8 85 c0 89 45 e4 0f 84 0f 01 00 00 8d 47 54
Dec 30 01:15:14 mit kernel: [  962.418433] EIP: [] skel_write+0xd7/0x360 [usbstep] SS:ESP 0068:efddbf04
Dec 30 01:15:14 mit kernel: [  962.418530] CR2: 0000000000000000
Dec 30 01:15:14 mit kernel: [  962.433930] ---[ end trace 63245eeeb64414aa ]---

Here goes the code : KTHREAD

int tele(void *__tele_data) {
        struct tele_data *tele_data = __tele_data;
        int i=0;
        char *dptr=NULL;
        char numb[4];
        sprintf(numb,"%d",tele_data->num);
        dptr=numb;
        for(i=0;i<30;i++) {
        is_ioctl_used=1;
                printk("file : %p,data : %s,i : %d\n", tele_data->file,dptr,i);
                skel_write(tele_data->file,(char *)dptr, 10, 0);
                printk("Write over, going for sleep\n");
        }
        return 0;
}

DEVICE_WRITE -

static ssize_t skel_write(struct file *file, const char *user_buffer,
                      size_t count, loff_t *ppos)
{
    struct usb_skel *dev;
    int retval = 0,i = 0,motor_count,dir=0;
    struct urb *urb = NULL;
    char *buf = NULL;
    char *buf1 = NULL;
    size_t writesize = min(count+10, (size_t)MAX_TRANSFER);
    printk(KERN_INFO "device_write(%p,%s,%d),ioused : %d\n", file, user_buffer, count,is_ioctl_used);
    dev = file->private_data;

    // verify that we actually have some data to write 
    if (count == 0)
            goto exit;

    /*
     * limit the number of URBs in flight to stop a user from using up all
     * RAM
     */

    if (!(file->f_flags & O_NONBLOCK)) {
            if (down_interruptible(&dev->limit_sem)) {
                    retval = -ERESTARTSYS;
                    goto exit;
            }
    } else {
            if (down_trylock(&dev->limit_sem)) {
                    retval = -EAGAIN;
                    goto exit;
            }
    }

    spin_lock_irq(&dev->err_lock);
    retval = dev->errors;
if (retval < 0) {
            // any error is reported once 
            dev->errors = 0;
            // to preserve notifications about reset 
            retval = (retval == -EPIPE) ? retval : -EIO;
    }
    spin_unlock_irq(&dev->err_lock);
    if (retval < 0)
            goto error;

    /* create a urb, and a buffer for it, and copy the data to the urb */
    buf1=(char *)kmalloc(sizeof(char)*20,GFP_KERNEL);  //Allocate 2nd buffer.
    if(is_ioctl_used) {   //Whether the write function is called from IOCTL or Directly (echo > /dev/stepper)
            sprintf(buf1,user_buffer);
    } else {
            if (copy_from_user(buf1, user_buffer,count)) {
                    retval = -EFAULT;
                    goto error;
            }
    }
    motor_count=simple_strtol(buf1,NULL,10);
    if(motor_count<0) {  //Rotation counts of stepper motor.
            motor_count=motor_count * -1;  //If motor_count<0 then rotate in anti-clock direction.
            dir=1;
    }
            urb = usb_alloc_urb(0, GFP_KERNEL);
            if (!urb) {
                    retval = -ENOMEM;
                    goto error;
            }

            buf = usb_alloc_coherent(dev->udev, writesize, GFP_KERNEL,
                            &urb->transfer_dma);
     if (!buf) {
                    retval = -ENOMEM;
                    goto error;
            }

            /* this lock makes sure we don't submit URBs to gone devices */
                    mutex_lock(&dev->io_mutex);
            if (!dev->interface) {          /* disconnect() was called */
                    mutex_unlock(&dev->io_mutex);
                    retval = -ENODEV;
                    goto error;
            }

            /* initialize the urb properly */
            usb_fill_int_urb(urb, dev->udev,
                            usb_sndintpipe(dev->udev, dev->bulk_out_endpointAddr),
                            buf, writesize, skel_write_bulk_callback, dev,dev->bInterval);
            urb->transfer_flags |= URB_NO_TRANSFER_DMA_MAP;
            usb_anchor_urb(urb, &dev->submitted);

    for(i=0;i<motor_count;i++) {  //Loop to rotate motor based on counts.
            printk("data : %d, motor_cnt : %d, master_counter : %d\n",ptr->data,motor_count,master_counter);
            if(dir==0) ptr=ptr->next;
            else ptr=ptr->prev;
            // Fill the buffers.
            buf[0]=0x01;
            buf[1]=0;
            buf[2]=ptr->data;

            /* send the data out the bulk port */
            retval = usb_submit_urb(urb, GFP_KERNEL);
    if (retval) {
                            dev_err(&dev->interface->dev,
                                            "%s - failed submitting write urb, error %d\n",
                                            __func__, retval);
                            mutex_unlock(&dev->io_mutex);
                            goto error_unanchor;

                    }
            if(++master_counter && master_counter > 47) master_counter=0;
            /*
             * release our reference to this urb, the USB core will eventually free
             * it entirely
             */
                            mdelay(50); //Delay is required to match with motor speed. 

            }
                    mutex_unlock(&dev->io_mutex);
                    usb_free_coherent(dev->udev, writesize, buf, urb->transfer_dma);
                    kfree(buf1);
                    usb_free_urb(urb);

                    is_ioctl_used=0;
            return writesize;
error_unanchor:
            usb_unanchor_urb(urb);
error:
    if (urb) {
            usb_free_coherent(dev->udev, writesize, buf, urb->transfer_dma);
            usb_free_urb(urb);
    }
    up(&dev->limit_sem);

exit:
    return retval;
}

I'm new to kernel programming and might be missing out something.

Mitesh G
  • 69
  • 2
  • 9

2 Answers2

1

I don't know if this is the root cause of your problem, but it seems like you have a number of issues in your tele() function:

    int tele(void *__tele_data) {
            struct tele_data *tele_data = __tele_data;
            int i=0;
            char *dptr=NULL;
            char numb[4];
            sprintf(numb,"%d",tele_data->num);

Here, you sprintf() the number into the numb buffer. What is the range of tele_data->num? Would it ever take more than 4 characters (including the terminating NUL character)? Also, you're not recording how many bytes were printed in the buffer. Seems like you'd want to know that for use below...

            dptr=numb;

Okay, so now dptr point to numb. Which means it points to a character buffer that has a maximum of 4 bytes, but...

            for(i=0;i<30;i++) {
                    is_ioctl_used=1;
                    printk("file : %p,data : %s,i : %d\n", tele_data->file,dptr,i);
                    skel_write(tele_data->file,(char *)dptr, 10, 0);

In the skel_write() line above, you're requesting 10 bytes to be written. That's 6 more than is available. So you could be smashing the stack here.

I'm not convinced it's your only issue, but it does appear to be a problem.

Just a couple of other minor things to point out. You don't need the cast on dptr in the skel_write() line... it's already a char *. Be wary of casting as it can hide an unintentional mismatch of types if the type of the variable changes. Also, the indentation in your code is all over the place. I realize you're just learning, but get in the habit of good practices here. It's really hard to read through your skel_write() implementation. The are likely a few other issues there, and something as simple as correct indentation can help readers understand the flow, and potentially see the issue.

Finally, don't give up. Kernel programming is hard: there are lots of moving parts, concurrency, locking, caching, and a very asynchronous style of programming. OTOH, you're down close to the bare metal of your processor and system, and it's quite rewarding.

John Szakmeister
  • 44,691
  • 9
  • 89
  • 79
  • Hello John sir, Everything in skel_write was messing with kernel, passing the data was not the actual culprit. I think calling skel_write() function was the mistake as every operation was leading to kernel crash after few successful operations. So, finally I re-structured the code, put the initialization part of URB in probe as many drivers do and fired data(usb_submit_urb) from kthread and it worked fine. My next attempt is to add a queue so that multiple requests will be stored in queue while the driver is busy serving current request. Is it feasible enough to move on such type of driver? – Mitesh G Feb 22 '14 at 04:42
0

Mit,

May you need to scrutinize the kthreads behaviour w.r.t operations you are doing from inside it

See: In what context Kernel Thread runs in Linux?

Community
  • 1
  • 1
Sun
  • 1,505
  • 17
  • 25
  • Hello Sir, After going through all operations, I'm still uncertain about the exact reason cause almost every statement is causing an error. 1 thing I noticed that while writing to device via thread, if I switch tabs of terminal then kernel used to crash immediately. Sounds weird but kernel can change thread's state and running CPU which might be causing some issues. Calling skel_write() from thread was the mistake. So, I re-structured the code, put the initialization part of URB in probe as many drivers do and fired data(usb_submit_urb) from kthread and it worked fine. – Mitesh G Feb 22 '14 at 04:48
  • My next attempt is to add a queue so that multiple requests will be stored in queue while the driver is busy serving current request. Is it feasible enough to move on such type of driver? – Mitesh G Feb 22 '14 at 04:49