I've managed to compile a driver for an ARM-based device, but the driver crashes when I try to load it. Here is the output from /proc/cpuinfo:

Processor       : ARMv7 Processor rev 2 (v7l)
BogoMIPS        : 999.42
Features        : swp half thumb fastmult vfp edsp neon vfpv3
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x3
CPU part        : 0xc08
CPU revision    : 2

Here is the uname -r output:

2.6.37

modinfo driver.ko

filename:       cp210x.ko
description:    Silicon Labs CP210x RS232 serial adaptor driver
license:        GPL
vermagic:       2.6.37 mod_unload ARMv7
vermagic:       2.6.37 mod_unload modversions ARMv5
parm:           debug:Enable verbose debugging messages

As you can see, I've added an extra vermagic (2.6.37 mod_unload ARMv7) so it will match the target system.

So if I understand this correctly, I've compiled this module for an ARMv5 CPU, while the target is ARMv7. Could this be the cause of the device driver crashing?
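
As a rough illustration of why the vermagic matters: the module loader compares the module's vermagic string with that of the running kernel. The two strings shown by modinfo differ in both the modversions flag and the architecture tag. The following is a simplified userspace sketch, not the kernel's actual code. `vermagic_matches`, `TARGET_MAGIC` and `BUILT_MAGIC` are hypothetical names, and the sketch assumes a plain byte-for-byte comparison.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: a simplified model of a vermagic check.
 * Assumption: the two strings must be byte-for-byte identical. */
static bool vermagic_matches(const char *kernel_magic, const char *module_magic)
{
    return strcmp(kernel_magic, module_magic) == 0;
}

/* The string the target expects and the string the build produced,
 * both copied from the modinfo output above. */
static const char *const TARGET_MAGIC = "2.6.37 mod_unload ARMv7";
static const char *const BUILT_MAGIC  = "2.6.37 mod_unload modversions ARMv5";
```

Here `vermagic_matches(TARGET_MAGIC, BUILT_MAGIC)` is false. Adding a second vermagic string to the module only hides this mismatch; it does not change how the module was compiled.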

The device already has this driver, but it is embedded in another driver package from the hardware vendor. That package also loads some drivers that we cannot use, so the package is not loaded. Still, I guess this indicates that the driver should somehow work on this hardware.

Here is the crash log:

modprobe cp210x.ko
Unable to handle kernel NULL pointer dereference at virtual address 0000000a
pgd = ca1fc000
[0000000a] *pgd=870dd031, *pte=00000000, *ppte=00000000
Internal error: Oops: 17 [#1]
last sysfs file: /sys/kernel/uevent_seqnum
Modules linked in: dahdi_dummy dahdi cmemk syslink ipt_MASQUERADE nf_nat iptable_filter ip_tables ipt_LOG xt_state nf_conntrack_ftp nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 xt_recent xt_mac xt_limit work_led reset_button ipv6
CPU: 0    Not tainted  (2.6.37 #1)
PC is at sys_init_module+0xfe0/0x1460
LR is at sys_init_module+0xe7c/0x1460
pc : [<c00836e8>]    lr : [<c0083584>]    psr: 20000013
sp : cc5e9ed0  ip : bf3828dc  fp : cc5e8000
r10: bf385ca8  r9 : cf3bcb4e  r8 : 000000c5
r7 : 00000027  r6 : bf382544  r5 : bf38266c  r4 : bf385ca8
r3 : 00000000  r2 : c7c9f000  r1 : 0000000a  r0 : 0000000a
Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: 10c5387d  Table: 8a1fc019  DAC: 00000015
Process modprobe (pid: 2676, stack limit = 0xcc5e82e8)
Stack: (0xcc5e9ed0 to 0xcc5ea000)
9ec0:                                     bf382544 00000001 000ac048 bf382550
9ee0: 000000c5 cf3bd5a4 cf3b8000 000055f4 cf3bd20c cf3bd128 cf3bc2a0 c7c9f000
9f00: 0000266c 000028dc 00000000 00000000 00000017 00000018 00000010 0000000d
9f20: 00000009 00000000 6e72656b 00006c65 00000000 00000000 00000000 00000000
9f40: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
9f60: 00000000 00000000 c9a19540 00000000 ca2403c0 00000006 c9a19540 00000000
9f80: ca2403c0 000055f4 00000000 00000006 00000080 c0037c28 cc5e8000 00000000
9fa0: 00000001 c0037a80 000055f4 00000000 000ac998 000055f4 000ac048 000ac978
9fc0: 000055f4 00000000 00000006 00000080 000ac008 000ac028 000ac998 00000001
9fe0: bebaf968 bebaf958 00017764 40214740 60000010 000ac998 c1e38bcc 03de8ad9
[<c00836e8>] (sys_init_module+0xfe0/0x1460) from [<c0037a80>] (ret_fast_syscall+0x0/0x30)
Code: e7923103 e1a03133 e3130001 15963128 (17d33000)
---[ end trace 6e8943127db36208 ]---
Segmentation fault
Mr Zach
  • After some more trying, I now get this error when I try to load the .ko file. Running `modprobe -v /lib/modules/2.6.37/misc/cp210x.ko` prints `cp210x: Unknown symbol mutex_lock_nested (err 0)` followed by `modprobe: can't load module /lib/modules/2.6.37/misc/cp210x.ko (/persistent/etc/cp210x.ko): unknown symbol in module, or unknown parameter`. But I can't find "mutex_lock_nested" anywhere in the cp210x.c file. Should support for this be disabled somewhere in the kernel config? – Mr Zach Mar 28 '17 at 23:59
  • Please take a look at this [link](http://stackoverflow.com/questions/26039351/kernel-module-wont-link-symbol-mutex-lock-nested-not-found). Maybe you can get some help from it. – Gaurav Pathak Mar 29 '17 at 09:14
  • Could not get any help from that link, but thank you – Mr Zach Mar 29 '17 at 22:06

2 Answers

I had to change the cp210x.c file and comment out every use of the mutex. This was the only place:

static void cp210x_close(struct usb_serial_port *port)
{
        dbg("%s - port %d", __func__, port->number);

        usb_serial_generic_close(port);

        /* mutex_lock(&port->serial->disc_mutex);*/
        if (!port->serial->disconnected)
                cp210x_set_config_single(port, CP210X_IFC_ENABLE, UART_DISABLE);
        /* mutex_unlock(&port->serial->disc_mutex);*/
}
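
One possible explanation for the `mutex_lock_nested` symbol, even though cp210x.c never names it: in kernel headers of this era, `mutex_lock()` becomes a macro that expands to `mutex_lock_nested(lock, 0)` when the build tree has `CONFIG_DEBUG_LOCK_ALLOC` enabled. If the module was built against a tree with that option on while the device's kernel was built with it off, the symbol would be missing at load time. Whether that is what happened here is an assumption. The userspace sketch below only mimics the mechanism; `DEBUG_LOCK_ALLOC`, `STR`, `referenced_symbol` and `uses_nested_variant` are hypothetical names.

```c
#include <string.h>

/* Hypothetical stand-in for CONFIG_DEBUG_LOCK_ALLOC in the build tree. */
#ifndef DEBUG_LOCK_ALLOC
#define DEBUG_LOCK_ALLOC 1
#endif

#if DEBUG_LOCK_ALLOC
/* With the debug option on, the preprocessor rewrites the plain call. */
#define mutex_lock(lock) mutex_lock_nested(lock, 0)
#endif

/* Two-level stringification so the argument is macro-expanded first. */
#define STR_INNER(x) #x
#define STR(x) STR_INNER(x)

/* Returns the text the compiler sees for a mutex_lock() call site. */
static const char *referenced_symbol(void)
{
    return STR(mutex_lock(m));
}

/* Nonzero when the call site ends up referencing mutex_lock_nested. */
static int uses_nested_variant(void)
{
    return strstr(referenced_symbol(), "mutex_lock_nested") != NULL;
}
```

With `DEBUG_LOCK_ALLOC` set to 1 the call site references `mutex_lock_nested`; compiling with `-DDEBUG_LOCK_ALLOC=0` leaves the plain `mutex_lock` name. If this is the cause, building against headers configured like the device's kernel would avoid the need to comment out the locking.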
Mr Zach
  • No, you should not remove those mutex lock calls; they are there for a reason. If your module is accessed by more than one process, the absence of the mutex lock can cause other processes to behave unpredictably. Does your module include the `linux/mutex.h` file? – Gaurav Pathak Mar 30 '17 at 09:28
  • Yes, I should not remove it, but I cannot load this device driver on the device with these mutex calls. Each USB serial port will only be accessed by one process, and if this process crashes, the device will reboot. I have no control over the Linux kernel installed on the device. I guess the reason I have to remove this is that the kernel was built without support for mutexes (I don't know if that's even an option). This device driver does not include linux/mutex.h, but I tried adding it to see if it made any difference, and it did not. – Mr Zach Mar 30 '17 at 21:03

Are you trying to load a kernel module that was compiled for one kernel into another kernel? Linux modules (what you call drivers) are only supposed to be loaded into the kernel that they were compiled for. Even the same kernel version built with a different configuration or different compiler settings could render the module incompatible. So playing with the version magic is very dangerous.

Your driver is crashing because it is trying to access kernel data structures using an incorrect layout, so it is not actually reading the fields it thinks it is reading.
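
To make the layout problem concrete, here is a small userspace sketch. The structs are hypothetical, not real kernel structures. They model one object as laid out by a kernel built without a debug field, and the same object as seen by a module built from headers that include that field. The field the module wants, `disconnected`, ends up at a different offset in each view.

```c
#include <stddef.h>

/* Hypothetical lock type as the running kernel was built (no debug data). */
struct lock_plain { int count; };

/* The same lock as seen through headers with a debug option enabled. */
struct lock_debug { int count; void *owner; const char *name; };

/* Hypothetical device struct: the kernel's view. */
struct dev_kernel_view { struct lock_plain lock; int disconnected; };

/* The same struct: the module's view, compiled with the other config. */
struct dev_module_view { struct lock_debug lock; int disconnected; };

/* Byte offsets at which each side expects to find `disconnected`. */
static const size_t KERNEL_OFFSET = offsetof(struct dev_kernel_view, disconnected);
static const size_t MODULE_OFFSET = offsetof(struct dev_module_view, disconnected);
```

A module that reads at `MODULE_OFFSET` from an object the kernel laid out using `KERNEL_OFFSET` reads unrelated memory, which is the kind of mismatch that can end in an oops like the one in the question.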

Changing the architecture from ARMv7 to ARMv5 is a very drastic configuration change that can change the memory layout of kernel data structures.

Unlike some other operating systems like Windows, Linux does not have an abstraction layer or fixed memory layouts that let you load the same loadable module into different versions of the kernel.

Vlad
  • What I'm trying to do is cross-compile a module for a device where there is no possibility of compiling anything (just some hardware with a Linux image). I used arm-linux-gnueabihf to compile it, and then it changes from v5 to v6. My understanding of the difference between v5, v6 and v7 is that ARM introduced support for hard float in v6. By now the module loads when I modprobe it, but I haven't tested any hardware communication yet. – Mr Zach Mar 30 '17 at 21:18
  • Usually the Linux kernel and drivers do not use floating point unless it is something highly specialized. It's fine that you cannot compile on the device itself; I cross-compile all the time. What you need to do is cross-compile both the kernel and all the modules for it, including the drivers. Then transfer that version of the kernel to the device. Is your device supported by the Poky project? It makes things much easier. – Vlad Mar 31 '17 at 00:48