I am first of all looking for debugging tips. If some one can point out the one line of code to change or the one peripheral config bit to set to fix the problem, that would be terrific. But that's not what I'm hoping for; I'm looking more for how do I go about debugging it.
Googling "msleep hang linux kernel site:stackoverflow.com" yields 13 answers and none is on the point, so I think I'm safe to ask.
I rebuild an ARM Linux kernel for an embedded TI AM1808 ARM processor (Sitara/DaVinci?). I see the all the boot log up to the login: prompt coming out of the serial port, but trying to login gets no response, doesn't even echo what I typed.
After lots of debugging I arrived at the kernel and added debugging code between line 828 and 830 (yes, kernel version is 2.6.37). This is at this point in the kernel mode before 'sbin/init' is called:
http://lxr.linux.no/linux+v2.6.37/init/main.c#L815
Right before line 830 I added a forever loop printk and I see the results. I have let it run for about a couple of hour and it counts to about 2 million. Sample line:
dbg:init/main.c:1202: 2088430
So it has spit out 60 million bytes without problem.
However, if I add msleep(1000) in the loop, it prints only once, i.e. msleep () does not return.
Details: Adding a conditional printk at line 4073 in the scheduler that condition on a flag that get set at the start of the forever test loop described above shows that the schedule() is no longer called when it hangs:
http://lxr.linux.no/linux+v2.6.37/kernel/sched.c#L4064
The only selections under .config/'Device Drivers' are: Block devices I2C support SPI support
The kernel and its ramdisk are loaded using uboot/TFTP. I don't believe it tries to use the Ethernet. Since all these happened before '/sbin/init', very little should be happenning.
More details: I have a very similar board with the same CPU. I can run the same uImage and the same ramdisk and it works fine there. I can login and do the usual things.
I have run memory test (64 MB total, limit kernel to 32M and test the other 32M; it's a single chip DDR2) and found no problem. One board uses UART0, and the other UART2, but boot log comes out of both so it should not be the problem.
Any debugging tips is greatly appreciated. I don't have an appropriate JTAG so I can't use that.