I am adding SMP support to the Linux kernel for the Marvell PXA2128 ARM SoC, using Linus Torvalds' mainline kernel, version 3.5, as the base. With my SMP changes the kernel boots with the second core, but it sometimes crashes with an "Attempted to kill init!" message; that is, the init process of the initramfs dies and I do not know why. I suspected that the L1 cache of the second core was corrupted, so I now invalidate that core's L1 cache before it enters the Linux kernel. The crash log looks like this:

[   16.413024] Freeing init memory: 268K
[   16.658111] tmpfs: No value for mount option 'strictatime'
[   16.809997] scsi 0:0:0:0: Direct-Access     SanDisk  Cruzer Blade     1.27 PQ: 0 ANSI: 6
[   16.827545] tmpfs: No value for mount option 'strictatime'
[   16.972473] sd 0:0:0:0: Attached scsi generic sg0 type 0
[   16.972930] sd 0:0:0:0: [sda] 15330304 512-byte logical blocks: (7.84 GB/7.30 GiB)
[   16.974487] sd 0:0:0:0: [sda] Write Protect is off
[   16.976104] sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[   17.369537]  sda: sda1 sda2
[   17.377258] sd 0:0:0:0: [sda] Attached SCSI removable disk
[   17.602966] tmpfs: No value for mount option 'strictatime'
[   18.074981] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000100
[   18.074981] 
[   18.222442] [<c001691c>] (unwind_backtrace+0x0/0x128) from [<c046beb0>] (dump_stack+0x20/0x24)
[   18.301177] [<c046beb0>] (dump_stack+0x20/0x24) from [<c046bfcc>] (panic+0x94/0x1d4)
[   18.378265] [<c046bfcc>] (panic+0x94/0x1d4) from [<c002ad08>] (do_exit+0x390/0x7ac)
[   18.454956] [<c002ad08>] (do_exit+0x390/0x7ac) from [<c002b37c>] (do_group_exit+0x0/0xc4)
[   18.533111] CPU0: stopping
[   18.605621] [<c001691c>] (unwind_backtrace+0x0/0x128) from [<c046beb0>] (dump_stack+0x20/0x24)
[   18.687225] [<c046beb0>] (dump_stack+0x20/0x24) from [<c00145ac>] (handle_IPI+0x104/0x174)
[   18.770141] [<c00145ac>] (handle_IPI+0x104/0x174) from [<c0008590>] (gic_handle_irq+0x60/0x68)
[   18.854888] [<c0008590>] (gic_handle_irq+0x60/0x68) from [<c000e6c0>] (__irq_svc+0x40/0x70)
[   18.940093] Exception stack(0xc0685f38 to 0xc0685f80)
[   19.022155] 5f20:                                                       c06c4c28 a0000093
[   19.108398] 5f40: 00000001 60400100 c0684000 c06c48c8 c047560c c0691438 c177a080 562f5842
[   19.188018] SMP: failed to stop secondary CPUs
[   19.275970] 5f60: 00000000 c0685f9c c0685f40 c0685f80 c0020668 c0010068 60000013 ffffffff
[   19.363708] [<c000e6c0>] (__irq_svc+0x40/0x70) from [<c0010068>] (cpu_idle+0x94/0xdc)
[   19.450805] [<c0010068>] (cpu_idle+0x94/0xdc) from [<c0462d24>] (rest_init+0x7c/0x94)
[   19.537292] [<c0462d24>] (rest_init+0x7c/0x94) from [<c063f89c>] (start_kernel+0x328/0x380)
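
One detail in the log above can be decoded mechanically. As far as I understand it, the `exitcode` value printed by the panic uses the usual Linux wait-status layout (terminating signal in the low 7 bits, exit value in bits 8-15). Under that assumption, the minimal sketch below splits `0x00000100` into its fields; `decode_wait_status` is a hypothetical helper name, not a kernel or library function, and stopped/continued statuses are deliberately not handled.

```python
def decode_wait_status(status: int) -> dict:
    """Split a Linux wait-status word into its fields.

    Assumption: the value follows the classic encoding used by wait(2):
    bits 0-6 = terminating signal (0 if none), bit 7 = core-dump flag,
    bits 8-15 = value passed to exit().
    """
    term_signal = status & 0x7F
    core_dumped = bool(status & 0x80)
    exit_code = (status >> 8) & 0xFF
    return {
        "exited_normally": term_signal == 0,
        "exit_code": exit_code if term_signal == 0 else None,
        "signal": term_signal or None,
        "core_dumped": core_dumped,
    }


# Value taken from the panic line in the log above.
info = decode_wait_status(0x00000100)
print(info)
```

If that reading is right, init called `exit(1)` on its own rather than being killed by a signal such as SIGSEGV, so the init script or binary in the initramfs may be worth checking alongside the SMP code.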

When the kernel does boot successfully, the output of cat /proc/interrupts looks like this:

bash-4.2# cat /proc/interrupts 
           CPU0       CPU1       
 39:        101          0       GIC  pxa_i2c-i2c
 45:          0      26692       GIC  timer0
 46:      26578          0       GIC  timer1
 52:         27          0       GIC  olpc-ec-1.75
 58:          0          0       GIC  mmp-vmeta
 60:       1142          0       GIC  UART3
 71:          0          0       GIC  mmc2
 72:          0          0       GIC  olpc-kbd
 73:      25978          0       GIC  pxa168fb-dss
 76:       2282          0       GIC  ehci_hcd:usb1
 84:        246          0       GIC  mmc0
 85:       9820          0       GIC  mmc1
132:          0          0       ICU  rtc Alrm
133:          0          0       ICU  rtc 1Hz
137:          0          0       ICU  galcore interrupt service
139:         88          0       ICU  galcore interrupt service for 2D
141:         20          0       ICU  pxa_i2c-i2c
143:          0          0       ICU  pxa_i2c-i2c
145:        328          0       ICU  pxa_i2c-i2c
186:         52          0       ICU  mmc3
252:          0          0      GPIO  hsdet-gpio
253:          0          0      GPIO  hdmi-hpd
270:          0          0      GPIO  d4280000.sdhci cd
278:          0          0      GPIO  sdhci_wakeup_irq, sdhci_wakeup_irq, sdhci_wakeup_irq, sdhci_wakeup_irq
335:          0          0      GPIO  micdet-gpio
365:          0          0      GPIO  DCON
368:          0          0      GPIO  olpc-switch-1.75-lid
369:          0          0      GPIO  olpc-switch-1.75-ebook
393:          0          0      GPIO  olpc-ec-1.75-wake
IPI0:          0          0  Timer broadcast interrupts
IPI1:       3508       4069  Rescheduling interrupts
IPI2:          0          0  Function call interrupts
IPI3:          7        429  Single function call interrupts
IPI4:          0          0  CPU stop interrupts
Err:          0
bash-4.2#
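
To compare runs, the per-CPU counts above can be summarized with a small script. This is only a sketch: `parse_interrupts` and `ipi_totals` are hypothetical helper names, the sample text is an excerpt of the output above, and the parser assumes the input starts with the `CPU0 CPU1 ...` header line. Non-zero IPI totals on every CPU suggest that SGIs are delivered during normal operation; they say nothing about whether the stop IPI is handled at panic time.

```python
def parse_interrupts(text: str) -> dict:
    """Map each interrupt label to its list of per-CPU counts.

    Assumes the first non-empty line is the CPU header. Rows that do not
    carry one count per CPU (for example "Err:") are skipped.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ncpu = len(lines[0].split())
    table = {}
    for ln in lines[1:]:
        label, sep, rest = ln.partition(":")
        if not sep:
            continue
        counts = []
        for tok in rest.split()[:ncpu]:
            if not tok.isdigit():
                break
            counts.append(int(tok))
        if len(counts) == ncpu:
            table[label.strip()] = counts
    return table


def ipi_totals(table: dict) -> list:
    """Sum all IPI rows per CPU."""
    rows = [counts for label, counts in table.items() if label.startswith("IPI")]
    return [sum(col) for col in zip(*rows)]


SAMPLE = """\
           CPU0       CPU1
 45:          0      26692       GIC  timer0
 46:      26578          0       GIC  timer1
IPI1:       3508       4069  Rescheduling interrupts
IPI3:          7        429  Single function call interrupts
IPI4:          0          0  CPU stop interrupts
Err:          0
"""

table = parse_interrupts(SAMPLE)
totals = ipi_totals(table)
print(totals)
```

On a live system the same functions could be fed the contents of `/proc/interrupts` read from the file.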

Please give me a clue as to what important step I am missing in the SMP implementation.

SSC
Darshan Prajapati
  • This question is near impossible to answer. Please get and look at the u-boot source for your CPU/family. There are all sorts of things that must be done to get SMP to work. The [GIC irqchip driver calls `handle_IPI()`](https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/irqchip/irq-gic.c#n262). That code is in [arm/kernel/smp.c](https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/arch/arm/kernel/smp.c); these are some of the messages and stack traces in your log. There are many things that need to be set up, and some are very dependent on the SoC. – artless noise Dec 07 '14 at 19:51
  • In this case it looks like the 2nd CPU is supposed to respond to a *peripheral private* or *CPU-to-CPU* interrupt. This seems to imply that the per-CPU GIC register bank of the 2nd CPU is set up; or this is a completely wrong path for this SoC. Generally the non-boot cores are stuck on `WFI/WFE` while booting. Do you know what the 2nd core is doing? – artless noise Dec 07 '14 at 20:00
  • The second core comes out of WFI when the kernel boots: the primary CPU sends it an SGI and it starts executing the kernel. What general settings for the second core could I be missing? – Darshan Prajapati Dec 08 '14 at 04:54
  • **CPU0: stopping** and **SMP: failed to stop secondary CPUs**. Why isn't the second core responding to this? Are you sure the local GIC registers are set correctly for the 2nd core so it gets the SGI/IPI interrupt? Can you say what SOC it is? Can you tell us what Linux version? – artless noise Dec 08 '14 at 17:43
  • @artlessnoise I have edited the question and added info about Kernel Version and SoC name. Please have a look. – Darshan Prajapati Dec 09 '14 at 05:30

0 Answers