
I have a device that I want to hotplug. Its BARs are larger than those of a standard NVMe device, and my BIOS (Dell R640) has no option for preallocating BAR space.

When I boot with the slots empty and then add the cards, I get the following errors:

[  120.951915] pci 0000:67:00.0: BAR 0: no space for [mem size 0x02000000]
[  120.951919] pci 0000:67:00.0: BAR 0: failed to assign [mem size 0x02000000]
[  120.951923] pci 0000:67:00.1: BAR 0: no space for [mem size 0x02000000]
[  120.951927] pci 0000:67:00.1: BAR 0: failed to assign [mem size 0x02000000]
[  120.951931] pci 0000:67:00.1: BAR 1: no space for [mem size 0x00020000]
[  120.951935] pci 0000:67:00.1: BAR 1: failed to assign [mem size 0x00020000]
[  120.951939] pci 0000:67:00.0: BAR 1: no space for [mem size 0x00010000]
[  120.951942] pci 0000:67:00.0: BAR 1: failed to assign [mem size 0x00010000]
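
For reference, the sizes in these messages decode to 32 MiB (0x02000000), 128 KiB (0x00020000) and 64 KiB (0x00010000). The following is a minimal sketch for totalling what failed to assign, assuming the messages keep the format shown above. The function name `sum_failed_bars` is made up for illustration, and the result is only a lower bound because bridge windows also have alignment requirements.

```shell
# Sum the sizes from "failed to assign" kernel messages read on stdin.
# Hypothetical helper; assumes the "[mem size 0x...]" format shown above.
sum_failed_bars() {
    total=0
    while IFS= read -r line; do
        case "$line" in
            *"failed to assign"*"size 0x"*)
                hex=${line##*size 0x}   # drop everything through "size 0x"
                hex=${hex%%\]*}         # drop the trailing "]"
                total=$(( total + 0x$hex ))
                ;;
        esac
    done
    echo "$total"
}

# Usage on a live system (may need root to read the kernel log):
#   dmesg | sum_failed_bars
```

For the eight lines above this prints 67305472, that is, 64 MiB plus 192 KiB.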
 

My /etc/default/grub is the following:

GRUB_CMDLINE_LINUX="crashkernel=auto spectre_v2=retpoline rd.lvm.lv=centos/root rd.lvm.lv=centos/swap rhgb quiet pci=assign-busses,hpbussize=4,realloc=on,hpmemsize=8G"

My grub.cfg also contains the following additions:

pciehp.pciehp_force=1 pci=pcie_bus_perf
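
Since the options are split between /etc/default/grub and hand edits to grub.cfg, it may be worth confirming what the running kernel actually received. The following is a small sketch, assuming a POSIX shell; `list_pci_opts` is a hypothetical helper name, not an existing tool.

```shell
# Print every option passed through pci=... on a kernel command line,
# one option per line. Hypothetical helper for illustration only.
list_pci_opts() {
    # Intentionally unquoted so the command line splits on whitespace.
    for word in $1; do
        case "$word" in
            pci=*)
                printf '%s\n' "${word#pci=}" | tr ',' '\n'
                ;;
        esac
    done
}

# Usage on a live system:
#   list_pci_opts "$(cat /proc/cmdline)"
```

If an option from either file is missing from the output, the bootloader configuration was probably not regenerated or was overwritten.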

None of these make hotplug work unless I first remove the root port:

echo 1 > /sys/bus/pci/devices/<root_port>/remove
*add the card*
echo 1 > /sys/bus/pci/rescan
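
The `<root_port>` placeholder can be filled in from sysfs: the resolved path of a device lists every bridge above it, and the first component after the `pciDDDD:BB` host bridge directory is the root port. The following is a sketch under that assumption; `root_port_of_path` is a hypothetical helper, and the topology in the comment is invented for illustration.

```shell
# Given a resolved sysfs device path, print the first PCI function below
# the host bridge directory (the root port for a device behind bridges).
# Hypothetical helper for illustration only.
root_port_of_path() {
    # Strip the prefix up to and including ".../pciDDDD:BB/".
    rest=${1#*/pci[0-9a-f][0-9a-f][0-9a-f][0-9a-f]:[0-9a-f][0-9a-f]/}
    # Keep only the first path component that remains.
    printf '%s\n' "${rest%%/*}"
}

# Usage on a live system, once the card has been enumerated:
#   root_port_of_path "$(readlink -f /sys/bus/pci/devices/0000:67:00.0)"
# A path such as
#   /sys/devices/pci0000:64/0000:64:00.0/0000:65:00.0/0000:67:00.0
# (invented topology) would give 0000:64:00.0.
```

For a device attached directly to the host bridge, the function returns the device itself.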

This assigns all the BARs as needed, but it takes down the entire root port and every device below it. Simply rescanning without the removal doesn't help.

How can I tell the kernel to preallocate this space so I don't need to remove the entire root port? I know that the assigned BDF will be exactly the same every time (the cards will be in the same physical ports, with the same BAR sizes, etc.). I feel like this has to be an option.

  • The problem may be that the upstream PCIe switches need to know the sizes as well as the device does. This can't be changed while any devices on the root complex are running. It needs to be done by the platform firmware before boot. – stark Dec 10 '20 at 21:54
  • @stark That makes sense, but if I get full access when I remove and rescan the top-level root port, why can't I pass a boot argument or something to hard-allocate it at kernel startup? – IDLacrosseplayer Dec 10 '20 at 23:30
  • That's too late. All the device IDs and BARs are set up by the firmware before the kernel boots. There's some code I didn't know about for SR-IOV post-boot allocation that you might be able to leverage. See drivers/pci/setup-bus.c – stark Dec 11 '20 at 19:38
  • @stark, actually you can apply quite a lot of quirks in the kernel for a particular device. I think what the OP is talking about is the PCI bridge windows, which for some reason are not big enough. I recommend switching to 64-bit MMIO for the PCI bridge (something like *Above 4G MMIO* in the BIOS menu). In that case the PCI root bridge gets a lot of space for devices. You may also check the MTRR and PAT settings to see whether something sets regions to UC- (usually done for IO). To the OP: without a dump of `dmesg` and `lspci -t` it's hard to tell what exactly the bottleneck is. – 0andriy Dec 12 '20 at 20:40
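
The bridge-window theory from the last comment can be checked from sysfs. Each PCI device directory contains a `resource` file with one line per resource, giving start, end and flags in hex. The following is a minimal sketch that prints the size of every populated entry; `resource_sizes` is a hypothetical helper. Which indices correspond to bridge windows depends on the kernel configuration, so the output is best compared against `lspci -vv` for the same bridge.

```shell
# Read a sysfs "resource" file on stdin and print "<index> <size in bytes>"
# for each populated entry. Hypothetical helper for illustration only.
resource_sizes() {
    idx=0
    while read -r start end flags; do
        # Unused entries are all zeros; skip them.
        if [ -n "$end" ] && [ "$(( $end ))" -ne 0 ]; then
            echo "$idx $(( $end - $start + 1 ))"
        fi
        idx=$(( idx + 1 ))
    done
}

# Usage on a live system (<bridge> is the BDF of the port to inspect):
#   resource_sizes < /sys/bus/pci/devices/<bridge>/resource
```

If the windows on the bridges above the slot are smaller than the total the card requests, that would be consistent with the "no space" messages in the question.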
