Our team is currently working on a custom device.
It consists of a Cyclone V board with a COM Express amd64-based PC plugged into it. The board acts as a PCIe native endpoint: it powers on first, then powers on the PC. The PC runs Linux with kernel 4.10, plus drivers and software that access PCI BAR0 via MMIO.
The system works flawlessly until the first reboot from the Linux terminal. On the next boot, MMIO read access is broken, while MMIO write access is fine.
Say there are two offsets to read, A and B, holding values 0xa and 0xb respectively. If we read bytes from these offsets, the values retrieved appear to be delayed by 8 read operations:
- read A ten times - returns 0xa every time
- read B eight times - returns 0xa every time
- read B ten times - returns 0xb every time
- read A once - returns 0xb
- read B seven times - returns 0xb every time
- read B once - returns 0xa
- read B ten times - returns 0xb every time
If offsets A and B fall within the same 64-bit word, everything works as expected.
MMIO access is done via the readb/readw/readl/readq functions; which accessor is used does not affect this delay at all.
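For what it's worth, the pattern above is exactly what we would expect if every read returned the data of the read issued eight reads earlier, as if stale completions sat in an 8-entry queue somewhere on the read path. Here is a toy model we put together to check that (the 8-entry depth and the "primed with 0xa" initial state are our inferences from the observed sequence, not anything we have confirmed in hardware):

```python
from collections import deque

DEPTH = 8  # assumed queue depth, inferred from the 8-read lag we observe

# regs stands in for BAR0; offsets "A" and "B" hold 0xa and 0xb.
regs = {"A": 0x0A, "B": 0x0B}

# Pending stale completions, primed as if offset A had been read beforehand
# (an assumption, chosen to match the first bullet above).
fifo = deque([0x0A] * DEPTH)

def broken_read(off):
    """One 'broken' MMIO read: return the value queued 8 reads ago,
    and enqueue the register's true current value."""
    stale = fifo.popleft()
    fifo.append(regs[off])
    return stale

def run_sequence():
    """Replay the exact read sequence from the bullet list; True if the
    toy model reproduces every observed return value."""
    ok = []
    ok.append(all(broken_read("A") == 0x0A for _ in range(10)))  # A x10 -> 0xa
    ok.append(all(broken_read("B") == 0x0A for _ in range(8)))   # B x8  -> 0xa
    ok.append(all(broken_read("B") == 0x0B for _ in range(10)))  # B x10 -> 0xb
    ok.append(broken_read("A") == 0x0B)                          # A x1  -> 0xb
    ok.append(all(broken_read("B") == 0x0B for _ in range(7)))   # B x7  -> 0xb
    ok.append(broken_read("B") == 0x0A)                          # B x1  -> 0xa
    ok.append(all(broken_read("B") == 0x0B for _ in range(10)))  # B x10 -> 0xb
    return all(ok)
```

This model reproduces the full sequence above, including the odd single 0xa after the seven 0xb reads, so the symptom is at least self-consistent with an 8-deep stale-completion queue.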
Subsequent reboots may fix MMIO reads or break them again.
On the Linux side, mmiotrace shows the same picture, with the broken data.
On the device side, the SignalTap logic analyzer shows valid data values on the PCIe core bus.
We have no PCIe bus analyzer, so we have no way to inspect the data exchange between those two points.
What could cause such behaviour, and how can it be fixed?