I am implementing an FPGA PCIe endpoint to prototype the interface for a project.
The FPGA platform I am using is a Synopsys HAPS DX7 S6 featuring a Xilinx Virtex-7 980T device. I program the FPGA through a Xilinx cable on the JTAG interface. Since I am doing a remote internship, I log in remotely to the Linux host that is connected to the cable and the FPGA for programming and experiments.
Currently, I start with the Xilinx DMA subsystem for PCIe (XDMA) IP, compile the example design, download the bitstream, and reboot my Linux host to enumerate the device. In detail:
- I instantiate the DMA subsystem for PCIe. (It looks like I am not allowed to embed images before earning 10 reputation, so please click the links.)
- I modify the top-level design and the constraint file.
a. Debug_clk_p/n clocks a VIO core that drives the reset signal, because I access the board remotely, as mentioned above, and cannot operate any switches or buttons by hand. The sys_rst_n port is therefore commented out.
b. Debug_clk_p/n is constrained to the onboard clock source. The sys_clk_p/sys_clk_n pair is constrained to package pins D7 and D8 on the PCIe bank.
FPGA platform PCIe bank pinout
- I download the bitstream to the FPGA via JTAG. I can confirm the bitstream is loaded successfully because I can control the VIO core.
a. I ran 'lspci | grep Xilinx' but did not find the device.
b. I ran 'echo 1 > /sys/bus/pci/rescan' to re-enumerate the PCI bus, but that did not work either.
c. The next step is supposed to be rebooting the host so that it enumerates the endpoint and allocates the memory. However, this is where the issues came up.
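The checks in steps a and b can also be scripted against sysfs, which is handy when they have to be repeated after every bitstream download. The following is only a minimal sketch: it assumes the standard Linux sysfs layout under /sys/bus/pci, the function names are made up for illustration, and 0x10EE is the PCI vendor ID assigned to Xilinx (an example design configured with a different vendor ID would need a different value).

```python
import os

XILINX_VENDOR_ID = 0x10EE  # PCI vendor ID assigned to Xilinx


def find_pci_devices(vendor_id, sysfs_root="/sys/bus/pci"):
    """Return sorted bus addresses (e.g. '0000:03:00.0') whose vendor ID matches."""
    devices_dir = os.path.join(sysfs_root, "devices")
    if not os.path.isdir(devices_dir):
        return []
    matches = []
    for bdf in sorted(os.listdir(devices_dir)):
        vendor_path = os.path.join(devices_dir, bdf, "vendor")
        try:
            with open(vendor_path) as f:
                # sysfs stores the ID as a hex string such as "0x10ee"
                if int(f.read().strip(), 16) == vendor_id:
                    matches.append(bdf)
        except (OSError, ValueError):
            # Skip entries that cannot be read or parsed
            continue
    return matches


def rescan_pci_bus(sysfs_root="/sys/bus/pci"):
    """Write '1' to the rescan file, like 'echo 1 > /sys/bus/pci/rescan'.

    On a real system this needs root privileges.
    """
    with open(os.path.join(sysfs_root, "rescan"), "w") as f:
        f.write("1")


if __name__ == "__main__":
    # Prints an empty list when no matching endpoint has been enumerated
    print(find_pci_devices(XILINX_VENDOR_ID))
```

The sysfs_root parameter exists only so the functions can be pointed at a copy of the directory tree for testing; on the host itself the defaults apply.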
Issue description:
The Linux host cannot boot up normally; I cannot log in to the host with my credentials as usual. We can confirm that the underlying cause is the FPGA PCIe endpoint bitstream, because the reboot completes normally when the FPGA is programmed with a bitstream that has nothing to do with PCIe.
My colleague came to the lab and checked in person. He said that the host does boot, but ethernet is affected, as shown in the figures. eth0 comes up in the standard cases (FPGA blank or programmed with a non-PCIe design), while eth0 fails to come up when the FPGA is programmed with the PCIe design. We deduce that this is why we cannot log in normally.
My colleague manually disconnected the FPGA from the host and rebooted the host. We found that we can log in again, but the IP address has changed. In other words, I now have to ssh to a different address to reach the host.
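To compare the working and failing boots more concretely, a script along these lines could record which bus address backs each network interface in each case. This is a sketch only: it assumes the usual /sys/class/net layout, in which a hardware interface has a 'device' symlink pointing at its backing device, and the function names are hypothetical. It does not by itself explain the failure; it only makes any difference between the two boots visible.

```python
import os


def net_interfaces_by_bus_address(sysfs_net="/sys/class/net"):
    """Map each interface name to the bus address of its backing device.

    Virtual interfaces such as 'lo' have no 'device' link and map to None.
    """
    result = {}
    if not os.path.isdir(sysfs_net):
        return result
    for name in sorted(os.listdir(sysfs_net)):
        device_link = os.path.join(sysfs_net, name, "device")
        if os.path.exists(device_link):
            # The link target's last path component is the bus address
            result[name] = os.path.basename(os.path.realpath(device_link))
        else:
            result[name] = None
    return result


def diff_snapshots(before, after):
    """Return {name: (before, after)} for interfaces that appear, vanish, or move."""
    changed = {}
    for name in sorted(set(before) | set(after)):
        if before.get(name) != after.get(name):
            changed[name] = (before.get(name), after.get(name))
    return changed
```

Saving the result of net_interfaces_by_bus_address() once with a non-PCIe bitstream and once with the PCIe design, then passing both to diff_snapshots(), would show whether eth0 disappeared, was renamed, or moved to a different bus address.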
Important additional points:
I think the connectivity between the FPGA PCIe interface and the host is okay, because when I first started, the FPGA PCIe endpoint could be detected. However, that bitstream was compiled from the Synopsys example project using Synopsys Protocompiler. Unfortunately, my colleague cannot locate the original bitstream now, and I cannot compile the example project because of tool version issues.
I tried the example designs of other IP cores, such as AXI for PCIe and the PCIe endpoint, with a similar configuration (mostly defaults and the example design) and similar constraints. The result is the same: I cannot log in.
Because the host cannot be rebooted normally and my colleague is still mostly working from home, we cannot troubleshoot very efficiently. For now, we have connected the JTAG cable to a separate personal laptop so that we can still debug if the original Linux host unexpectedly fails to start.
To be honest, the ethernet issue is kind of weird to me. As I understand it, this endpoint is just a peripheral, like a USB mouse or keyboard, so even if it has problems, it should not affect other devices much. Some other online posts mention BAR setting problems, but my 1 MB BAR0 should not cause any.
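Once the endpoint does enumerate, the BAR size the host actually assigned can be checked against the configured 1 MB. The sketch below parses the text of /sys/bus/pci/devices/<bdf>/resource, where each line holds a start address, an end address, and flags in hex, and unused regions are all zeros. The function name is made up for illustration and the addresses in the usage note are invented sample values, not taken from my system.

```python
def parse_pci_resource(text):
    """Parse the contents of a sysfs PCI 'resource' file.

    Returns a list of (index, start, size_in_bytes) for the regions in use.
    """
    regions = []
    for index, line in enumerate(text.splitlines()):
        fields = line.split()
        if len(fields) != 3:
            continue
        start, end, _flags = (int(x, 16) for x in fields)
        if start == 0 and end == 0:
            # Unused region
            continue
        # The end address is inclusive, hence the + 1
        regions.append((index, start, end - start + 1))
    return regions
```

For example, a first line of `0x00000000f7d00000 0x00000000f7dfffff 0x0000000000040200` would yield a region at index 0 with a size of 0x100000 bytes, which is 1 MB.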
Thanks for any possible suggestions (on implementation or debugging) or reasoning from you in advance! Please let me know if any details might be helpful so I can update my post.
Best, Tao