I have a working PXE server running Puppet Razor (now end-of-life, but we need it to keep working a little longer). It has built hundreds of systems for us.
I can go to most systems here and manually tftp files from that server and get files whose MD5SUMs match perfectly.
We have some systems in a remote location, though, that can't TFTP any files properly. They get their DHCP address but fail to download the vmlinuz file needed to continue. If I go to a system there that is up and running and manually tftp a file, I get a file whose MD5SUM is incorrect. If I repeat the transfer, I always get exactly the same incorrect MD5SUM. If I instead rsync the file from the tftp server, I get exactly the file I expect, with the correct MD5SUM.
The tftp transfers are painfully slow, often taking 30-60 seconds for a file that rsync transfers in under a second. So network bandwidth isn't the issue. Something else is going on.
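Since the bad copy is identical every time, one thing I could do to gather more data is compare it against the good rsync copy and see where they differ. Below is a minimal sketch of that comparison, not something I have run against the real files. It assumes both copies are saved locally and small enough to read into memory; the function names are made up for illustration, and the 512-byte block size is just the TFTP default, which would be wrong if a larger blksize is being negotiated.

```python
import hashlib
import sys

TFTP_BLOCK = 512  # default TFTP data block size; an assumption if blksize is negotiated


def md5_of(path):
    """Return the hex MD5 digest of a file, read in chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def differing_blocks(good_path, bad_path, block_size=TFTP_BLOCK):
    """Return (good_size, bad_size, indices of blocks whose bytes differ)."""
    with open(good_path, "rb") as g, open(bad_path, "rb") as b:
        good = g.read()
        bad = b.read()
    n_blocks = (max(len(good), len(bad)) + block_size - 1) // block_size
    diffs = []
    for i in range(n_blocks):
        start = i * block_size
        # Slices past the end of the shorter file are empty, so a length
        # mismatch shows up as differing trailing blocks.
        if good[start:start + block_size] != bad[start:start + block_size]:
            diffs.append(i)
    return len(good), len(bad), diffs


if __name__ == "__main__":
    if len(sys.argv) == 3:
        good_path, bad_path = sys.argv[1], sys.argv[2]
        good_size, bad_size, diffs = differing_blocks(good_path, bad_path)
        print("good:", good_size, "bytes, md5", md5_of(good_path))
        print("bad: ", bad_size, "bytes, md5", md5_of(bad_path))
        print("differing blocks:", len(diffs), "first few:", diffs[:10])
    else:
        print("usage: compare.py GOOD_FILE BAD_FILE", file=sys.stderr)
```

Run as, for example, `python3 compare.py vmlinuz.rsync vmlinuz.tftp` (both file names are placeholders). Whether the two copies differ in length, or only in certain blocks, should at least narrow down what is happening to the data in transit.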
Where should I start looking to debug this? It's darned weird.