We're compiling xgboost v0.7 from source on a vanilla Ubuntu Docker image. This image is run on our EC2 instances in a time-critical setting.
Recently we tried the new EC2 C5 instance type, which is supposed to use Intel Skylake-generation CPUs. Strangely, the same Docker image runs significantly slower on the new C5s: 3x slower at the median.
Any ideas on why that might be the case?
The slowdown still holds when xgboost is compiled with -march=skylake-avx512.
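
For anyone who wants to reproduce the comparison, the median timing can be measured with something like the sketch below. This is a minimal, stdlib-only harness and not our actual benchmark: `median_runtime`, `cpu_flags`, and `placeholder_workload` are hypothetical names, and the real xgboost call would replace the placeholder. It also prints the visible core count and any AVX-512 flags from /proc/cpuinfo (Linux only), which seem worth comparing between the two instance types given the -march setting.

```python
import os
import statistics
import time


def median_runtime(fn, repeats=15, warmup=2):
    """Call fn() `warmup` times untimed, then return the median
    wall-clock duration in seconds over `repeats` timed calls."""
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def cpu_flags(path="/proc/cpuinfo"):
    """Return the CPU feature flags of the first core listed in `path`,
    or an empty set if the file is missing (e.g. not on Linux)."""
    try:
        with open(path) as f:
            for line in f:
                if line.startswith("flags"):
                    return set(line.split(":", 1)[1].split())
    except OSError:
        pass
    return set()


def placeholder_workload():
    # Hypothetical stand-in: replace with the real xgboost train/predict call.
    sum(i * i for i in range(100_000))


if __name__ == "__main__":
    print("visible cores:", os.cpu_count())
    print("avx512 flags:", sorted(f for f in cpu_flags() if f.startswith("avx512")))
    print("median seconds:", median_runtime(placeholder_workload))
```

Running the same script inside the container on both instance types gives directly comparable medians, and the two printed lines show whether the container sees the same number of cores and instruction-set features on each.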