Results of KPCA are different for different number of CPUs the code runs on

Question

I am using the kpca function of the kernlab package for dimensionality reduction, calling it from Python through rpy2. The problem is that I get different output for the same data when I run my Python script on a different number of CPU cores. I use the Linux command "taskset" or "numactl" to launch the script from the terminal. For example, two runs:

taskset -c 1-3 python run.py
taskset -c 1-5 python run.py
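
To confirm that each run really sees the core set requested on the command line, run.py could print its CPU affinity at startup. This is a minimal sketch, assuming Linux (os.sched_getaffinity is not available on every platform); the helper name allowed_cpus is made up for illustration and is not part of the original script.

```python
import os


def allowed_cpus():
    """Return the sorted list of CPU ids this process may run on."""
    if hasattr(os, "sched_getaffinity"):
        # On Linux this reflects the affinity mask, e.g. one set by taskset.
        return sorted(os.sched_getaffinity(0))
    # Fallback for platforms without affinity support: assume all CPUs.
    return list(range(os.cpu_count() or 1))


if __name__ == "__main__":
    cpus = allowed_cpus()
    print(f"running on {len(cpus)} CPU(s): {cpus}")
```

Under the first command above, this would be expected to list CPUs 1, 2 and 3, and under the second, CPUs 1 through 5.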

The outputs of these two runs are completely different. Each one is reproducible on its own: if I run with 3 cores, as in the first command, 10 times, the output is the same all 10 times, and likewise for the second command with 5 cores. But why do the outputs differ from each other? This is a problem because it affects my classification performance.

Edit: I also noticed exactly the same behaviour when using scikit-learn's KPCA. Is there something common and fundamental about KPCA that I am missing?
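
One possible explanation, not confirmed for kernlab or scikit-learn in this case: if the underlying linear algebra library splits work across threads, the order in which floating-point partial sums are combined can depend on the number of threads, and floating-point addition is not associative. That would produce results that are identical for a fixed core count but differ between core counts. The toy sketch below is pure Python; naive_sum and chunked_sum are illustrative helpers, and the chunking only imitates a per-thread reduction rather than reproducing what any real library does.

```python
def naive_sum(values):
    """Left-to-right floating-point sum with no compensation."""
    total = 0.0
    for v in values:
        total += v
    return total


def chunked_sum(values, n_chunks):
    """Sum each chunk separately, then sum the partial results.

    This imitates a reduction split across n_chunks workers.
    """
    size = -(-len(values) // n_chunks)  # ceiling division
    partials = [
        naive_sum(values[i:i + size]) for i in range(0, len(values), size)
    ]
    return naive_sum(partials)


# Values chosen so that rounding depends on the grouping.
values = [1e16, 1.0, -1e16, 1.0]

one_worker = chunked_sum(values, 1)
two_workers = chunked_sum(values, 2)

# Same data, different grouping: the two sums are not equal,
# yet each one is the same every time it is recomputed.
print(one_worker, two_workers, one_worker == two_workers)
```

Eigenvectors are also only defined up to sign, so comparing the two KPCA outputs up to a sign flip per component may be a useful first check before concluding that they are truly different.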

Please help.

`kpca` is a CRAN package I think. It looks like an R question to me; the thin Python wrapper might not matter at all, but @pranay25 should try to make an MCE by reproducing this with a demo dataset using R alone. - krassowski, Jul 21 '21 at 14:31
@Wimpel I am trying to use an R function here, "kpca", which belongs to the R package "kernlab". The only Python involved is that I am calling it from a Python environment. The rpy2 library lets us call Python and R libraries from one another. The behaviour I am seeing is specific to the R function kpca, which is why I tagged the question r. - pranay25, Jul 23 '21 at 12:27
@krassowski Yes, it's an R question. I am sorry, have I used the wrong tag? I believe I used an R one; let me know if the tag is wrong and I will repost with the right one. Sure, I will try to give an MCE. Thanks. - pranay25, Jul 23 '21 at 12:29

0 Answers