
I have one vector x, and n vectors Y (n >= 10,000,000).

Each vector is of size 4000

Now I need to compute corr(x, yi) for each yi; obviously the result is of size n.

If each correlation is calculated one by one, it takes a lot of time and cannot be finished within 1 minute. MATLAB provides an efficient way to compute the correlation matrix of an n×n matrix, but on my laptop n is limited to under 20,000.

So is there a way or tool to complete this task within 1 minute?
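One way to avoid both the per-vector loop and the 320 GB all-at-once cost is to process Y in blocks and vectorize each block. Below is a minimal NumPy sketch of that idea; the function name `corr_with_blocks` and the block size are illustrative, not from the question, and it assumes Y can be streamed or memory-mapped block by block.

```python
import numpy as np

def corr_with_blocks(x, Y, block=100_000):
    """Pearson correlation of x against each row of Y, block by block.

    Only `block` rows of Y are touched at a time, so peak memory is
    roughly block * len(x) floats instead of the whole matrix.
    Assumes no row of Y (and not x) has zero variance.
    """
    xc = x - x.mean()
    xn = xc / np.linalg.norm(xc)          # unit-norm centered x
    out = np.empty(Y.shape[0])
    for i in range(0, Y.shape[0], block):
        B = Y[i:i + block]
        Bc = B - B.mean(axis=1, keepdims=True)   # center each row
        norms = np.linalg.norm(Bc, axis=1)
        # one matrix-vector product gives the whole block of correlations
        out[i:i + block] = (Bc @ xn) / norms
    return out
```

The inner step is a single BLAS matrix-vector product per block, which is what makes this fast; the same structure maps directly to a GPU (e.g. replacing `numpy` with `cupy`) if the data can be fed to the device quickly enough.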

whogiawho
  • Which language do you use for that? Matlab? If so please add the associated tag to the question. You are computing a dataset of 10,000,000*4,000 = 40,000,000,000 items that should take 320 GB in RAM... This is HUGE (especially if you use Matlab due to the GC). This should not work on most machines (including most computing servers) due to the lack of memory, or be insanely slow due to the swap being used (which will kill the SSD quickly if you do that). AFAIK, no laptop has such an amount of RAM. Why do you want to do that? – Jérôme Richard May 24 '22 at 10:47
  • If the corr is calculated one by one (or by block), there is no need for 320 GB of RAM. If Matlab provides a way to solve it, Python, Java, or other languages may be used to call its interfaces. The language does not matter if it can be used to solve the problem. – whogiawho May 24 '22 at 23:50
  • You need to read the input from somewhere and write it somewhere. If the input is generated and the output is directly consumed, then the problem is completely different. If the input and output are read/stored from storage devices, then this is again an entirely different problem. This is especially true on GPUs (the question is tagged gpu). If you want to solve something efficiently, then you need to be much more specific. Details are critical for high performance. – Jérôme Richard May 25 '22 at 00:11

0 Answers