DolphinDB: Calculate mwavg for each stock by using "context by" or "for loop"?

Question

Which is faster for calculating mwavg per stock: context by or a for loop? Roughly how large is the performance gap between the two methods, and what causes it?

Please clarify your specific problem or provide additional details to highlight exactly what you need. As it's currently written, it's hard to tell exactly what you're asking. — Community, Dec 02 '21 at 15:09

score 0 · Answer 1 · answered Nov 29 '21 at 08:05

Suppose we use the following test data, which holds 3,000 symbols with 10,000 rows each (30 million rows in total):

login(`admin, `123456)
pnodeRun(clearAllCache)
undef all

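// 3,000 symbol strings generated from the integers 1..3000
syms = format(1..3000, "SH000000")
// number of rows per symbol
N = 10000
// cross join: each symbol is paired with the same N random (price, volume) rows
t = cj(table(syms as symbol), table(rand(100.0, N) as price, rand(10000, N) as volume))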

Method 1: with context by, the calculation takes about 3.3 seconds.

timer result1 = select mwavg(price, volume, 4) from t context by symbol

Method 2: with a for loop, the calculation takes about 25 minutes.

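// one result slot per symbol
arr = array(ANY, syms.size())
timer {
    for(i in 0 : syms.size()) {
        // each exec filters the whole table for a single symbol
        price_vec = exec price from t where symbol = syms[i]
        volume_vec = exec volume from t where symbol = syms[i]
        arr[i] = mwavg(price_vec, volume_vec, 4)
    }
    // concatenate the per-symbol results into one vector
    res = reduce(join, arr)
}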

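The two methods differ by a factor of roughly 400 to 450 (about 25 minutes, or 1,500 seconds, versus 3.3 seconds). The context by clause groups the rows of all stocks at once and then applies mwavg to each group separately. The for loop instead filters the entire 30-million-row table twice per iteration, once for price and once for volume, to retrieve the 10,000 records of a single stock. Over 3,000 iterations that amounts to 6,000 full-table scans, which is why it takes so much longer.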
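
The cause described above can be illustrated with a small, language-neutral sketch. It is written in Python rather than DolphinDB script and does not reproduce DolphinDB's internals. The names mwavg, per_symbol_scan and group_once, the tiny data set, and the use of None for positions without a full window are all assumptions made for illustration. It only counts how many rows each style has to touch.

```python
from collections import defaultdict


def mwavg(prices, volumes, window):
    """Trailing weighted average sum(p*v)/sum(v) over `window` rows.

    Positions without a full window yield None (an assumption for this sketch).
    """
    out = []
    for i in range(len(prices)):
        if i + 1 < window:
            out.append(None)
            continue
        p = prices[i + 1 - window : i + 1]
        v = volumes[i + 1 - window : i + 1]
        total = sum(v)
        out.append(sum(a * b for a, b in zip(p, v)) / total if total else None)
    return out


def per_symbol_scan(rows, symbols, window):
    """Loop style: every symbol triggers two passes over all rows."""
    result, rows_touched = {}, 0
    for s in symbols:
        prices = [p for sym, p, _ in rows if sym == s]   # full scan 1
        volumes = [v for sym, _, v in rows if sym == s]  # full scan 2
        rows_touched += 2 * len(rows)
        result[s] = mwavg(prices, volumes, window)
    return result, rows_touched


def group_once(rows, window):
    """Grouped style: one pass buckets rows by symbol, then each bucket is processed."""
    buckets = defaultdict(lambda: ([], []))
    for sym, p, v in rows:
        buckets[sym][0].append(p)
        buckets[sym][1].append(v)
    result = {s: mwavg(p, v, window) for s, (p, v) in buckets.items()}
    return result, len(rows)


# Hypothetical miniature data set: 3 symbols with 5 rows each (15 rows in total).
symbols = ["SH000001", "SH000002", "SH000003"]
rows = [(s, float(i + 1), 10 * (i + 1)) for s in symbols for i in range(5)]

loop_result, loop_touched = per_symbol_scan(rows, symbols, 4)
grouped_result, grouped_touched = group_once(rows, 4)

print(loop_result == grouped_result)  # True: both styles compute the same values
print(loop_touched, grouped_touched)  # 90 15
```

In this sketch the loop touches 2 × (number of symbols) × (number of rows) rows, while the grouped version touches each row once. Scaled to 3,000 symbols and 30 million rows, the same counting gives 6,000 full-table scans for the loop, which is consistent with the gap measured above, although the exact ratio depends on how DolphinDB executes each query.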
