I currently have an algorithm for finding the standard deviation on a cluster of machines where one node will request the whole data set from other nodes across a network and run the standard deviation calculation over the data once it is received.
What I would like is to process the data independently on each node, then send the result of that to the requesting node which will merge the results. This will reduce network traffic and calculate the results in parallel.
The question is if there is an algorithm that can do this, or if all standard deviation calculations rely on the whole result that has been processed so far.