I can currently compute the mean of a dataset with several million entries quickly, using the following code:
PosAvg = mean( curTweets$posScore[curTweets$posScore > 1])
uniqPosTweets = curTweets[ curTweets$posScore > abs(curTweets$negScore) ,]
UniqPosAvg = mean( uniqPosTweets$posScore )
However, I want to weight these values and still keep the efficiency I have, by doing this in the same style as above.
curTweets$posScore and curTweets$negScore can each take a value of 1, 2, 3, 4, or 5.
Let's say I want to give these the following weights: 6, 7, 8, 9, 10, respectively. I'm using these numbers just to differentiate them from the potential values of posScore; the actual weights are calculated in my algorithm.
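To make the goal concrete, here is a minimal sketch of the entry-by-entry computation I am hoping to avoid. It assumes the usual definition of a weighted mean, sum(weight * score) / sum(weight). The toy vector posScore stands in for curTweets$posScore, and w holds the placeholder weights from above; both are made up for illustration.

```r
# Toy data standing in for curTweets$posScore (hypothetical values)
posScore <- c(1, 2, 2, 5, 3)

# Placeholder weights for scores 1..5; the real ones come from my algorithm
w <- c(6, 7, 8, 9, 10)

# Entry-by-entry weighted mean: sum(weight * score) / sum(weight)
num <- 0
den <- 0
for (s in posScore) {
  num <- num + w[s] * s  # weighted contribution of this entry
  den <- den + w[s]      # running total of the weights
}
loopWeightedAvg <- num / den
```

On a few entries this is fine, but on several million rows a loop like this is exactly the slowdown I am worried about.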
Is there a way to do this? I can't figure out how I would weight while maintaining this efficiency. Am I stuck having to loop through each entry and calculate contributions individually?
Thank you!