Questions tagged [weighted-average]

The weighted average (or weighted mean) is similar to the arithmetic mean, except that instead of every data point contributing equally to the final average, some data points contribute more than others.

If all the weights are equal, then the weighted mean is the same as the arithmetic mean.

The mathematical expression for the weighted average is

$$\bar{x} = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$$

where $x_i$ are the data points and $w_i$ are their weights.
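
As a rough illustration of the definition, the weighted mean is the sum of each data point times its weight, divided by the sum of the weights. The sketch below uses only the Python standard library; the function name `weighted_mean` and the sample numbers are assumptions made for this example and are not part of the tag description.

```python
def weighted_mean(values, weights):
    """Return sum(w_i * x_i) / sum(w_i) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


# Illustrative data only (hypothetical prices weighted by quantities).
prices = [7000, 7002, 7010]
quantities = [30, 50, 100]

print(weighted_mean(prices, quantities))

# With equal weights the result matches the arithmetic mean.
print(weighted_mean(prices, [1, 1, 1]))  # 7004.0
```

Guarding against a zero weight sum is a choice made for this sketch; libraries differ in how they handle that case.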

476 questions
2 votes, 1 answer

Interpreting strange Bland-Altman plot in R

I'm trying to use Bland-Altman (Tukey Mean Difference) plots to assess how the impute.knn() function from the impute package affects our results for a number of CpGs (cg16181396 is the example here) and I'm not sure how to interpret the…
2 votes, 2 answers

Aggregate list of dictionary in bins by applying weighted average in Python

I have a list of dictionaries, which looks like this: _input = [{'cumulated_quantity': 30, 'price': 7000, 'quantity': 30}, {'cumulated_quantity': 80, 'price': 7002, 'quantity': 50}, {'cumulated_quantity': 130, 'price': 7010,…
Julien
2 votes, 2 answers

Applying the Python Pandas Exponential Weighted Average in Reverse Order

For a pandas.DataFrame example: In: cols = ['cols1', 'cols2'] In: df = pd.DataFrame({'col1': [1, 2, 3, 4], 'col2': [3, 4, 5, 6]}) Out: col1 col2 0 1 3 1 2 4 2 3 5 3 4 6 I am using the…
kel
2 votes, 0 answers

PySpark applying UDF for exponential weighted mean from collect_list array

I'm ultimately hoping to recover similar functionality to that detailed in Pyspark SPARK-22239, which will enable the use of window functions with Pandas user-defined functions. Specifically, I'm performing a timestamp-based windowing of underlying…
2 votes, 1 answer

ggplot: weighted.mean and stat_summary in a facetted bar plot

I've spent too much time trying to figure out a solution for including weighted.mean (or wtd.mean) in stat_summary and making it work properly. I've looked at several pages trying to tackle the same issue, but none had a definitive solution. The main…
HariSeldon
2 votes, 0 answers

How to make a weight array from a list of tuples in order to plot a histogram whose y axis is weighted?

I have a list of tuples as follows: A=[(122208102.23250552, 34), (164096757.6449624, 4), (212275562.3177331, 72), (499344188.7213493, 240), (515347294.02090293, 2), (614044718.1056056, 4), (623878472.271997, 37), (1050993427.1862154, 2),…
Rebel
2 votes, 2 answers

Joining data with weighted averages and multiple weights in R

So I had this question but the scope got a little larger/more complicated. Basically I want to combine two tables and calculate the weighted average for any duplicate IDs. The problem is I will have multiple sets of columns that will need to use…
2 votes, 1 answer

weighted.mean command in R behaves strangely

Suppose I have these vectors: q1<-c(9,8,10,9,3,2,1,2,4,5) q2<-c(9,7,8,6,5,4,8,7,8,9) q3<-c(0,0,0,5,9,5,9,5,0,5) I want to compute a weighted mean based on a weights vector like…
2 votes, 1 answer

Custom function using multiple parameters applied to every column in dataframe

I have a df that looks like this data = [{'Stock': 'Apple', 'Weight': 0.2, 'Price': 101.99, 'Beta': 1.1}, {'Stock': 'MCSFT', 'Weight': 0.1, 'Price': 143.12, 'Beta': 0.9}, {'Stock': 'WARNER','Weight': 0.15,'Price': 76.12, 'Beta':…
ThatQuantDude
2 votes, 2 answers

Multiple response analysis in weighted survey data using srvyr

I'm trying to analyse a multiple response question from a weighted survey dataset. I like the srvyr package because it allows me to use the dplyr pipes, but I can't find the reference material on how to handle multiple response questions. I have a…
Gianzen
2 votes, 1 answer

Aggregate and Weighted Mean for multiple columns in R

The question is basically the same as this: Aggregate and Weighted Mean in R. But I want to compute it on several columns, using data.table, as I have millions of rows. So something like this: set.seed(42) # fix seed so that you get the same…
Jeppe Olsen
2 votes, 2 answers

Calculate a moving average in R, on a rolling subset of a time series

I have a daily time series and I'm trying to calculate a 10 period moving average on it. The trouble I'm having is that the moving average needs to be on a rolling subset of the data (the 10 periods are not contiguous). I need an average over the…
DM2017
2 votes, 1 answer

Pandas: filling missing values by weighted average in each group

I have a DataFrame where the 'value' column has missing values. I'd like to fill the missing values with the weighted average within each 'name' group. There was a post on how to fill missing values with the simple average in each group, but not the weighted average.…
Chao Chen
2 votes, 1 answer

R: Cumulative weighted mean in data.table

Basis is the following data table: library(data.table) dt <- data.table(Position = 1:3, Price = c(50, 45, 40), Volume = c(10, 10, 10)) dt Position Price Volume 1: 1 50 10 2: 2 45 10 3: 3 40 10 Now I…
schluk5
2 votes, 1 answer

R - Vectorized Mean with Weighting

I am currently able to rapidly calculate the mean of a dataset I have that has several million entries using the following code: PosAvg = mean( curTweets$posScore[curTweets$posScore > 1]) uniqPosTweets = curTweets[ curTweets$posScore >…
Jibril
  • 967
  • 2
  • 11
  • 29