I have a table with an elasticity column, and I want to assign a new elasticity value to each record. The new value comes from sampling the existing values with replacement, assuming a uniform distribution. For example, say I have 4 records with elasticity values (1.2, 1.3, 1.4, 1.5). I sample these 4 values 50 times, which gives me a 4 x 50 matrix. How do I assign to each record the value that came up most often?

num_vals_to_sample = length(measurement_Elasticity)  # counts the number of records


Sampled_measurement_Elasticity = replicate(50, sample(measurement_Elasticity, num_vals_to_sample, replace = TRUE))

In the above code, I want a new measurement_Elasticity vector that holds, for each record, the value that came up most often during the sampling process.
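One way to reduce the 4 x 50 matrix described above to one value per record is to take the most frequent value in each row. The following is only a sketch: the example data, the name `n_records`, and the helper `row_mode` are assumptions made for illustration, and ties are resolved by taking the smallest value, which is an arbitrary choice.

```r
# Assumed example data; the real vector would come from the table's elasticity column.
measurement_Elasticity <- c(1.2, 1.3, 1.4, 1.5)
n_records <- length(measurement_Elasticity)

set.seed(1)
# One column per replicate: n_records rows by 50 columns.
sampled <- replicate(50, sample(measurement_Elasticity, n_records, replace = TRUE))

# Hypothetical helper: most frequent value in x, smallest value on ties
# (table() sorts the values, and which.max() returns the first maximum).
row_mode <- function(x) {
  counts <- table(x)
  as.numeric(names(counts)[which.max(counts)])
}

# Apply the helper to each row to get one value per record.
New_measurement_Elasticity <- apply(sampled, 1, row_mode)
```

Note that `replicate()` returns a plain vector rather than a matrix when there is only one record, so this sketch assumes at least two records.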

Using Henry's code, I solved my problem this way:

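num_vals_to_sample = length(measurement_Elasticity)  # number of records

New_measurement_Elasticity = c()

# Elasticity sampling: one most-frequent value per record
for (i in 1:num_vals_to_sample)
{
  # Count how often each value appears in 100 draws with replacement
  Sampled_measurement_Elasticity <- table(sample(measurement_Elasticity, 100, replace=TRUE))

  # On ties, take the largest of the most frequent values
  Most_Likely_Elas = as.numeric(names(Sampled_measurement_Elasticity)[max(which(Sampled_measurement_Elasticity==max(Sampled_measurement_Elasticity)))])

  # append() returns a new vector, so the result must be assigned back
  New_measurement_Elasticity = append(New_measurement_Elasticity, Most_Likely_Elas)
}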
Cyang
  • Out of interest, what's the application for this? It seems to be equivalent to just picking one number at random, unless the sampled values are used for something else that requires you to know the mode? – ping May 05 '14 at 17:58

1 Answer

You might want to consider this as a possibility

> set.seed(5)
> examplecounts <- table(sample(c(1.2, 1.3, 1.4, 1.5), 50, replace=TRUE))
> examplecounts
1.2 1.3 1.4 1.5 
 13  13  11  13 
> names(examplecounts)[which(examplecounts == max(examplecounts))]
[1] "1.2" "1.3" "1.5"
> as.numeric(names(examplecounts)[min(which(examplecounts==max(examplecounts)))])
[1] 1.2

Usually you will get a single value: try changing the seed.
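If ties need to be handled explicitly, the same idea can be wrapped in a small function. This is a minimal sketch: `sample_mode` and its `ties` argument are hypothetical names, not part of base R, and which tie rule is appropriate depends on the application.

```r
# Hypothetical helper, not part of base R: returns the most frequent value in x.
# ties = "min" or "max" picks the smallest or largest mode;
# ties = "random" picks one of the modes at random.
sample_mode <- function(x, ties = c("min", "max", "random")) {
  ties <- match.arg(ties)
  counts <- table(x)
  modes <- as.numeric(names(counts)[counts == max(counts)])
  switch(ties,
         min = min(modes),
         max = max(modes),
         random = modes[sample.int(length(modes), 1)])
}

set.seed(5)
draws <- sample(c(1.2, 1.3, 1.4, 1.5), 50, replace = TRUE)
sample_mode(draws)                 # smallest mode
sample_mode(draws, ties = "max")   # largest mode
```

The "random" branch indexes with `sample.int()` rather than calling `sample(modes, 1)`, because `sample()` treats a single number as the upper end of a range instead of a one-element set.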

Henry
  • This could work. However, I'm wondering what approach to take when you get multiple values. What is the most common practice? Taking the mean of all the values? For example, since I want ONE value that came up the most but there are several, do I just take the mean of those? – Cyang May 05 '14 at 16:25
  • It depends on your needs where you have multiple modes. The mean of the modes is unlikely to be a good choice as it will often not be any of the modes. My final example line chooses the minimum mode, but this is arbitrary, and the maximum is an easy alternative. – Henry May 05 '14 at 16:28
  • Your code shows the process for generating one record. How do I generate values for 'x' number of records? – Cyang May 05 '14 at 16:28
  • I do not understand the question, but perhaps looping would be an answer – Henry May 05 '14 at 16:31
  • What I mean is, I want the result to be a single list of elasticity values. If I had 8 rows in my table, then I want a list of 8 elasticity values – Cyang May 05 '14 at 16:44