3

I need to rescale a series of numbers with certain constraints.

Let's say I have a vector like this:

x <- c(0.5, 0.3, 0.6, 0.4, 0.9, 0.1, 0.2, 0.3, 0.6)

  1. The sum of x must be 6. Right now the sum of x = 3.9.
  2. The numbers cannot be lower than 0
  3. The numbers cannot be higher than 1

I know how to do 1 and 2+3 separately, but not together. How do I rescale this?

EDIT: As r2evans attempted, the relative relationships between the numbers should preferably be preserved.

Rene
  • For your constraint 2, is there an expectation of the actual minimum? That is, if all numbers are intentionally above (say) 0.5, is that a problem? – r2evans Aug 24 '21 at 18:26
  • preferably the relative relationships of the number is preserved, as was highlighted by @r2evans – Rene Aug 25 '21 at 07:00

2 Answers

4

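I don't know that this can be done with a simple expression, but we can optimize our way through it. Note that the output below uses x with an extra 0.9 appended (x <- c(x, 0.9)), so it has ten values: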

opt <- optimize(function(z) abs(6 - sum( z + (1-z) * (x - min(x)) / diff(range(x)) )),
                lower=0, upper=1)
opt
# $minimum
# [1] 0.2380955
# $objective
# [1] 1.257898e-06
out <- ( opt$minimum + (1-opt$minimum) * (x - min(x)) / diff(range(x)) )
out
#  [1] 0.6190477 0.4285716 0.7142858 0.5238097 1.0000000 0.2380955 0.3333335 0.4285716 0.7142858 1.0000000
sum(out)
# [1] 6.000001

Because that is not exactly 6, we can do one more step to safeguard it:

out <- out * 6/sum(out)
out
# [1] 0.6190476 0.4285715 0.7142857 0.5238096 0.9999998 0.2380954 0.3333335 0.4285715 0.7142857 0.9999998
sum(out)
# [1] 6

This process preserves the relative relationships of the numbers. If there are more "low" numbers than "high" numbers, scaling so that the sum is 6 will bring the higher numbers above 1. To compensate for that, we shift the lower-end (z in my code), so that all numbers are nudged up a little (but the lower numbers will be nudged up proportionately more).

The results should always be that the numbers are in [opt$minimum,1], and the sum will be 6.
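
As a side note, the quantity being optimized is linear in z, so z can also be solved for directly. The following is a minimal sketch under stated assumptions, not the method used above: it uses the nine-value x from the question, assumes the values of x are not all equal, and assumes the target lies between sum(u) and length(x). The names u, n and target are introduced here for illustration. With this input z comes out as 3/7 rather than the 0.238 printed above, which corresponds to a ten-value input.

```r
x <- c(0.5, 0.3, 0.6, 0.4, 0.9, 0.1, 0.2, 0.3, 0.6)
target <- 6

# Min-max scale x to [0, 1] (assumes x is not constant)
u <- (x - min(x)) / diff(range(x))
n <- length(x)

# sum(z + (1 - z) * u) = sum(u) + z * (n - sum(u)); solve for z
z <- (target - sum(u)) / (n - sum(u))

# z outside [0, 1] means the target cannot be reached this way
stopifnot(z >= 0, z <= 1)

out <- z + (1 - z) * u
sum(out)
```

Because the solution is exact up to floating-point rounding, the final `out * 6/sum(out)` clean-up step should not be needed in this variant.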

r2evans
  • If I start with `x <- c(x, 0.9)`, it produces numbers in `[0,1]` and they sum to 6. I think it will fail when the length of `x` is too short, see my recent edit to admit that. – r2evans Aug 24 '21 at 18:15
  • Oh right ... missed that when I wrote my last comment, good catch – r2evans Aug 24 '21 at 18:22
  • great. I used a 4 parameter model to optim. This is better. I wonder if using `^2` rather than `abs` will get closer? – user20650 Aug 24 '21 at 18:52
  • Quite possibly. However, if by "closer" you mean "still not always perfect", then the last clean-up step `out * 6/sum(out)` may still be justified. So I leaned towards "simple". – r2evans Aug 24 '21 at 18:53
  • It doesn't work out with another series of numbers. If I use tmp <- rnorm(10, .5, .1) and I want them to add up to 30, the final sum is almost 10. The series rnorm produced is: # [1] 0.4560621 0.6843085 0.4216595 0.4523113 0.4337446 0.3613095 0.4945376 0.4733079 0.5710565 0.4900130 – Rene Aug 25 '21 at 06:59
  • @Rene; how can 10 numbers from [0,1] sum up to 30? – user20650 Aug 25 '21 at 08:18
  • Rene, have you found another example where this appears to not work? – r2evans Aug 25 '21 at 14:43
1

Should be possible with a while loop to increase the values of x (to an upper limit of 1)

x <- c(0.5, 0.3, 0.6, 0.4, 0.9, 0.1, 0.2, 0.3, 0.6)

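current_sum <- sum(x)

target_sum <- 6

# Compare with a tolerance: testing doubles for exact equality may never succeed
while (abs(current_sum - target_sum) > 1e-9) {
  print(current_sum)

  # Relative gap between the current sum and the target
  perc_diff <- (target_sum - current_sum) / target_sum
  
  x <- x * (1 + perc_diff)
  
  # Cap values at the upper limit of 1
  x[x > 1] <- 1
  
  current_sum <- sum(x)
}

# Resulting x (to within the tolerance):
x <- c(0.833333333333333, 0.5, 1, 0.666666666666667, 1, 0.166666666666667, 
0.333333333333333, 0.5, 1)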

There is likely a more mathematical way
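
One possible version of that more mathematical route, offered as a sketch rather than a definitive implementation: the loop is effectively searching for a single multiplier k such that sum(pmin(1, k * x)) equals the target, and base R's uniroot can solve for that k directly. The helper name cap_scale is hypothetical, and the sketch assumes every value in x is strictly positive and that the target lies between sum(x) and length(x).

```r
# Hypothetical helper: scale x by a common factor k, capping at 1,
# so that the capped values sum to `target`.
cap_scale <- function(x, target) {
  stopifnot(all(x > 0), target >= sum(x), target <= length(x))
  f <- function(k) sum(pmin(1, k * x)) - target
  # f(lower) <= 0 because capping can only reduce the sum;
  # f(upper) >= 0 because every value is capped at 1 there.
  k <- uniroot(f, lower = target / sum(x), upper = 1 / min(x),
               tol = 1e-12)$root
  pmin(1, k * x)
}

x <- c(0.5, 0.3, 0.6, 0.4, 0.9, 0.1, 0.2, 0.3, 0.6)
out <- cap_scale(x, 6)
sum(out)
```

As with the loop, values that hit the cap are tied at 1, so the relative relationships among the largest values are not preserved.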

Jamie_B
  • This actually works, although it doesn't preserve the relationships of the numbers. – Rene Aug 25 '21 at 07:13