
I'm doing a failure analysis, for which I'd like to try some different scenarios and some random trials. So far I've done this with the mosaic package and it's working out great. In one specific scenario I want to generate a vector of (semi)random numbers drawn from different distributions. No problem so far.

Now I want to have a defined number of negative numbers in this vector. For example, I want between 0 and 5 negative numbers in a vector of 25 numbers. I thought I could use something like rbinom(n=25,prob=5/25,size=1) to get 5 random ones first, but of course 25 draws with probability 5/25 can produce more than 5 ones. This seems like a dead end. I could get it done with some for loops, but something easier probably exists. I've tried all sorts of sample, seq, and shuffle combinations, but I cannot get it to work so far.

Does anyone have any ideas or suggestions?

user1549537
    What distribution will the absolute value of your numbers be from? Will they be integers? Continuous? – David Robinson Oct 08 '13 at 17:50
  • @user1549537 Hi, if any answer solves your problem can you click on "accept it" so that other people can see it? thanks – agenis Sep 06 '17 at 13:07

2 Answers


If you have a vector x where all elements are >= 0, let's say drawn from Poisson:

x = rpois(25, lambda=3)

You can make a random 5 of them negative by doing

x * sample(rep(c(1, -1), c(length(x) - 5, 5)))

This works because

rep(c(1, -1), c(length(x) - 5, 5))

will be

#  [1]  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 -1 -1 -1 -1 -1

and sample(rep(c(1, -1), c(length(x) - 5, 5))) simply shuffles them up randomly:

sample(rep(c(1, -1), c(length(x) - 5, 5)))
#  [1]  1  1 -1  1  1  1  1  1  1  1  1 -1  1  1  1 -1 -1  1  1  1 -1  1  1  1  1
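
The question asks for between 0 and 5 negatives rather than exactly 5. One way to adapt the same idea is to draw the count first. This is only a sketch: the names `k`, `signs`, and `y` are illustrative, and drawing `k` uniformly from 0 to 5 is an assumption, since the question does not say how the count should be distributed.

```r
set.seed(42)  # fixed seed only so the sketch is reproducible

x <- rpois(25, lambda = 3)

# Assumption: the number of negated entries is uniform on 0..5
k <- sample(0:5, 1)

# k copies of -1 and length(x) - k copies of 1, shuffled into random positions
signs <- sample(rep(c(1, -1), c(length(x) - k, k)))

y <- x * signs
```

Note that with a Poisson draw some elements of x can be 0, and flipping the sign of 0 does not produce a negative number, so sum(y < 0) can be smaller than k.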
David Robinson
    It's much harder if the distributional properties extend over entire 25-vectors, some entries of which have negative numbers, as opposed to drawing a 25-vector then randomly making 5 entries negative. It's similar with things like sampling a binary matrix that has fixed row and column sums. Clamping on one thing first makes it much easier, but seldom satisfies the needed distributional properties that are sought after. This is not a knock against your approach, just a comment that in general one has to go for MCMC-type methods to simultaneously satisfy many distributional properties. – ely Oct 08 '13 at 17:55

I can suggest a very straightforward solution that guarantees 5 negative values and works for any continuous distribution. The idea is to sort the vector and subtract each value from the 6th largest, so that only the 5 values above it come out negative:

x <- rnorm(25)
res <- sort(x, decreasing = TRUE)[6] - x
#### [1]  0.4956991  1.5799885  2.4207497  1.1639569  0.2161187  0.2443917 -0.4942884 -0.2627706  1.5188197
#### [10]  0.0000000  1.6081025  1.4922573  1.4828059  0.3320079  0.3552913 -0.6435770 -0.3106201  1.5074491
#### [19]  0.6042724  0.3707655 -0.2624150  1.1671077  2.4679686  1.0024573  0.2453597
sum(res<0)
#### [1] 5

It also works for discrete distributions, but only if there are no ties.
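
To see why ties matter, here is a small sketch with a made-up vector (the values are illustrative, not taken from the question). When more than five entries share the largest value, the 6th largest equals the maximum, so none of the differences is negative.

```r
# Hypothetical data with heavy ties: ten 3s and fifteen 1s
x <- c(rep(3, 10), rep(1, 15))

# The 6th largest value is 3, so every difference is 0 or 2
res <- sort(x, decreasing = TRUE)[6] - x

sum(res < 0)
# [1] 0
```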

agenis