
I'm trying to populate a model with, say, 1000 turtles. Each turtle holds three variables: sex, income, education. I'd like to assign the values of these variables according to some probabilities. E.g.

- 48% chance of a woman, 52% man
- 33% chance of an income below 100000, 20% chance of income between 100000 and 200000, xx% chance of, etc.
- 20% chance of no education, 7% chance of a PhD, yy chance of zz education, etc.

The probabilities will be provided via a CSV file or the user interface. Later I'll make up rules for how these people vote (based on statistics). But for now, I just need to create a population with the right values.

I've tried using "ask n-of" with some conditions, but as the number of variables and possible values grows, it gets ... complicated.

I've also tried "rnd:weighted-n-of" and similar, but I can't seem to get my head around it (I haven't done statistics in 25 years) :-)

Any ideas?

All the best, Palle

pnowack
  • Have you read Nicolas Payette's answer [here](http://stackoverflow.com/questions/41901313/netlogo-assign-variable-using-probabilities)? It has a brief overview of how you could use the `rnd` extension as you describe and may serve as a refresher/clarification. – Luke C May 16 '17 at 19:13
  • Thanks. Fantastic explanation by Nicolas. I didn't get any of the other ones I found, partly because the syntax for anonymous procedures has apparently changed. – pnowack May 17 '17 at 06:01

1 Answer


`ask n-of` is the correct approach; you don't need weighted randomisation. If you know that there will always be 1000 turtles and you want exactly 48% to be male, then you want code like:

ask turtles [ set sex "female" ]
ask n-of 480 turtles [ set sex "male" ]

That is, you have to set them all to one category first and then set some to the other category. However, this only works when you have two categories: if you ask for 50% to be male and then ask for 50% to be female, each `ask n-of` is a random draw from the entire population, so the two draws overlap, and some turtles are assigned twice while others are never assigned. What you probably want is something like this:

ask turtles
[ let choose-income random-float 1  ;; uniform draw from [0, 1)
  if choose-income < 0.5 [ set income 50000 ]  ;; first 50% of the interval
  if choose-income >= 0.5 and choose-income < 0.8 [ set income 100000 ]  ;; next 30%
  if choose-income >= 0.8 [ set income 150000 ]  ;; last 20%
]

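So what you are doing is breaking the interval from 0 to 1 into sections, where the length of each section equals the probability of its category. Because the draw is uniform, it lands in a section with probability equal to that section's length. So the above code will get you a 50% chance of 50000, a 30% chance of 100000 and a 20% chance of 150000.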
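The same interval idea extends to any number of categories and variables without writing one `if` per category. The following is only a sketch, not part of the original answer: `pick-weighted` is a hypothetical helper name, the probabilities are assumed to sum to 1, and the example values are taken from the question and the code above.

```netlogo
turtles-own [ sex income education ]

;; Hypothetical helper: reports one element of `options`, chosen with the
;; matching probability in `probs` (assumed to sum to 1).
to-report pick-weighted [ options probs ]
  let roll random-float 1          ;; uniform draw from [0, 1)
  let cumulative 0                 ;; right edge of the current section
  let i 0
  while [ i < ((length options) - 1) ] [
    set cumulative cumulative + item i probs
    if roll < cumulative [ report item i options ]
    set i i + 1
  ]
  report last options              ;; the draw fell into the final section
end

to setup
  clear-all
  create-turtles 1000 [
    set sex pick-weighted [ "female" "male" ] [ 0.48 0.52 ]
    set income pick-weighted [ 50000 100000 150000 ] [ 0.5 0.3 0.2 ]
  ]
  reset-ticks
end
```

Each turtle is drawn independently here, so the resulting shares are close to the target probabilities but not exact. The two lists could just as well be read from a CSV file or interface widgets and passed in unchanged.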

Note that I would actually use `ifelse` or `ifelse-value` if I was coding this in my model. However, that would require nested `ifelse` blocks, which are very hard to read.
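For comparison, a nested `ifelse` version of the income assignment might look like the sketch below. The procedure name `assign-income` is made up for illustration, and `income` is assumed to be declared in `turtles-own`.

```netlogo
turtles-own [ income ]

;; Hypothetical procedure name; same probabilities as the `if` version.
to assign-income
  ask turtles [
    let choose-income random-float 1
    ifelse choose-income < 0.5
      [ set income 50000 ]              ;; 50% chance
      [ ifelse choose-income < 0.8
          [ set income 100000 ]         ;; 30% chance
          [ set income 150000 ] ]       ;; 20% chance
  ]
end
```

Each turtle stops at the first branch that matches, so the lower bounds (`>= 0.5`, `>= 0.8`) no longer need to be spelled out, but every extra category adds another level of nesting. NetLogo versions from 6.1 onward also accept a multi-branch `ifelse` written inside parentheses, which avoids that nesting.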

JenB
  • Thanks a lot. My problem is exactly that each ask-n-of is a draw from the entire population. I see how you fixed it. Your comments about if-else, I guess would be for performance. But I need it to be easy to read for novices. – pnowack May 17 '17 at 06:05
  • See the [`cf` extension](https://ccl.northwestern.edu/netlogo/docs/cf.html) (now bundled with NetLogo) for an alternative to nested `ifelse`s. – Nicolas Payette May 17 '17 at 08:37
  • Excellent. Wasn't aware of that and we've been looking for something like that. Thanks! – pnowack May 17 '17 at 08:42
  • how did I miss the cf extension? That's excellent news. Yes, nested if-else is for performance but your question was about the logic more than the code so I figured readable was better :) – JenB May 17 '17 at 20:03
  • Exactly. Thanks again :-) – pnowack May 22 '17 at 06:01