I need to draw a random sample from a table of users and postings. Each user can have more than one posting. I need to select only 200 users, and the total number of users varies (each day we will have a different total). So far I have created a rand() variable and I select only the rows where this rand() is under
200/count(*)
The problem is that the same user may be selected more than once. How can I select only 200 distinct users from this variable total while respecting the original distribution of users (some appear more often than others, so they should have a proportionally higher chance of being selected)?
I was thinking of writing a loop that populates a field with a number per user, so that every row belonging to the same user carries the same number (right now I don't have a numeric user id, only a char field). But I'm not sure how to do this.
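To make the goal concrete, here is a minimal sketch of the behaviour I am after, written in Python rather than SQL. It assumes the table can be read as one user key per posting; the function name `sample_users` and the input shape are made up for illustration. The idea is to shuffle the postings and keep the first 200 distinct users, so users with more postings tend to show up earlier, and the char field is used directly as the key.

```python
import random


def sample_users(postings, k=200, seed=None):
    """Return up to k distinct users, favouring users with more postings.

    postings: iterable of user keys, one entry per posting
              (hypothetical input shape, e.g. the char field of each row).
    """
    rng = random.Random(seed)
    rows = list(postings)
    rng.shuffle(rows)  # random order over postings, not over users

    chosen = []
    seen = set()
    for user in rows:
        if user in seen:
            continue  # this user was already picked via another posting
        seen.add(user)
        chosen.append(user)
        if len(chosen) == k:
            break
    return chosen
```

If there are fewer than 200 distinct users on a given day, this sketch simply returns all of them.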
Thanks!