Random Pareto distribution in R with 30% of values being <= specified amount

Question

Let me begin by saying this is a class assignment for an intro to R course.

First, in VGAM why are there dparetoI, ParetoI, pparetoI, qparetoI & rparetoI?
Are they not the same things?

My problem: I would like to generate 50 random numbers in a pareto distribution. I would like the range to be 1 – 60 but I also need to have 30% of the values <= 4.

Using VGAM I have tried a variety of functions and combinations of pareto from what I could find in documentation as well as a few things online.

I experimented with fit, quantiles and forcing a sequence from examples I found but I'm new and didn't make much sense of it.

I’ve been using this:

alpha <- 1   # location 
k <- 2 # shape     
mySteps <- rpareto(50,alpha,k)
range(mySteps)
str(mySteps[mySteps <= 4])

After enough iterations, the range will be acceptable but entries <= 4 are never close.

So my questions are:

Am I using the right pareto function? If not, can you point me in the right direction? If so, do I just keep running it until the “right” data comes up?

Thanks for the guidance.

what level of stats do you know? This follows the same structure as the other distributions in r. `rnorm` generates random normals, `dnorm` for the density, `runinf` and `dunif` are the same for the uniform distribution, so yes, `rpareto` is the right one for random numbers draw from the Pareto distribution — rawr, Feb 15 '14 at 02:30
I took stats a very long time ago and it feels like I'm starting over. Thanks. Good to know I'm on the right track. — MMDG, Feb 15 '14 at 14:46

score 1 · Accepted Answer · edited Jun 20 '20 at 09:12

So reading the Wikipedia entry for Pareto Distribution, you can see that the CDF of the Pareto distribution is given by:

F_X(x) = 1 - (x_m/x)^α

The CDF gives the probability that X (your random variable) < x (a given value). You want Pareto distributions where

Prob(X < 4) ≡ F_X(4) = 0.3

or

0.3 = 1 - (x_m/4)^α

This defines a relation between x_m and α

x_m = 4 * (0.7)^1/α

In R code:

library(VGAM)
set.seed(1)
alpha <- 1
k     <- 4 * (0.7)^(1/alpha)
X     <- rpareto(50,k,alpha)
quantile(X,0.3)   # confirm that 30% are < 4
#      30% 
# 3.891941

Plot the histogram and the distribution

hist(X, breaks=c(1:60,Inf),xlim=c(1,60))
x <- 1:60
lines(x,dpareto(x,k,alpha), col="red")

If you repeat this process for different alpha, you will get different distribution functions, but in all cases ~30% of the sample will be < 4. The reason it is only approximately 30% is that you have a finite sample size (50).

Thanks! That will take me a while to digest. – MMDG Feb 15 '14 at 14:44 — MMDG, Feb 15 '14 at 14:44

Random Pareto distribution in R with 30% of values being <= specified amount

1 Answers1

Linked