-1

Here is a part of mydata:

The raw data is very big, so I upload a part of it with 20 rows.

x <- [7.6,2.2,1.1,4.7,8.6,7.5,7.5,29.9,5.0,3.0,2.4,1.5,14.9,3.9,3.7,3.2,5.0,1.7,2.9,2.3]

Here is function description

  1. power law: y=A*x^-(u)
  2. exponential: y=B*exp^(-βx)

Now, I want to use MLE(Maximum likelihood method) to get u for power law, and β for exponential distribution.

#set likelihood function of power law
pl <- function(u){-n*log(u-1)-n*(u-1)*log(min(x))+u*sum(log(x))}

#set likelihood function of exponential distribution
ex <- function(β){-n*log(β)+β*sum(x)}

Are these functions right?

Using mle2() to get the parameters:

#get the parameter u of power law
s1 <- mle2(pl,start = list(u=2),data = list(x))
summary(s1)
#get the parameter lamda of exponential distribution
s2 <- mle2(ex,start = list(β=2),data = list(x))
summary(s2)

Now here are two questions:

  1. how to determine which one is the best to fit the model

    Using confint() could get the 95% CI, how do I get the Rsquared and AIC(Akaike weigths) of both model?

  2. After I get which one is best to fit the data, How do I draw the fitted graph above the raw data?

I use R.3.2.2 in windows 7.

Ling Zhang
  • 281
  • 1
  • 3
  • 13

1 Answers1

0

Pretty much as you might expect. You haven't specified the conditional distribution of your data, so I'm going to assume Normality. (Given this, you could also use nls() -- least-squares is maximum likelihood estimation for a Normal, homoscedastic response), although mle2 offers a little more scope for playing with optimizers etc.)

I'm going to use the formula interface, which is convenient if your model isn't too complicated.

power law

mle2(y~dnorm(mean=A*x^(-mu),sd=exp(logsd),
     start=list(A=...,mu=...,logsd=...),
     ## no need for list() if mydata is already data.frame
     data=mydata)   

exponential

  mle2(y~dnorm(mean=B*exp(-beta*x),sd=exp(logsd),
     start=list(B=...,beta=...,logsd=...),
     data=mydata)  

... where the elements in start are any reasonable starting values. Given your data above, these approaches ought to work reasonably on a subset of the data. However, they might not perform well on 10 million observations. I would consider using

glm(y~x,family=gaussian(link="log"),data=mydata)

to fit the exponential curve and

glm(y~log(x),family=gaussian(link="log"),data=mydata)

to fit the power-law curve.

Ben Bolker
  • 211,554
  • 25
  • 370
  • 453
  • x=[0.8,0.9,1,1.1,1.2,1.3,1.4,1.5,1.6,1.7,1.8,1.9] – Ling Zhang Dec 02 '15 at 03:10
  • x=[0.8,0.9,1,1.1,1.2,1.3,1.4,1.5,1.6,1.7,1.8,1.9], my data is something like that but the length of mydata is 10 million. And, one more question, A, mu and logsd could be any number I input in the start list? – Ling Zhang Dec 02 '15 at 03:15
  • you might think mydata is data.frame with 2 columns named x and y? – Ling Zhang Dec 02 '15 at 03:27
  • could you please edit your question to include this information? – Ben Bolker Dec 02 '15 at 04:31
  • Sorry for my poor description of mydata, I have uploaded a part of mydata in the question description, please have an another look on it, Thx! – Ling Zhang Dec 02 '15 at 04:31
  • I am sorry for my unclear description of mydata, now I upload mydata into my question, please check it out, Thank you sincerely – Ling Zhang Dec 02 '15 at 04:38
  • I sent an email to your mailbox, please check it out. In the email, there are some ideas I want to share and discuss with you, Thank you – Ling Zhang Dec 03 '15 at 03:19
  • I'm sorry, I don't do off-line consultation. – Ben Bolker Dec 03 '15 at 03:32
  • Never Mind. Another question, how to get Rsquare, AIC(Akaike weight) for both model, I want to know which model is best to fit mydata, and I know function confint() to get 95% CI – Ling Zhang Dec 03 '15 at 10:17
  • Sir, I have updated my questions more clearly, please take an another look on it, sorry for bothering you again @Ben Bolker – Ling Zhang Dec 03 '15 at 10:40