Your problem is the starting values of the parameters and the log likelihood
logl <- -N/2 * log(2*pi*sigma2) - 1/(2*sigma2) * sum((beta0 + beta1*x1 - y)^2)
This breaks down if sigma2 becomes zero, for two reasons: you take the log of zero (not nice) and you divide by zero (not at all nice). In other words, you need to ensure that your sigma2 stays positive.
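You can see the failure mode directly at the R prompt (a quick illustration, not part of the original answer):

```r
log(0)   # -Inf: the log term of the likelihood blows up
1/0      # Inf: so does the 1/(2*sigma2) term
log(-1)  # NaN (with a warning) if the optimizer wanders to a negative sigma2
```

Any of these poisons the objective function, and optim() cannot recover from non-finite values.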
One simple (and classical) way of doing exactly that is to exp()
the parameter input, so replace
sigma2 <- theta[3]
with
sigma2 <- exp(theta[3])
This lets the optim() function search over the full real line while the transformation maps it to the positive half-line, so the likelihood always makes sense. Just remember that the parameter estimate for sigma2 returned by optim() is on the log scale, so exponentiate it to recover sigma2 itself.
Update with full code and output
The full code (with the change shown above) is
lm.loglik3 <- function(y, x, theta) {
  N <- length(y)
  # theta contains our parameters to be estimated
  beta0  <- theta[1]
  beta1  <- theta[2]
  sigma2 <- exp(theta[3])  # transformed so the variance is always positive
  # x contains our covariates
  x1 <- x
  logl <- -N/2 * log(2*pi*sigma2) - 1/(2*sigma2) * sum((beta0 + beta1*x1 - y)^2)
  return(-logl)  # optim() minimizes, so return the negative log likelihood
}
When initialized with
stval <- c(0, 0, 0)
X <- c(8, 2, 4, 1, 1, 13, 2, 9, 2, 1, 7, 2, 1, 2, 4, 8, 3, 2, 2, 2,
3, 1, 14, 2, 1, 5, 2, 2, 4, 5)
Y <- c(18, 17, 13, 13, 20, 11, 11, 9, 8, 14, 8, 7, 15, 13, 12, 17,
10, 16, 14, 10, 13, 10, 12, 16, 18, 9, 9, 15, 13, 18)
we get
> res3
$par
[1] 13.4598331 -0.1292182 2.4549731
$value
[1] 79.3809
$counts
function gradient
178 NA
$convergence
[1] 0
$message
NULL
$hessian
[,1] [,2] [,3]
[1,] 2.57596526 9.8745335 0.00558974
[2,] 9.87453350 67.7478863 0.00621170
[3,] 0.00558974 0.0062117 14.98815391
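To report the variance on its original scale, exponentiate the third element of res3$par:

```r
log.sigma2.hat <- 2.4549731        # third element of res3$par above
sigma2.hat <- exp(log.sigma2.hat)  # back-transform to the variance scale
sigma2.hat                         # roughly 11.65
```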