
For my bachelor thesis I am working with neural networks in finance, using R. In one part of the thesis I want to explain why it can be important to average the output of several neural networks.

I have written the following code:

library(neuralnet)

num_nn <- 10000   # number of networks to train
pred.var <- 5     # input value to predict for
explain <- 1:10
target <- explain^2
df <- data.frame(explain = explain, target = target)
result_df <- data.frame(pred = rep(NA_real_, num_nn))

for (i in 1:num_nn) {
  # Each fit starts from random weights, so the prediction differs between runs
  nn <- neuralnet(target ~ explain, data = df, hidden = c(32, 16), stepmax = 1e7, lifesign = "minimal")
  # Name the column after the model's input variable
  pred <- predict(nn, data.frame(explain = pred.var))

  result_df[i, "pred"] <- pred
}

df.mean <- data.frame(predicted_mean = rep(NA_real_, num_nn))

for (j in 1:num_nn) {
  # Mean of j predictions drawn without replacement
  x <- sample(result_df$pred, j, replace = FALSE)
  df.mean[j, "predicted_mean"] <- mean(x)
}

The example is meant to be very simple: the target variable is just the square of the explanatory variable.

The network should then predict the value for the input 5. I did this for 10000 networks in a loop and stored the predictions in a data frame. Because I work with neural networks, the output is not the same on every run, unlike, for example, a linear regression.

Then I draw one value from this data frame and take its mean, then draw two values and take their mean, and so on, until I draw all 10000 values. When I plot the result, the means scatter widely at the beginning, and the scatter shrinks as more predictions go into the mean. The means converge to a single number. This number should be 25, but they converge to 25.00157.
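The averaging step can be tried without training any networks. The sketch below is only an illustration under stated assumptions: `pred` is a simulated stand-in for `result_df$pred`, drawn with `rnorm` using an assumed mean of 25 and an assumed standard deviation of 0.05, and the names `subsample_mean`, `overall_mean` and `std_error` are hypothetical. It shows that, because sampling is done without replacement, the last mean is exactly the mean of all stored predictions, and it computes the standard error of that mean as a rough scale for the remaining gap.

```r
# Simulated stand-in for result_df$pred; NOT real network output.
# The mean (25) and standard deviation (0.05) are assumptions for illustration.
set.seed(1)
num_nn <- 300
pred <- rnorm(num_nn, mean = 25, sd = 0.05)

# Mean of j values drawn without replacement, for j = 1, ..., num_nn
subsample_mean <- vapply(
  seq_len(num_nn),
  function(j) mean(sample(pred, j, replace = FALSE)),
  numeric(1)
)

# For j = num_nn every value is drawn, so the last mean equals mean(pred)
overall_mean <- mean(pred)

# Standard error of the overall mean: the typical size of its sampling noise
std_error <- sd(pred) / sqrt(num_nn)

cat("last subsample mean:", subsample_mean[num_nn], "\n")
cat("overall mean:       ", overall_mean, "\n")
cat("standard error:     ", std_error, "\n")
```

With real predictions in place of the simulated ones, `plot(subsample_mean, type = "l")` would show the shrinking scatter, and comparing `abs(overall_mean - 25)` with `std_error` gives a first impression of whether the gap is larger than sampling noise alone.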

If you want to try this yourself, I recommend testing with 300 networks rather than 10000 like me ;) Just change the variable num_nn to 300.

My question to you professionals: why doesn't it converge to 25? Is this a numerical problem in R?

noahae