
I work in the auto industry, where the reliability of machine inference is a critical issue because of lawsuits and the like. Neural networks (NNs) are very popular now, but how reliable are they? People say a model was tested on 1,000 test samples. Well, that's not enough; how about 10,000 or more? And what can you say about untested or unseen data?

I don't mean only to raise the lack-of-data issue, but also the black-box nature of NNs. A Gaussian process (GP) is, I find, "safer", since the output comes with a predictive distribution (although that depends on the kernel you choose), and at least I know that unseen data will return predictions similar to those for similar seen data. What about NNs? Is there any nice distribution over the output? Can I safely assume the output of an NN changes continuously as the input changes? Thank you.
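For example, here is a minimal sketch of the kind of predictive distribution I mean, using scikit-learn's `GaussianProcessRegressor` on toy 1-D data (the kernel, noise level, and data are arbitrary illustrations, not my actual setup):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Toy 1-D regression data (illustrative only)
rng = np.random.RandomState(0)
X_train = rng.uniform(0, 10, size=(20, 1))
y_train = np.sin(X_train).ravel() + rng.normal(0, 0.1, size=20)

# GP with an RBF kernel; the kernel choice shapes the predictive distribution
gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=0.1 ** 2)
gp.fit(X_train, y_train)

# For any input -- seen or unseen -- the GP returns a mean AND a standard
# deviation, so I can see how uncertain it is away from the training data
X_test = np.linspace(0, 15, 5).reshape(-1, 1)
mean, std = gp.predict(X_test, return_std=True)
for x, m, s in zip(X_test.ravel(), mean, std):
    print(f"x={x:5.2f}  prediction={m:6.3f} +/- {s:.3f}")
```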

Similar topic: How to prove the reliability of a predictive model to executives?

JimSD

1 Answer


It is hard to say how a neural network will perform on unseen data. Like you said, a neural network is a black box, but that doesn't always make it unreliable.

For a neural network to really learn patterns rather than just overfit the data, you can use dropout. Dropout helps prevent overfitting and encourages the network to pick up broader patterns in the dataset; for example:
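A minimal sketch, assuming a TensorFlow/Keras setup (the layer sizes, dropout rate, and 10 input features are illustrative, not a recommendation):

```python
import tensorflow as tf

# Illustrative architecture: each Dropout layer randomly zeroes 20% of
# activations during training, discouraging memorization of single samples
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(10,)),           # 10 input features (arbitrary)
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(1),                     # single regression output
])
model.compile(optimizer="adam", loss="mse")
```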

Secondly, you should use a test set. This tells you how well the neural network performs on data it wasn't trained on. So if you have 1000 samples, you use 800 for training and 200 for testing. After training the network on those 800 samples, you test it on the 200 unseen samples, as in the sketch below.
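Continuing the model sketch above, a rough version of that split with stand-in data (replace the generated `X` and `y` with your own 1000 samples):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in data: 1000 samples with 10 features (replace with your own)
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))
y = X.sum(axis=1) + rng.normal(0, 0.1, size=1000)

# 800 samples for training, 200 held out for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Train on the 800 samples, then evaluate on the 200 the network never saw
model.fit(X_train, y_train, epochs=20, batch_size=32, verbose=0)
test_loss = model.evaluate(X_test, y_test, verbose=0)
print(f"MSE on the 200 held-out samples: {test_loss:.4f}")
```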

Additionally, you could generate some samples: construct inputs for which you expect a certain output and check the network's predictions against them, as sketched below.
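Continuing the toy setup above, where the target is roughly the sum of the features, such a sanity check might look like this (the probe inputs and expected values are specific to that stand-in data):

```python
import numpy as np

# Hand-crafted probes: with the stand-in data above, zeros should map
# to ~0 and a vector of ones to ~10
probes = np.array([np.zeros(10), np.ones(10)])
expected = probes.sum(axis=1)
predicted = model.predict(probes, verbose=0).ravel()
for e, p in zip(expected, predicted):
    print(f"expected ~{e:.1f}, network predicted {p:.3f}")
```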

But yes, it is hard to predict how the neural network will perform on 10,000 other samples. For example, you don't know whether those 10,000 samples follow the same patterns as the small set of 1,000 samples you trained it on.

Thomas Wagenaar