2

I am comparing Histogram of Oriented Gradients (HoG) with a Convolutional Neural Network (CNN) for weed detection. I have two datasets of two different weeds.
The CNN architecture is a 3-layer network.

1) The first dataset contains two classes and has 18 images. I enlarged it using data augmentation (rotation, added noise, illumination changes).

With the CNN I get a test accuracy of 77%, and with HoG plus an SVM 78%.

2) The second dataset contains leaves of two different plants. Each class contains 2500 images, without data augmentation.

For this dataset, I get a test accuracy of 94% with the CNN and 80% with HoG plus an SVM.

My question is: why am I getting higher accuracy with HoG on the first dataset? The CNN should be much better than HoG.

The only reason that comes to mind is that the first dataset has only 18 images and is less diverse than the second dataset. Is that correct?

Addee

1 Answer

1

Yes, your intuition is right: such a small data set (just 18 images before data augmentation) can cause the worse performance. In general, CNNs usually need at least thousands of images. SVMs do not perform as badly because of the regularization (which you are most probably using) and because the model probably has far fewer parameters. There are ways to regularize deep nets; for example, with your first data set you might want to give dropout a try, but I would rather try to acquire a substantially larger data set.
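To make the parameter-count argument concrete, here is a rough back-of-the-envelope sketch. Everything in it is an assumption for illustration, since the question does not give the actual setup: 64x64 RGB inputs, HoG with 8x8-pixel cells, 2x2-cell blocks and 9 orientations, and a hypothetical CNN with three 3x3 convolution layers (32, 64, 128 filters, each followed by 2x2 pooling) and a two-class dense output.

```python
# Rough parameter counts for a linear SVM on HoG features versus a small CNN.
# All sizes below are hypothetical; the real image size and CNN layout
# are not stated in the question.

def hog_feature_length(height, width, cell=8, block=2, orientations=9):
    """Length of a HoG descriptor with overlapping blocks (stride of one cell)."""
    cells_y, cells_x = height // cell, width // cell
    blocks_y, blocks_x = cells_y - block + 1, cells_x - block + 1
    return blocks_y * blocks_x * block * block * orientations

def conv_params(in_channels, out_channels, kernel=3):
    """Weights plus one bias per filter for a square convolution layer."""
    return (kernel * kernel * in_channels + 1) * out_channels

def dense_params(n_in, n_out):
    """Weights plus one bias per output unit for a fully connected layer."""
    return (n_in + 1) * n_out

HEIGHT = WIDTH = 64  # assumed input size

# A linear SVM learns one weight per HoG feature plus a bias.
svm_params = hog_feature_length(HEIGHT, WIDTH) + 1

# Hypothetical CNN: three conv layers, each followed by 2x2 pooling.
channels = [3, 32, 64, 128]
cnn_params = sum(conv_params(i, o) for i, o in zip(channels, channels[1:]))
side = HEIGHT // 2 ** 3  # spatial size after three poolings
cnn_params += dense_params(side * side * channels[-1], 2)

print("linear SVM on HoG:", svm_params)
print("small CNN:        ", cnn_params)
```

Under these assumptions the CNN has on the order of sixty times more parameters to fit than the SVM, and the SVM's features are hand-designed rather than learned, which is one way to see why 18 source images are far too few for the CNN.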

tsh
  • I have updated my result for the 2nd dataset: it is 80%, not 60%. – Addee May 23 '17 at 23:37
  • That is a big change. Are you sure your other numbers are correct? Nonetheless, the answer still holds: it is most probably the lack of regularization in the CNN that causes the drop in performance for the first dataset. – tsh May 24 '17 at 08:16
  • Yes, I have checked the other numbers and they are correct. I added dropout to the CNN architecture, but the performance stayed the same. One more thing: the training accuracy is 77%, which is quite low. What does this low accuracy mean? – Addee May 25 '17 at 06:41
  • OK, now you are getting more or less the same test accuracy for the two methods, which is most probably the maximum you can get with so few samples. Provided your CNN is complex enough, you could try training for more iterations. That might improve your test accuracy, but I would expect it not to. If you are really interested in automating this classification task, the best thing you can do is collect more data. – tsh May 29 '17 at 12:27
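
As a rough illustration of why 77% and 78% count as "more or less the same", the sketch below computes a 95% Wilson score interval for a measured accuracy. The test-set size of 100 images is purely an assumption, since the thread does not state it; with a smaller test set the interval would be wider still.

```python
import math

def wilson_interval(accuracy, n, z=1.96):
    """Wilson score interval for a proportion measured on n samples.

    z=1.96 corresponds to roughly 95% confidence.
    """
    denom = 1 + z * z / n
    center = (accuracy + z * z / (2 * n)) / denom
    half = z * math.sqrt(accuracy * (1 - accuracy) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

N_TEST = 100  # hypothetical test-set size, not given in the thread

cnn_low, cnn_high = wilson_interval(0.77, N_TEST)
hog_low, hog_high = wilson_interval(0.78, N_TEST)

print(f"CNN 77%:     [{cnn_low:.3f}, {cnn_high:.3f}]")
print(f"HoG+SVM 78%: [{hog_low:.3f}, {hog_high:.3f}]")
```

Under this assumption each interval spans well over ten percentage points and contains the other method's score, so a one-point gap on the first dataset says nothing about which method is better.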