
I'm trying to participate in a challenge for classifying dashboard camera (dashcam) images from a car, with the labels being: traffic light red / green / non-existent. The traffic lights are only a small part of the image, and no bounding box is supplied.

I'm currently trying to fine-tune an Inception network as suggested here, but I'm only getting 0.55-0.6 accuracy. I need to achieve 0.95+.
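For reference, this is not my exact code, just a minimal sketch of the kind of fine-tuning setup I mean, assuming TensorFlow/Keras with an ImageNet-pretrained InceptionV3 backbone; the head layers and hyperparameters are illustrative choices.

```python
# Minimal sketch: fine-tune InceptionV3 for red / green / no-traffic-light.
# Hyperparameters and head architecture are illustrative assumptions.
import tensorflow as tf
from tensorflow.keras.applications import InceptionV3
from tensorflow.keras import layers, models

# ImageNet-pretrained backbone, global-average-pooled features.
base = InceptionV3(weights="imagenet", include_top=False, pooling="avg",
                   input_shape=(299, 299, 3))
base.trainable = False  # freeze the pretrained weights initially

model = models.Sequential([
    base,
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(3, activation="softmax"),  # red / green / none
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```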

I think the network is not performing well because the traffic light occupies such a small portion of the image.

How can I make better progress with this?

Alon Burg

1 Answer


I suggest that instead of using the entire image at once, you take crops of the image with an overlapping sliding window, as in the sketch below. You will need to label the crops as well.
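A minimal sketch of what overlapping sliding-window cropping could look like, assuming NumPy image arrays; the crop size of 299 (Inception's input size) and the stride giving roughly 50% overlap are illustrative assumptions, not values from the challenge.

```python
import numpy as np

def sliding_window_crops(image, crop_size=299, stride=150):
    """Yield (top-left coordinate, crop) pairs of overlapping square crops
    from an H x W x C image array."""
    h, w = image.shape[:2]
    for y in range(0, max(h - crop_size, 0) + 1, stride):
        for x in range(0, max(w - crop_size, 0) + 1, stride):
            yield (y, x), image[y:y + crop_size, x:x + crop_size]

# Example: a dummy 720p dashcam frame yields a grid of overlapping crops,
# each of which would need its own red/green/none label for training.
frame = np.zeros((720, 1280, 3), dtype=np.uint8)
crops = list(sliding_window_crops(frame))
print(len(crops), "crops of shape", crops[0][1].shape)
```

At inference time the per-crop predictions can be aggregated over the whole frame, e.g. by taking the most confident red/green crop and falling back to "none" otherwise.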

Harsh Wardhan
  • Doesn't that mean that I need a dataset of only traffic lights to train on first? Or more specifically, a dataset of red/green/street background (i.e. no traffic light)? – Alon Burg Dec 06 '16 at 19:16
  • How are you going to train without any labeled data, anyway? – Harsh Wardhan Dec 06 '16 at 19:20
  • I do have labeled data: the dashcam images with just labels of red/green/none... no bounding box. – Alon Burg Dec 06 '16 at 19:35
  • So when you create the crops during training, you need to label them as well. This does involve a lot of manual work, but right now I can't think of a better way. – Harsh Wardhan Dec 06 '16 at 19:40