I'm trying to develop a program in Python that recognizes flowchart image files. The result should be either "yes, this is a flowchart" or "no, this is not a flowchart".
I have watched a video series that classifies dog and cat images. There, the dataset has two categories, dogs and cats. But I only have one category: flowcharts. How can I separate flowchart images from everything else?

- Create a second category: "not a flowchart" – neuhaus Jul 16 '19 at 14:22
- You can do it with a single target column named "isFlowchart" and depending on the result you get from your neural network, you can use sigmoid function to decide whether that is good enough to classify that image as Flowchart. – recepinanc Jul 16 '19 at 14:29
- If you search in your browser for "one-class classification", you'll find references that can explain this much better than we can manage here. – Prune Jul 16 '19 at 17:51
2 Answers
Well, in both cases you have two classes: Cat / Dog and Flowchart / Not a flowchart. So you could apply the same principles to these two classes.
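As a rough illustration of the two-class idea, here is a minimal sketch in plain Python: a tiny logistic regression with a single sigmoid output, where 1 means "flowchart" and 0 means "not a flowchart". The two features and all the numbers are made up for the example; a real project would more likely feed the image pixels into a neural network, as in the dog/cat series.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, labels, lr=0.5, epochs=2000):
    """Fit a logistic regression with plain stochastic gradient descent."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y  # gradient of the log loss with respect to z
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def is_flowchart(x, weights, bias, threshold=0.5):
    """Answer yes/no by thresholding the sigmoid output."""
    p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
    return p >= threshold


# Hypothetical features per image (assumed, not from a real dataset):
# (share of dark pixels on straight lines, number of box-like shapes / 10)
samples = [(0.9, 0.8), (0.8, 0.5), (0.85, 0.6), (0.1, 0.0), (0.2, 0.1), (0.3, 0.0)]
labels = [1, 1, 1, 0, 0, 0]  # 1 = flowchart, 0 = not a flowchart

weights, bias = train(samples, labels)
print(is_flowchart((0.95, 0.7), weights, bias))
print(is_flowchart((0.05, 0.0), weights, bias))
```

The threshold of 0.5 is only a default; raising it makes the classifier stricter about what it calls a flowchart.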
To detect flowcharts, you could also try to identify patterns in the image that are characteristic of flowcharts, such as lines, rectangles, or text.
This could lead to better results and would not require the huge training dataset that deep learning needs. The subject is too wide to answer completely here, but I'd encourage you to go in that direction.
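To give a feel for the pattern-based approach, here is a toy sketch in plain Python. It assumes the image has already been converted to a binary grid (1 = dark pixel, 0 = background) and measures what share of the dark pixels sit on long horizontal or vertical runs. The function name, the `min_run` value, and the idea of using this share as a flowchart cue are assumptions for illustration. In practice you would probably reach for an image library such as OpenCV, which provides line and contour detection (for example `cv2.HoughLinesP` and `cv2.findContours`).

```python
def straight_line_share(image, min_run=5):
    """Fraction of dark pixels lying on a horizontal or vertical
    run of at least `min_run` consecutive dark pixels."""
    height, width = len(image), len(image[0])
    on_line = [[False] * width for _ in range(height)]

    def mark_runs(cells):
        run = []
        for r, c in cells:
            if image[r][c]:
                run.append((r, c))
                continue
            if len(run) >= min_run:
                for rr, cc in run:
                    on_line[rr][cc] = True
            run = []
        if len(run) >= min_run:  # run that reaches the image border
            for rr, cc in run:
                on_line[rr][cc] = True

    for r in range(height):
        mark_runs([(r, c) for c in range(width)])
    for c in range(width):
        mark_runs([(r, c) for r in range(height)])

    dark = sum(sum(row) for row in image)
    if dark == 0:
        return 0.0
    return sum(sum(row) for row in on_line) / dark


def draw_box(size, top, left, bottom, right):
    """Blank size x size grid with one rectangle outline drawn on it."""
    img = [[0] * size for _ in range(size)]
    for c in range(left, right + 1):
        img[top][c] = img[bottom][c] = 1
    for r in range(top, bottom + 1):
        img[r][left] = img[r][right] = 1
    return img


box = draw_box(12, 2, 2, 8, 9)  # looks like a flowchart node
checker = [[(r + c) % 2 for c in range(12)] for r in range(12)]  # no long runs

print(straight_line_share(box))
print(straight_line_share(checker))
```

A single number like this will not separate flowcharts from tables, floor plans, or scanned text on its own. It is better seen as one feature among several that a classifier could combine.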

This is a hard problem to solve, because the problem space is so large. You basically have two possible classifications -- "flowchart" and "not flowchart". The hard part is "not flowchart". You would need a huge training dataset of images that are not flowcharts in order to achieve even decent results. On one hand, it's easy to acquire such a training dataset, because you just need a bunch of random images. On the other hand, this would require a lot of time to train, would take up a lot of storage space, and you still might not achieve the accuracy that you're looking for.
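For the data-collection side, a minimal sketch of building the two-class file list might look like the following. It assumes the flowcharts sit in one folder and the random "not flowchart" images in another; the function name, the folder layout, and the accepted file suffixes are hypothetical.

```python
from pathlib import Path

# Assumed set of image file types; adjust to whatever you actually collect.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def labeled_images(flowchart_dir, other_dir):
    """Return (path, label) pairs: 1 = flowchart, 0 = not a flowchart."""
    pairs = []
    for folder, label in ((flowchart_dir, 1), (other_dir, 0)):
        for path in sorted(Path(folder).iterdir()):
            if path.suffix.lower() in IMAGE_SUFFIXES:
                pairs.append((path, label))
    return pairs
```

Calling `labeled_images("flowcharts", "other")` with two such folders gives a list you can shuffle and split into training and validation sets.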