We want to tell whether an image is good or bad.
We run a fixed set of checks to classify each image as good or bad.
For example:
1. Background color.
2. Height-to-width (aspect) ratio.
3. No watermarks.
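
To make checks 1 and 2 concrete, here is a minimal sketch of how they might look. It assumes the image has already been decoded into rows of `(r, g, b)` tuples by some image library; the function names, the white background, and all thresholds are hypothetical placeholders, not values we actually use. The watermark check is not covered.

```python
def aspect_ratio_ok(width, height, lo=0.75, hi=1.33):
    """Return True if width / height lies within [lo, hi] (assumed limits)."""
    if width <= 0 or height <= 0:
        return False
    return lo <= width / height <= hi


def border_pixels(pixels):
    """Collect the pixels on the outer one-pixel border of the image."""
    height = len(pixels)
    width = len(pixels[0])
    border = list(pixels[0])            # top row
    if height > 1:
        border += list(pixels[-1])      # bottom row
    for row in pixels[1:-1]:            # left and right edges of middle rows
        border.append(row[0])
        if width > 1:
            border.append(row[-1])
    return border


def background_ok(pixels, expected=(255, 255, 255), tolerance=10,
                  min_fraction=0.9):
    """Treat the border as a proxy for the background (an assumption).

    Returns True if at least min_fraction of the border pixels are within
    tolerance of the expected color on every channel.
    """
    border = border_pixels(pixels)
    close = sum(
        1 for p in border
        if all(abs(c - e) <= tolerance for c, e in zip(p, expected))
    )
    return close / len(border) >= min_fraction


if __name__ == "__main__":
    white, red = (255, 255, 255), (255, 0, 0)
    # 4x4 white image with a red 2x2 "product" in the middle
    img = [[white] * 4 for _ in range(4)]
    for y in (1, 2):
        for x in (1, 2):
            img[y][x] = red
    print(aspect_ratio_ok(4, 4), background_ok(img))
```

Sampling only the border is a shortcut that assumes the product sits in the middle of the frame; it would misjudge images where the subject touches the edges.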
In general, we want only GOOD images. We fetch the images from a website and run these checks to validate that website's images.
Currently, we crawl the website and try to pick out the relevant images (say, product images on e-commerce websites) by excluding images that are common across all pages. An alternative is to search Google with the parameter "site:website name", which reduces the effort of identifying images.
I haven't tried a color histogram approach yet.
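
For reference, this is roughly what I understand a color histogram comparison to involve. It is an untested idea rather than something we run: the sketch assumes decoded `(r, g, b)` pixel rows, and the function names and the bin count are hypothetical.

```python
def color_histogram(pixels, bins=4):
    """Build a normalized RGB histogram with bins**3 buckets.

    Each 0-255 channel value is mapped to one of `bins` coarse levels.
    """
    hist = [0] * (bins ** 3)
    total = 0
    for row in pixels:
        for r, g, b in row:
            # c * bins // 256 always lands in 0..bins-1 for c in 0..255
            ri, gi, bi = r * bins // 256, g * bins // 256, b * bins // 256
            hist[ri * bins * bins + gi * bins + bi] += 1
            total += 1
    if total == 0:
        raise ValueError("image has no pixels")
    return [count / total for count in hist]


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: 1 means identical color distributions."""
    return sum(min(a, b) for a, b in zip(h1, h2))


if __name__ == "__main__":
    white_img = [[(255, 255, 255)] * 4 for _ in range(4)]
    red_img = [[(255, 0, 0)] * 4 for _ in range(4)]
    h_white = color_histogram(white_img)
    h_red = color_histogram(red_img)
    print(histogram_intersection(h_white, h_white))
    print(histogram_intersection(h_white, h_red))
```

One possible use would be comparing each candidate image against the histogram of a known GOOD image and flagging low scores, though a histogram ignores where the colors are, so it cannot by itself tell background from product.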
What would be a better approach to this problem? Pointers to research papers or open-source libraries (like Mahout) that are easy to implement would also be useful.