
I am trying to classify a set of images into grayscale and color groups. I have been using ImageMagick to do that: I compare each image to a grayscale version of itself and then use the Peak Error to decide whether it is a grayscale image, as shown here:

http://www.imagemagick.org/Usage/compare/#type_general

This works, but it produces false results for a lot of black and white images. I tracked the issue down and believe it is caused by pseudo-black pixels (e.g. rgb(0,0,5)). These pseudo-black pixels register as full color because, technically, they are.

Is there a better way to classify these images?

What would you suggest I do to get rid of these false results?

Sergio R.

2 Answers


Convert the images to another color space such as HSV and then check only the S component. S stands for saturation, which should be (nearly) 0 for every pixel of a grayscale image. Here is the documentation from ImageMagick: http://www.imagemagick.org/script/color.php
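As a rough illustration of the saturation check, here is a minimal Python sketch that works on a plain list of 8-bit (r, g, b) tuples and uses only the standard-library `colorsys` module. The function names are hypothetical, and loading pixels from an actual image file is left out. One thing worth knowing: HSV saturation is (max - min) / max, so a near-black pixel such as rgb(0,0,5) from the question gets a saturation of 1.0. The second function, `mean_chroma`, is an assumption of this sketch rather than part of the answer; it skips the division by brightness so that such pixels contribute very little.

```python
import colorsys

def saturation_stats(pixels):
    """Return (mean, max) HSV saturation for a list of 8-bit (r, g, b) tuples."""
    sats = [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)[1] for r, g, b in pixels]
    return sum(sats) / len(sats), max(sats)

def mean_chroma(pixels):
    """Mean of (max - min) / 255 per pixel; not divided by brightness, unlike S."""
    return sum((max(p) - min(p)) / 255 for p in pixels) / len(pixels)

gray_pixels = [(10, 10, 10), (200, 200, 200)]
pseudo_black = [(0, 0, 5)]

gray_mean, gray_max = saturation_stats(gray_pixels)    # both 0.0 for true grays
black_mean, black_max = saturation_stats(pseudo_black)  # both 1.0 despite looking black
black_chroma = mean_chroma(pseudo_black)                # small, since 5 / 255 is tiny
```

Which statistic to threshold, and at what value, would still have to be tuned on real images.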

Framester
  • I already tried this solution as well, and it is producing more false positives than the one I posted. I am trying to find a way to combine these two in order to get a more accurate classification. These two approaches fail in different ways, by the way. – Sergio R. Jan 20 '12 at 16:53
  • @SergioR. please give us a concrete example image and command where it fails so we can analyze why. – Ciro Santilli OurBigBook.com Sep 26 '15 at 09:14

Your problem is setting the margin of error too strictly for classifying an image as black and white. The trick is to not go too far the other way and falsely classify color images as black and white.

Do some research on Bayes classification. Basically, you classify a sample of your images manually. That gives you the overall probability of an image in your collection being black and white, as well as the probability of each average saturation value among black and white images and among color images. Then you use a formula to compare the probability that a new, unknown image is black and white against the probability that it is color, given its average saturation.

For example, say 10% of the images in your collection are black and white, 20% of black and white images have an average saturation of 5, but only 1% of color images have an average saturation of 5.

black and white posterior (unnormalized) for saturation of 5 = 0.1 * 0.2 = 0.02
color posterior (unnormalized) for saturation of 5 = 0.9 * 0.01 = 0.009

Since 0.02 > 0.009, you would classify an image with saturation of 5 as black and white.
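The arithmetic above can be sketched in a few lines of Python. The function name and parameters are hypothetical; the numbers are the ones from the example. Dividing each score by their sum turns them into proper probabilities, which does not change which class wins.

```python
def classify_by_saturation(prior_bw, p_sat_given_bw, p_sat_given_color):
    """Compare prior * likelihood for the two classes and normalize the result."""
    score_bw = prior_bw * p_sat_given_bw
    score_color = (1.0 - prior_bw) * p_sat_given_color
    total = score_bw + score_color
    label = "black and white" if score_bw > score_color else "color"
    return label, score_bw / total, score_color / total

# 10% of images are black and white; an average saturation of 5 occurs in
# 20% of black and white images but only 1% of color images.
label, p_bw, p_color = classify_by_saturation(0.1, 0.2, 0.01)
# label is "black and white"; p_bw is about 0.69 (0.02 / 0.029)
```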

The nice thing about Bayes classification is that you can use more than one factor in your determination, if you find another feature that you think is useful in addition to saturation. Maybe maximum saturation works better than average, or you can combine the two. Also, you don't have to guess at the right cutoff point, because it's based on your training set.
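To show how two features could be combined, here is a minimal naive Bayes sketch in Python. Several things in it are assumptions of this sketch and not part of the answer: the features (mean and maximum saturation, each scaled to 0..1), the histogram binning, the Laplace smoothing, the tiny made-up training set, and the "naive" assumption that features are independent given the class.

```python
import math
from collections import Counter, defaultdict

def to_bin(value, n_bins):
    """Map a value in [0, 1] to a histogram bin index."""
    return min(int(value * n_bins), n_bins - 1)

def train(samples, n_bins=10):
    """samples: list of (label, features), each feature a value in [0, 1]."""
    label_counts = Counter(label for label, _ in samples)
    bin_counts = defaultdict(Counter)  # (label, feature index) -> bin histogram
    for label, features in samples:
        for i, value in enumerate(features):
            bin_counts[(label, i)][to_bin(value, n_bins)] += 1
    return label_counts, bin_counts, n_bins

def classify(model, features):
    """Return the label with the highest log(prior) + sum of log(likelihood)."""
    label_counts, bin_counts, n_bins = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)
        for i, value in enumerate(features):
            hits = bin_counts[(label, i)][to_bin(value, n_bins)]
            # Laplace smoothing keeps unseen bins from producing log(0).
            score += math.log((hits + 1) / (count + n_bins))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical hand-labeled training data: (label, (mean_sat, max_sat)).
training = [
    ("bw", (0.01, 0.05)), ("bw", (0.02, 0.08)), ("bw", (0.03, 0.32)),
    ("color", (0.40, 0.95)), ("color", (0.55, 1.00)),
    ("color", (0.25, 0.92)), ("color", (0.35, 0.85)),
]
model = train(training)
low_sat_label = classify(model, (0.02, 0.06))
high_sat_label = classify(model, (0.45, 0.97))
```

With a real collection, the training set would need to be far larger than this, and the bin count is something to experiment with.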

Karl Bielefeldt