
Hello, I am trying to eliminate all the orange tones from an image stored in a Bitmap. I need to run OCR on the image with Tesseract, and the orange color of the scanned document seems to interfere with the process, producing errors in the text. When I remove the orange color with Photoshop and then run the OCR, it works perfectly. The main problem is that the pixels are not all the same color: they are orange, but in different shades.

Bitmap modificar = new Bitmap("imagenamodificar.png");
        for (int ycount2 = 0; ycount2 < modificar.Height; ycount2++)
        {
            for (int xcount2 = 0; xcount2 < modificar.Width; xcount2++)
            {
                if (modificar.GetPixel(xcount2, ycount2) == Color.Orange)
                {
                    modificar.SetPixel(xcount2, ycount2, Color.White);
                }
            }
        }

This code does absolutely nothing; the image remains identical.

Then it occurred to me to compare against the pixel at (0,0), since it is always the color I want to eliminate.

Bitmap modificar = new Bitmap("imagenamodificar.png");
        for (int ycount2 = 0; ycount2 < modificar.Height; ycount2++)
        {
            for (int xcount2 = 1; xcount2 < modificar.Width; xcount2++)
            {
                if (modificar.GetPixel(xcount2, ycount2) == modificar.GetPixel(0,0))
                {
                    modificar.SetPixel(xcount2, ycount2, Color.White);
                }
            }
        }

But the problem is that this removes only a small part. Orange pixels remain because, as I mentioned before, not all the orange tones are the same. Can anyone think of something?

  • a) Named colors can only be compared to GetPixel colors via ToArgb. b) Something like GetHue allows matching the hue with an epsilon. c) Using LockBits will improve speed. d) So will a ColorMatrix. e) See [here](https://stackoverflow.com/questions/27374550/how-to-compare-color-object-and-get-closest-color-in-an-color/27375621#27375621) for examples of color matching. – TaW Mar 31 '18 at 03:23
  • @TaW I am still fairly new to programming and I cannot understand what the code in the example does. Could you help me? – Cristian Gerani Mar 31 '18 at 03:34
  • @Cristian It appears to convert the colour to the HSB colour space in order to compare the hue. – ProgrammingLlama Mar 31 '18 at 03:40
  • It has 3 different ways to compare colors. You should try each to see which comes closest to your needs. Two of the functions use the Color.GetHue function, which is built in. But the hue of orange is the same as the hue of many browns, so if you want to catch only a range of orange hues, the 3rd function (`closestColor3`, or actually just `ColorNum`) would be best. – TaW Mar 31 '18 at 07:33
  • Don't worry about the code at the end; it just creates the color chart. The functions will sort a list by closeness to a target color. Using just the `ColorNum` function will give you a distance number you can compare directly to an epsilon you define. – TaW Mar 31 '18 at 07:37
  • @TaW I've updated my example. I didn't understand the constants he was using, so I couldn't compare the results. It would be interesting to see the difference on a color wheel though. – TheGeneral Apr 01 '18 at 03:01
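
The first two points from the comments could be sketched roughly as follows. This is only an illustration, not the code from the linked answer: the class name `ColorMatch` and the methods `SameArgb` and `HueIsClose` are made up for this example, and the epsilon is something you would have to tune yourself. As TaW notes, many browns share a hue with orange, so a hue test alone may catch more than you want.

```csharp
using System;
using System.Drawing;

static class ColorMatch
{
    // A named color such as Color.Orange is not == to a color read from a
    // pixel, even when the channels match, so compare the ARGB values.
    public static bool SameArgb(Color a, Color b) => a.ToArgb() == b.ToArgb();

    // True when the two hues (degrees, 0..360) differ by at most epsilon,
    // measured the short way around the color circle.
    public static bool HueIsClose(Color a, Color b, float epsilon)
    {
        float diff = Math.Abs(a.GetHue() - b.GetHue());
        if (diff > 180f) diff = 360f - diff;
        return diff <= epsilon;
    }
}
```

With something like this, the test in the first loop of the question would become `ColorMatch.SameArgb(modificar.GetPixel(xcount2, ycount2), Color.Orange)`, although that still only matches one exact shade.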

1 Answer

Here are some key points to help you along your way:

  1. Don't use GetPixel and SetPixel; they are extremely slow.
  2. For speed, it is probably best to use unsafe code with pointer access and call LockBits to get a pinned array.
  3. You probably want to use a threshold to decide whether a particular pixel color is close to the one you want to remove.

A simple color threshold can be calculated as follows (you can also calculate this on hue).

Given

  • threshold is some int
  • a source color
  • a pixel color

Threshold

var thresh = threshold * threshold;

// decode the RGB channels from the image pointer and subtract the source color's channels (sR, sG, sB)
var r = ((*p >> 16) & 255) - sR;
var g = ((*p >> 8) & 255) - sG;
var b = ((*p >> 0) & 255) - sB;

// compare it against the threshold
if (r * r + g * g + b * b > thresh)
   continue;

Note: The link given in the comments by TaW is extremely helpful for figuring out color distance.

Use LockBits to get access to the scan lines and pin the memory.
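
To get a feel for which threshold to pick, the same distance test can be written as a small standalone helper and tried on a few sample colors. This is a sketch only: `ColorDistance`, `DistanceSquared` and `IsWithinThreshold` are names invented for this example, and it uses plain RGB distance, which is just one of the options discussed in the link.

```csharp
using System;
using System.Drawing;

static class ColorDistance
{
    // Squared Euclidean distance between two colors in RGB space.
    public static int DistanceSquared(Color a, Color b)
    {
        int r = a.R - b.R, g = a.G - b.G, bl = a.B - b.B;
        return r * r + g * g + bl * bl;
    }

    // Compares against threshold squared, so no square root is needed.
    public static bool IsWithinThreshold(Color pixel, Color source, int threshold)
        => DistanceSquared(pixel, source) <= threshold * threshold;
}
```

For example, with the source color (247, 107, 1), the shade (255, 165, 0) has a squared distance of 8² + 58² + 1² = 3429, so it is outside a threshold of 25 (625) but inside a threshold of 75 (5625). Pure white and pure black are both far outside even a threshold of 150.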

Bitmap.LockBits Method (Rectangle, ImageLockMode, PixelFormat)

Locks a Bitmap into system memory.

Code

// Requires: using System.Drawing; using System.Drawing.Imaging;
// The project must be compiled with unsafe code enabled.
private static unsafe void ConvertImage(string fromPath, string toPath, Color source, Color targetColor, double threshold)
{
   // compare squared distances so no square root is needed
   var thresh = threshold * threshold;
   var target = targetColor.ToArgb();

   using (var bmp = new Bitmap(fromPath))
   {
      // lock the whole bitmap as 32-bit pixels for direct access
      var data = bmp.LockBits(new Rectangle(0, 0, bmp.Width, bmp.Height), ImageLockMode.ReadWrite, PixelFormat.Format32bppPArgb);
      try
      {
         // split the source color into its rgb channels
         int sR = source.R, sG = source.G, sB = source.B;
         // at 32 bits per pixel the rows are contiguous, so the end pointer
         // can be computed once instead of on every iteration
         var end = (int*)data.Scan0 + bmp.Height * bmp.Width;

         for (var p = (int*)data.Scan0; p < end; p++)
         {
            // get the per-channel difference to the source color
            var r = ((*p >> 16) & 255) - sR;
            var g = ((*p >> 8) & 255) - sG;
            var b = ((*p >> 0) & 255) - sB;

            // skip pixels that are further away than the threshold
            if (r * r + g * g + b * b > thresh)
               continue;
            // poke the target color in
            *p = target;
         }
      }
      finally
      {
         // always unlock the bitmap, even if the loop throws
         bmp.UnlockBits(data);
      }
      bmp.Save(toPath);
   }
}
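
If you would rather avoid unsafe code, the same loop can be run over a managed array instead of a pointer. The sketch below is an assumption about how that could look, not part of the method above: `PixelBuffer.ReplaceClose` is a made-up helper, and it expects the pixels to have been copied out after LockBits (with a 32-bit pixel format) using `Marshal.Copy(data.Scan0, pixels, 0, pixels.Length)` and written back with `Marshal.Copy(pixels, 0, data.Scan0, pixels.Length)` before UnlockBits.

```csharp
using System;

static class PixelBuffer
{
    // Replaces every 32-bit ARGB pixel whose RGB distance to (sR, sG, sB)
    // is within the threshold; returns the number of pixels replaced.
    public static int ReplaceClose(int[] pixels, int sR, int sG, int sB, int target, int threshold)
    {
        int thresh = threshold * threshold;
        int replaced = 0;
        for (int i = 0; i < pixels.Length; i++)
        {
            // per-channel difference to the source color
            int r = ((pixels[i] >> 16) & 255) - sR;
            int g = ((pixels[i] >> 8) & 255) - sG;
            int b = (pixels[i] & 255) - sB;

            // leave pixels that are too far from the source color alone
            if (r * r + g * g + b * b > thresh) continue;

            pixels[i] = target;
            replaced++;
        }
        return replaced;
    }
}
```

This costs an extra copy of the image in memory, but it should still be far faster than GetPixel and SetPixel, and the returned count is handy when experimenting with thresholds.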

Usage

ConvertImage(@"d:\test.jpg", @"D:\result.bmp", Color.FromArgb(247, 107, 1), Color.Black, 25);

Note: I'm using a JPG color wheel, so the result is not as clean as it could be.


[Images: original image; results at threshold 25, threshold 75, and threshold 150; orange test at threshold 75 (two images)]


unsafe (C# Reference)

The unsafe keyword denotes an unsafe context, which is required for any operation involving pointers.

Unsafe Code and Pointers (C# Programming Guide)

In the common language runtime (CLR), unsafe code is referred to as unverifiable code. Unsafe code in C# is not necessarily dangerous; it is just code whose safety cannot be verified by the CLR. The CLR will therefore only execute unsafe code if it is in a fully trusted assembly. If you use unsafe code, it is your responsibility to ensure that your code does not introduce security risks or pointer errors.

TheGeneral