
So here's the deal: I have an ASP.NET API set up on Azure that consists of two parts: a scheduled job that fetches 35 images (width: 2000px, height: 1450px) from an external website and saves them to my server, and a GET endpoint that takes an X and a Y value as parameters. What I want to do is the following: when the user supplies the X and Y, the API should go through all 35 images and get the color of that specific point.

I currently have this:

string Data = "";


foreach (string FilePath in Directory.GetFiles(HttpContext.Current.Server.MapPath("~/Content/Images/")))
{ 
     Bitmap ImageHolder = new Bitmap(FilePath);
     Color color = ImageHolder.GetPixel(PixelX, PixelY);

     string ColorString = color.R.ToString() + " " + color.G.ToString() + " " + color.B.ToString();
     string Time = Path.GetFileNameWithoutExtension(FilePath);

     Data = Data + Time + "|" + ColorString + " ";
}
return Data;

Response:

1015|0 0 0 1020|0 0 0 1025|0 0 0 1030|0 0 0 1035|0 0 0 1040|0 0 0 1045|0 0 0 1050|0 0 0 1055|0 0 0 1100|0 0 0 1105|0 0 0 1110|0 0 0 1115|0 0 0 1120|0 0 0 1125|0 0 0 1130|0 0 0 1135|0 0 0 1140|0 0 0 1145|0 0 0 1150|0 0 0 1155|0 0 0 1200|0 0 0 1205|0 0 0 1210|0 0 0 1215|0 0 0 1220|0 0 0 1225|0 0 0 1230|0 0 0 1235|0 0 0 1240|0 0 0 1245|0 0 0 1250|0 0 0

Now, this all works. But when I host it and use it as an API, the response time is sometimes over 6000 milliseconds.

Using a Stopwatch, I can see that all the code runs fine, but that this line sometimes takes up to 500 milliseconds:

Bitmap ImageHolder = new Bitmap(FilePath);

Any ideas on how to decrease the response time and speed the process up? Pre-calculating everything in the scheduled cron job seems like a lot of effort, as the images are quite big, and storing every point would mean a huge amount of data.

Niels
  • Partly off topic but if this is the `Bitmap` class that implements `IDisposable` you should wrap it in a using block. (I'm wondering if your timing is just when the objects are getting disposed) – Sayse Jul 16 '15 at 10:33
  • How big are the Bitmap objects? How often do they change? How often is this code called? Could you add some kind of cache to hold them in memory to save you loading them off the disk everytime? – Liam Jul 16 '15 at 10:33
  • In addition to the linked duplicate which covers faster ways to read a pixel from a bitmap than using `GetPixel`, this sounds like a perfectly parallelizable task to me! ie, get the pixel from each image in parallel, then build up the string (protip, dont use string concatenation, use a `StringBuilder` or `String.Join`) – Jamiec Jul 16 '15 at 10:43
  • 2
    Suggest put Bitmap in memory cache first time, then reuse it from the cache, see MemoryCache (http://stackoverflow.com/questions/7599732/caching-in-a-console-application, http://coders-corner.net/2013/05/18/memorycache/). public Bitmap GetImage(string fileName) { if (Cache[fileName] != null) { return (Bitmap)(Cache[fileName]); } Bitmap image = new Bitmap(fileName); Cache[cacheKey].Insert(image); retur image; } Also don't use Data = Data + "some string", use StringBuilder instead. – nikolai.serdiuk Jul 16 '15 at 10:44
  • This is **not** a duplicate of http://stackoverflow.com/questions/24701703/c-sharp-faster-alternatives-to-setpixel-and-getpixel-for-bitmaps-for-windows-f?lq=1 . The other question (and answers) only deal with increasing performance of multiple accesses to data on the same bitmap - the question here is how to speedup reading one, and only one pixel out of each bitmap. – Luaan Jul 16 '15 at 11:53
  • I'll try out the suggestions. @nikolai.serdiuk - but with every request the cache will be cleared right? I mean, user A would like to have the colors for Point(233, 456) and user B for Point(1569, 652)? – Niels Jul 16 '15 at 12:08
  • @Jamiec - sounds promising, any examples on how to set this up? – Niels Jul 16 '15 at 12:10
  • Have you tried running this locally to see if it's an Azure I/O issue? The sample PNG you posted is only 131KB, so I don't see any reason that it'd take 0.5s to load. – Mark Brackett Jul 17 '15 at 18:11

2 Answers


It seems that the files you're dealing with are simple 24-bit bitmaps, right?

In that case, you could avoid using GDI+ altogether (it's a bad idea to use it in ASP.NET anyway) and just parse the bitmap data directly. This means you don't even have to read the whole file - just the header and whatever pixel you need.

If you're indeed working with simple 24-bit bitmaps, the following should work just fine:

Color GetPixel(string fileName, int x, int y)
{
    var buffer = new byte[32];

    using (var file = File.OpenRead(fileName))
    {
        if (file.Read(buffer, 0, 32) < 32) return Color.Empty;

        // Bitmap type. Pretty much everything you find is BM.
        var type = Encoding.ASCII.GetString(buffer, 0, 2);
        if (type != "BM") return Color.Empty;

        // Data offset
        var offset = BitConverter.ToInt32(buffer, 10);

        // Windows bitmaps have width and height in a fixed place:
        var width = BitConverter.ToInt32(buffer, 18);
        var height = BitConverter.ToInt32(buffer, 22);

        // x == width (or y == height) would already be out of range
        if (x >= width || y >= height) return Color.Empty;

        // Three bytes per pixel, padded to multiples of four
        var rowSize = width * 3 + ((4 - ((width * 3) % 4)) % 4);

        // And get our pixel - since we're dealing with non-compressed,
        // non-indexed, 24-bit pixels, this is easy. Note that bitmaps are
        // usually stored bottom-up (last row first):
        file.Seek(offset + ((rowSize * (height - y - 1)) + x * 3), SeekOrigin.Begin);

        if (file.Read(buffer, 0, 3) < 3) return Color.Empty;
        // Alpha
        buffer[3] = 0xFF;

        var color = BitConverter.ToInt32(buffer, 0);
        return Color.FromArgb(color);
    }
}

Even if your files aren't simple 24-bit bitmaps, that cron job of yours shouldn't have a problem converting them - if you want easy indexing into the bitmap data, you don't have much of a choice anyway :)
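As a sketch of that conversion step (the method name and paths are illustrative; this assumes System.Drawing is available to the scheduled job), re-saving each downloaded image as an uncompressed 24-bit BMP lets the parser above seek straight to a pixel:

```
// Hypothetical cron-job helper: force a 24bppRgb surface and save as BMP,
// so the direct-seek parser above can rely on the 24-bit layout.
using System.Drawing;
using System.Drawing.Imaging;

static void ConvertTo24BitBmp(string sourcePath, string targetPath)
{
    using (var source = Image.FromFile(sourcePath))
    using (var bmp = new Bitmap(source.Width, source.Height, PixelFormat.Format24bppRgb))
    {
        using (var g = Graphics.FromImage(bmp))
        {
            // Redraw the source onto the 24-bit surface (flattens any alpha)
            g.DrawImage(source, 0, 0, source.Width, source.Height);
        }
        bmp.Save(targetPath, ImageFormat.Bmp);
    }
}
```

The trade-off is disk space for latency: the BMPs are much bigger than the PNGs, but a single pixel can then be read with one seek.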

It needs just a few changes to support 8-bit indexed bitmaps:

Color GetPixel(string fileName, int x, int y)
{
    var buffer = new byte[32];

    using (var file = File.OpenRead(fileName))
    {
        if (file.Read(buffer, 0, 32) < 32) return Color.Empty;

        // Bitmap type. Pretty much everything you find is BM.
        var type = Encoding.ASCII.GetString(buffer, 0, 2);
        if (type != "BM") return Color.Empty;

        // Data offset
        var offset = BitConverter.ToInt32(buffer, 10);

        // Windows bitmaps have width and height in a fixed place:
        var width = BitConverter.ToInt32(buffer, 18);
        var height = BitConverter.ToInt32(buffer, 22);

        // x == width (or y == height) would already be out of range
        if (x >= width || y >= height) return Color.Empty;

        // One byte per pixel, padded to multiples of four
        var rowSize = width + ((4 - ((width) % 4)) % 4);

        // Now we're going to read an index into our palette
        file.Seek(offset + ((rowSize * (height - y - 1)) + x), SeekOrigin.Begin);
        if (file.Read(buffer, 0, 1) < 1) return Color.Empty;

        // Jump to the palette record and get the actual color
        file.Seek(54 + buffer[0] * 4, SeekOrigin.Begin);
        if (file.Read(buffer, 0, 4) < 4) return Color.Empty;

        // The fourth palette byte is reserved (usually 0), so force full alpha
        buffer[3] = 0xFF;

        var color = BitConverter.ToInt32(buffer, 0);
        return Color.FromArgb(color);
    }
}

If you want to avoid writing your bitmap parsing code, you'll just have to make sure the bitmaps are always loaded and parsed in memory - the bottleneck isn't the GetPixel, it's loading the Bitmap from disk.

It might be worth it to cache the headers and the palette of each of the images, to avoid some of the seeking - I'm not sure if it's going to help at all, but the basic idea is like this:

private static readonly ConcurrentDictionary<string, byte[]> _cache =
    new ConcurrentDictionary<string, byte[]>();

Color GetPixel(string fileName, int x, int y)
{
  var buffer = new byte[3];

  using (var file = File.OpenRead(fileName))
  {
    byte[] headers;
    if (!_cache.TryGetValue(fileName, out headers))
    {
      headers = new byte[1078];
      if (file.Read(headers, 0, headers.Length) < headers.Length) return Color.Empty;

      _cache.TryAdd(fileName, headers);
    }

    // Now read the headers as before, using the headers local instead of buffer
    // ...

    file.Seek(offset + ((rowSize * (height - y - 1)) + x), SeekOrigin.Begin);

    if (file.Read(buffer, 0, 1) < 1) return Color.Empty;

    // Force full alpha - the reserved palette byte is usually 0
    var color = BitConverter.ToInt32(headers, 54 + buffer[0] * 4) | unchecked((int)0xFF000000);

    return Color.FromArgb(color);
  }
}
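Per request, the loop could then look something like this - a sketch combining the comment suggestions (one pixel read per file, String.Join instead of repeated concatenation, and PLINQ to read the files in parallel; GetColors is a hypothetical wrapper around the GetPixel above):

```
// Hypothetical per-request method: one cheap pixel read per file,
// joined with String.Join rather than string concatenation.
string GetColors(int x, int y)
{
    var parts = Directory.GetFiles(HttpContext.Current.Server.MapPath("~/Content/Images/"))
        .AsParallel().AsOrdered()   // the reads are independent, so parallelize
        .Select(path =>
        {
            var color = GetPixel(path, x, y);
            return Path.GetFileNameWithoutExtension(path) + "|" +
                   color.R + " " + color.G + " " + color.B;
        });

    return string.Join(" ", parts);
}
```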
Luaan
  • I think I'm using a 8-bit image. What would that mean for your code? – Niels Jul 16 '15 at 13:56
  • 1
    @Niels Well, if you want to keep it 8-bit, you'll need to read the palette as well (assuming it's an indexed image), and after finding the pixel value the same way as here (except that each pixel only takes a single byte instead of three), you'll need to find the correct color based on the palette. You can check what kind of bitmap you're dealing with by looking at the raw file - a 2-byte value on offset 28 will tell you the bitness (1, 8, 24, 32... that kind), and offset 46 will tell you how many records the palette has. The palette is right after the header (offset 54), each record 4 bytes. – Luaan Jul 16 '15 at 14:27
  • Thanks! I checked, and indeed - it is an 8 bit image. As the colors are quite specific, and also to overcome more traffic/converting I'd like to alter your code to support 8 bit. I changed the buffer[3] to buffer[1] as you mentioned, but that didn't seem to work. Could you edit the code above to show how it would be applied to 8 bit images? Btw, massive improvements on the performance :D :D. – Niels Jul 16 '15 at 19:38
  • @Niels Updated with 8-bit indexed version. I didn't test this one as thoroughly, but it should work just fine (I've tested it with the usual parrot test image :)). Note that it needs an extra seek - it *might* be faster to read the whole palette right after the header instead. But it should still be much faster than using GDI+ and reading all of the file :) – Luaan Jul 16 '15 at 20:25
  • Thanks!!! I still only get empty colors, looks like something goes wrong here: if (type != "BM") return Color.Empty; Type always returns as "?P" instead of "BM". Any ideas? – Niels Jul 16 '15 at 20:36
  • @Niels That doesn't look familiar. Just convert the files using irfanview or something like that. – Luaan Jul 16 '15 at 20:47
  • Yeah, doesn't ring a bell.. I have uploaded an image here: http://1drv.ms/1O9lrG5 (The black is transparent actually). Any thoughts? – Niels Jul 16 '15 at 20:50
  • @Niels Ah, right, it's a PNG. That's a bit more complicated and compressed - I think you can't use it to read the data directly the way you can with BMPs. I'd just convert to BMP - it's going to be bigger, but the only other option is to keep all the bitmaps in memory, which is going to be much worse :) If all those images are the same size, each should be around 3 MiB, so it shouldn't be too bad (that's what they'd take in memory as well - no point in wasting RAM when you've got HDD space). – Luaan Jul 17 '15 at 05:05
  • I have tried to convert everything to BMP, but the images become 11 MB, and the response time is still 6 or 7 seconds. :(. – Niels Jul 17 '15 at 11:44
  • @Niels It's 11 MiB because you saved it as 24-bit - if you save it as 8-bit, it will only be around a third of that. You said it's an Azure application - do you have the files stored *locally* on the web server? The performance might be problematic if the data is being fetched across a network... at the very least, you could store the image headers in RAM (that's just a few kiB) and only read the single pixel - not to mention that you could then easily request the pixel from each of the bitmaps at once. – Luaan Jul 17 '15 at 12:32
  • Yes, I have them stored in the 'Content' folder of the web app - so they are close. How would I achieve saving this into RAM? – Niels Jul 17 '15 at 12:56
  • @Niels I'm not sure what kind of layout Azure ASP.NET applications have - it's a cloud after all. Since the response time seems to be the same for reading the whole file and reading just a couple of bytes, it's probably dominated by I/O latency. You can try keeping the whole data up to `offset` in memory for each file - just make a cache per filename, and store a `byte[]` in there. – Luaan Jul 17 '15 at 13:35
  • I was just wondering, as I'm still struggling with this - do you have any working samples to achieve this? Turning up the server works with the response times, but the big amount of Memory is still a problem. Help would be appreciated so much :)! – Niels Aug 16 '15 at 19:46
  • @Niels Well, I'm not really convinced it's going to help much - I assume that Azure uses some smart caching technology that takes care of that on some level; the only thing it can't fix is the initial latency, which is kind of inherent in cloud systems. However, there might be some configuration / extra services that would improve this - after all, the file storage is designed for file *storage* - the latency doesn't hurt there one bit. Maybe you could try storing the images in a key-value database instead? – Luaan Aug 16 '15 at 21:57
  • Yeah, performance is one thing – but the exceptional memory usage is an even bigger issue now. It doesn't seem logical to load the entire file just to get one pixel with Bitmap.GetPixel(). Saving them to read with a BinaryReader seems more approachable, or what you mentioned: keeping the whole data up to offset in memory for each file - a cache per filename, storing a byte[]. Not sure how to do this; this server-side image processing is not really my thing. – Niels Aug 17 '15 at 11:45
  • @Niels I've updated the answer with the code to cache the headers. This cuts down the number of `Read`s per image to one after the initial load. It might help. – Luaan Aug 17 '15 at 12:58

Those are pretty big images that take an appreciable time to read from disk, and a reasonable amount of RAM (which possibly might never be recovered during compaction - "large object heap fragmentation"). The internals of the Bitmap constructor just pass straight down to a native GDI+ method, so your delay is entirely I/O-based.

The fastest possible way of achieving your goal is to calculate the file-based offset of the pixel from your knowledge of the file format, open the file stream yourself, seek to the pixel, and read in only the required data. This is somewhat complex, but these file formats are very well documented and there may be an existing library you can use. The key thing here is not to parse the entire file into RAM.
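For a plain bottom-up 24-bit BMP, that offset arithmetic is just the following (a sketch; the method name is illustrative, and dataOffset, width, and height come from the 54-byte header):

```
// Byte offset of pixel (x, y) in an uncompressed, bottom-up 24bpp BMP.
static long PixelOffset(long dataOffset, int width, int height, int x, int y)
{
    int rowSize = ((width * 3) + 3) / 4 * 4;            // rows are padded to 4 bytes
    return dataOffset + (long)rowSize * (height - 1 - y) // rows stored last-row-first
                      + x * 3;                           // 3 bytes per pixel (BGR)
}
```

A FileStream.Seek to that offset followed by a 3-byte read yields the B, G, and R components without ever loading the image into RAM.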

PhillipH
  • I'm still struggling with the problem - do you have an example on how to do the thing you described? This is a sample image I'm trying to parse: https://onedrive.live.com/?id=E6862BEA2601E49F%21389461&cid=E6862BEA2601E49F&group=0&parId=root&authkey=%21AG_hYfB8nqJaoeI&o=OneUp I have around 10 requests per second and with that I'm getting lots of OutOfMemoryExceptions :(. – Niels Aug 16 '15 at 18:19
  • I looked briefly and found this example http://cplus.about.com/od/learnc/a/extract-data-from-image-in-csharp.htm which claims to have a Bitmap file reader within it, and gives an example of how to read bitmap pixel data from the file and writes it out to a CSV (just as an example). So you could just use this sample to read your Bitmap pixel. The key is it never reads the bitmap into RAM so should completely prevent OutOfMemory exceptions, and be very fast as well. – PhillipH Aug 18 '15 at 11:33