
After many hours of debugging and analysis, I have finally managed to isolate the cause of a race condition. Solving it is another matter!

To see the race condition in action, I recorded a video partway through the debugging process. I have since improved my understanding of the situation, so please forgive the poor commentary and the silly mechanisms implemented as part of the debugging process.

http://screencast.com/t/aTAk1NOVanjR

So, the situation: we have a double-buffered implementation of a surface (i.e. java.awt.Frame or Window). A dedicated thread loops continuously: it invokes the render process (which performs UI layout and renders it to the backbuffer) and then, post-render, blits the rendered area from the backbuffer to the screen.

Here's the pseudo-code version (full version at line 824 of Surface.java) of the double-buffered render:

public RenderedRegions render() {
    // pseudo code
    RenderedRegions r = super.render();
    if (r == null) // nothing rendered
        return null;
    for (region in r)
        establish max bounds   // union of all rendered regions
    blit(max bounds)           // copy that area from backbuffer to screen
    return r;
}

As with any AWT surface implementation, it also implements the paint/update overrides, which also blit from the backbuffer to the screen (line 507 in AWT.java; link limit :( so use the Surface.java link and replace core/Surface.java with plat/AWT.java):

        public void paint(Graphics gr) {
            Rectangle r = gr.getClipBounds();
            refreshFromBackbuffer(r.x - leftInset, r.y - topInset, r.width, r.height);
        }

Blitting is implemented (line 371 in AWT.java) using the drawImage() function:

    /** synchronized as otherwise it is possible to blit before images have been rendered to the backbuffer */
    public synchronized void blit(PixelBuffer s, int sx, int sy, int dx, int dy, int dx2, int dy2) {
        discoverInsets();
        try {
            window.getGraphics().drawImage(((AWTPixelBuffer)s).i,
                              dx + leftInset, dy + topInset,     // destination topleft corner
                              dx2 + leftInset, dy2 + topInset,   // destination bottomright corner
                              sx, sy,                            // source topleft corner
                              sx + (dx2 - dx), sy + (dy2 - dy),  // source bottomright corner
                              null);
        } catch (NullPointerException npe) { /* FIXME: handle this gracefully */ }
    }

(Warning: this is where I start making assumptions!)

The problem here seems to be that drawImage is asynchronous, so a blit from refreshFromBackbuffer() via paint/update is called first but actually reaches the screen second, after the newer blit from the render thread.

So... blit is already synchronized. The obvious way of preventing the race condition doesn't work. :(
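One thing that may be worth ruling out, as an assumption rather than a diagnosis: `synchronized` on blit() only excludes other callers that lock the same object. If the render thread writes to the backbuffer without holding that monitor, a paint-triggered blit can still copy a half-rendered buffer. The sketch below uses a hypothetical `LockedBackbuffer` class (not part of Vexi) in which rendering and blitting share one lock. It does not address any asynchrony inside drawImage() itself.

```java
import java.awt.Color;
import java.awt.Graphics;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.util.function.Consumer;

/** Hypothetical wrapper: backbuffer writes and blits share one lock. */
final class LockedBackbuffer {
    private final Object lock = new Object();
    private final BufferedImage image;

    LockedBackbuffer(int width, int height) {
        image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
    }

    /** Render thread: draw into the backbuffer while holding the lock. */
    void render(Consumer<Graphics2D> painter) {
        synchronized (lock) {
            Graphics2D g = image.createGraphics();
            try {
                painter.accept(g);
            } finally {
                g.dispose();
            }
        }
    }

    /** Any thread: copy a region to the target under the same lock. */
    void blit(Graphics target, int x, int y, int w, int h) {
        synchronized (lock) {
            target.drawImage(image,
                    x, y, x + w, y + h,   // destination rectangle
                    x, y, x + w, y + h,   // source rectangle
                    null);
        }
    }
}
```

With this arrangement a blit issued from paint() waits until the current render pass has released the lock, so it can never observe a partially drawn backbuffer.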

So far I have come up with two solutions, but neither is ideal:

  1. re-blit on the next render pass
    cons: performance hit, still get a bit of flicker when encountering the race condition (valid screen -> invalid screen -> valid screen)

  2. do not blit on paint/update, instead set refresh bounds and use those bounds on next render pass
    cons: get black flicker when the screen is invalidated and the main application thread is catching up

Here (1) seems to be the lesser of two evils. Edit: and (2) doesn't work, getting blank screens... (1) works fine but is just masking the problem which is potentially still there.
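For reference, the bookkeeping behind option (2) could look roughly like the sketch below. `PendingRefresh` is a hypothetical helper, not code from Vexi: paint/update would call `add()` with the clip bounds, and the render thread would call `take()` once per pass and blit the returned area. It only illustrates the hand-off between threads and does nothing about the blank screens reported above.

```java
import java.awt.Rectangle;

/** Hypothetical helper: accumulates refresh bounds requested by paint/update. */
final class PendingRefresh {
    private Rectangle bounds; // null means nothing is pending

    /** Paint thread: record an area that needs refreshing. */
    synchronized void add(Rectangle r) {
        bounds = (bounds == null) ? new Rectangle(r) : bounds.union(r);
    }

    /** Render thread: fetch and clear the pending area, or null if none. */
    synchronized Rectangle take() {
        Rectangle result = bounds;
        bounds = null;
        return result;
    }
}
```

The render loop would then union the result of `take()` with its own max bounds before the single blit at the end of the pass.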

What I'm hoping for, and seem unable to conjure up due to my weak understanding of synchronized and how to use it, is a locking mechanism that somehow accounts for the asynchronous nature of drawImage().

Or perhaps use ImageObserver?
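A caveat on ImageObserver, offered as background rather than a fix: the observer is notified about the image's pixel data becoming available (for example while a file is still loading), not about a blit reaching the screen. For an in-memory image such as a BufferedImage, drawImage() should normally return true straight away and never call the observer. A minimal sketch of an observer that waits for completion, using a hypothetical `CompletionObserver` class, might be:

```java
import java.awt.Image;
import java.awt.image.ImageObserver;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical observer: signals once the image is complete or has failed. */
final class CompletionObserver implements ImageObserver {
    private final CountDownLatch done = new CountDownLatch(1);

    @Override
    public boolean imageUpdate(Image img, int flags, int x, int y, int w, int h) {
        if ((flags & (ALLBITS | ERROR | ABORT)) != 0) {
            done.countDown();
            return false; // no further updates wanted
        }
        return true; // keep receiving updates
    }

    boolean isDone() {
        return done.getCount() == 0;
    }

    /** Blocks up to the given time; returns true if the image completed. */
    boolean await(long millis) throws InterruptedException {
        return done.await(millis, TimeUnit.MILLISECONDS);
    }
}
```

A caller would pass the observer as the last argument of drawImage() and only wait on it when drawImage() returns false.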

Note that due to the nature of the application (Vexi, for those interested, website is out of date and I can only use 2 hyperlinks) the render thread must be outside of paint/update - it has a single-threaded script model and the layout process (a sub-process of render) invokes script.

Charles Goodwin

2 Answers


Update: good approach here: AWT custom rendering - capture smooth resizes and eliminate resize flicker


The answer here was to remove all blitting from the paint() thread i.e. only ever refresh from the backbuffer in the program thread. This is the opposite to the answer as suggested by Jochen Bedersdorfer, but his answer was never going to work for us because the program has its own scripting model that is integrated with the layout model which drives rendering, thus it all has to happen sequentially.

(Speculation) Some of the problems stem from less-than-stellar multi-monitor support in Java with accelerated graphics chipsets, as I ran into this problem again when adapting to use BufferStrategy, where it turned out to be a Direct3D+Java discrepancy.

Essentially paint() and update() are overridden as empty methods, which blocks AWT's own repainting. This works a lot better but has one drawback: there is no smooth resizing.

private class InnerFrame extends Frame {
    // intentionally empty: all blitting happens in the program thread
    public void update(Graphics g) { }
    public void paint(Graphics g) { }
    ....
}

I ended up using a buffer strategy, although I'm not 100% satisfied with this approach: it seems inefficient to render to an image, copy the full image to the BufferStrategy and then perform a show() to screen.
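The copy-then-show step described here can be sketched as below. `StrategyPresenter` and `present()` are hypothetical names, and the retry loops follow the pattern shown in the BufferStrategy Javadoc rather than the actual Vexi code.

```java
import java.awt.Graphics;
import java.awt.image.BufferStrategy;
import java.awt.image.BufferedImage;

/** Hypothetical helper: pushes a rendered image through a BufferStrategy. */
final class StrategyPresenter {
    static void present(BufferStrategy strategy, BufferedImage backbuffer) {
        do {
            do {
                // A fresh Graphics is needed on every attempt
                Graphics g = strategy.getDrawGraphics();
                try {
                    g.drawImage(backbuffer, 0, 0, null); // full-image copy
                } finally {
                    g.dispose();
                }
                // Redraw if the drawing buffer was restored (and cleared)
            } while (strategy.contentsRestored());
            strategy.show();
            // Repeat if the buffer was lost before it could be shown
        } while (strategy.contentsLost());
    }
}
```

In a real window the strategy would come from something like `frame.createBufferStrategy(2)` followed by `frame.getBufferStrategy()`, called once the frame is displayable.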

I also implemented a Swing-based alternative, but again I don't particularly like that. It uses a JLabel with an ImageIcon, whereby the program thread (not the EDT) draws to the Image that is wrapped by the ImageIcon.
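A minimal sketch of that Swing arrangement, assuming a hypothetical `ImageLabelSurface` class: the program thread draws into the image and then requests a repaint of the label. Nothing here stops the EDT from painting the icon while the image is mid-update, so a real version would probably need some locking or a second image.

```java
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.util.function.Consumer;
import javax.swing.ImageIcon;
import javax.swing.JLabel;

/** Hypothetical surface: a JLabel showing an image drawn off the EDT. */
final class ImageLabelSurface {
    private final BufferedImage image;
    private final JLabel label;

    ImageLabelSurface(int width, int height) {
        image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        label = new JLabel(new ImageIcon(image));
    }

    JLabel getLabel() {
        return label;
    }

    BufferedImage getImage() {
        return image;
    }

    /** Program thread: draw into the image, then ask Swing to repaint. */
    void draw(Consumer<Graphics2D> painter) {
        Graphics2D g = image.createGraphics();
        try {
            painter.accept(g);
        } finally {
            g.dispose();
        }
        label.repaint(); // only queues a paint request for the EDT
    }
}
```

The label returned by `getLabel()` would be added to a JFrame's content pane like any other component.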

I'm sure there's a follow up question for me to ask when I have more time to look into this with more purpose, but for now I have two working implementations that more or less address the initial woes as posted here - and I learnt a helluva lot discovering them.

Charles Goodwin

Not sure, but what happens if you Blit in the AWT paint thread?

Jochen Bedersdorfer
  • I assume you mean 'what' happens - the problem still happens. I just tried copying the blit code into the paint method and it behaves the same. (FWIW it made no real difference because paint already called blit indirectly via refreshFromBackBuffer, but for sake of thoroughness I tried what you suggested.) – Charles Goodwin Feb 10 '11 at 01:50
  • What puzzles me is that you say drawImage is asynchronous. It would only be async if the code is not called in the AWT Event Handler thread (i.e. paint thread). And if paint indirectly calls Blit, it should pass along the Graphics object – Jochen Bedersdorfer Feb 10 '11 at 02:20
  • I am assuming drawImage is asynchronous (it definitely is in some situations, but not all? Not sure). I will modify paint() to pass on the graphics object and see what impact that has. The blit is occurring in 2 threads - the main application thread and the paint thread. Another assumption is that I am correctly diagnosing the problem. I could be wrong and my post to stackoverflow was less to get an answer and more to get insight into how to properly manage blitting from 2 separate threads. – Charles Goodwin Feb 10 '11 at 13:49
  • Using the Graphics supplied by paint made no difference. (Theoretically it's the same as frame.getGraphics() anyway.) Next up is a trial with ImageObserver. – Charles Goodwin Feb 10 '11 at 15:59
  • Not sure why the blit is occurring in two threads. It should do it only in the AWT event thread. You fill the back buffer in your 'animation' thread and leave the blitting onto the real graphics surface to the AWT event thread. You might want to synchronize filling the backbuffer/blitting to the real graphics context. – Jochen Bedersdorfer Feb 10 '11 at 16:08
  • So... I've spent a bit of time refactoring it so that all painting is done in the Event Dispatch Thread (i.e. via paint). The problem is still there! And still a race condition! It's getting to the point that I'm wondering if it is a Win7-64bit Java issue. I haven't given up all hope quite yet (hoping that I'm missing something or have made a daft error) but I'm not far off. ;-) – Charles Goodwin Feb 14 '11 at 03:22
  • Sorry to hear that. Maybe the race condition is someplace else.:( – Jochen Bedersdorfer Feb 14 '11 at 16:37
  • I will post a follow up question which is a bit more refined, see if it gets in front of the right pair of eyes. Thanks for your help! – Charles Goodwin Feb 16 '11 at 02:22