2

I am using the Apache Camel File component to read from a local directory and upload to an AWS S3 bucket. This route has worked flawlessly in the past, but is having issues with certain files now-a-days.

In my debug investigations thus far (3 days of misery), I have found that the route reaches the DelegateSyncProcessor class and does not have any exception set on the exchange (for reference see the relevant code of the previously mentioned class below). Considering there aren't any exceptions, I can't seem to figure out why the onCompletion() methods never fire for specific files. There weren't/aren't any exceptions for problematic files; however, the file and the lock continue to exist after all the logic has executed (including the final transfer of the message to the .to() endpoint) hinting that there is some internal Camel issue. I suspect this because the route behaves normally in every aspect except the file and lock deletion.

After enabling debug logs for Camel, I was frustrated to see that there were no logs related to an error during the route execution. I would love to hear any advice about what could be happening under the hood.

A few extra things to note:

  • I am running camel 2.16.0
  • No easily detectable issues were seen while debugging through the camel code.
  • The logic in the .process() sections runs all the way through (some exceptions occurred, but they are properly handled.

Update:

More debugging with debug logs enabled for camel yielded no results. I have only found that the logic within my .process() code is the issue. When the code takes too long to execute (even without exceptions), the file deletion fails.


Update 2:

I found where the camel route actually falls apart. The CamelEventLogger.java class completely fails while trying to "logEvent". The related code to that class is added below. When the code reaches matcher.find(), it times out.

MyRouteClass.java:

from(importProcessingEndpoint)
        .convertBodyTo(byte[].class)
        .process((exchange) -> {
           // some logic here
        })
        .to(outgoingEndpoint)
        .threads(MAX_NUMBER_OF_CAMEL_THREADS)
        .process((exchange) -> {
           // some logic here
        })
        .log("Finished processing import file.")
        .to(outgoingEndpoint);

DelegateSyncProcessor.java:

@Override
public boolean process(Exchange exchange, AsyncCallback callback) {
    // force calling the sync method
    try {
        processor.process(exchange);
    } catch (Throwable e) {
        // must catch throwable so we catch all
        exchange.setException(e);
    } finally {
        // we are bridging a sync processor as async so callback with true
        callback.done(true);
    }
    return true;
}

CamelEventLogger.java

private void logEvent(final String label, final Exchange exchange, final Endpoint endpoint,
            final long elapsedTime, final boolean logTID) {
        Matcher matcher = PATTERN.matcher(extractMessage(exchange));
        if (matcher.find()) {
            //CHECKSTYLE:OFF
            String evtType = matcher.group(1);
            String evtName = matcher.group(2);
            String guid = matcher.group(3);
            String tid = matcher.group(4);
            //CHECKSTYLE:ON
            if (tid == null || !logTID) {
                tid = "";
            } else {
                tid = "intuit_tid=" + tid;
            }
            log.info(LOG_FORMAT, label, exchange.getExchangeId(), guid, evtName, evtType, endpoint, elapsedTime, tid);
        } else { // the message is not parseable, fall back to minimum log entry
            log.info("Event {} id={} {} elapsedTimeMs={}", label, exchange.getExchangeId(), endpoint, elapsedTime);
        }
    }
Chad Van De Hey
  • 2,716
  • 3
  • 29
  • 46
  • 1. Have you found anything suspect about the problematic files? are they particularly large, do they all share some characteristic the ok files do not have? and 2. Have you tried debugging into the DelegateSyncProcessor class itself? and 3. What version of camel are you running? – Roddy of the Frozen Peas Mar 25 '19 at 22:51
  • For S&G can you add an `onException()` handler, and break inside the processor if it ever hits? – Not a JD Mar 25 '19 at 22:54
  • @RoddyoftheFrozenPeas, I added more info to address your questions. I would also like to point out that I've stepped through with a debugger and didn't find anything relevant. – Chad Van De Hey Mar 25 '19 at 23:21
  • @NotaJD, I am not sure if I completely understand what you are trying to say. I tried to use `onException()` defined at the top of my route to see if any exception were being thrown and the debugger never broke (after I set a break point on the added onException code). This made me extra suspicious that there wasn't an exception at all (nothing logged and nothing discovered when debugging). – Chad Van De Hey Mar 25 '19 at 23:29
  • 2
    @ChadVanDeHey that's what I was getting at. I definitely feel your pain, Camel has been a cause of much unmerriment for me, also. This will sound ridiculous but it's worth a shot: when you remove the processors and just have it run through the route, does the file get moved and lock file deleted? – Not a JD Mar 25 '19 at 23:31
  • @NotaJD, my mind has exploded. After removing the `.process()` code, the route effectively deletes the lock and file. This helps, but does not point out an answer. As I explained above, the `.process()` code executes all the way through. This simply means that Camel doesn't like something that is occurring in that block, but it doesn't seem exception related (from a debugging POV). – Chad Van De Hey Mar 25 '19 at 23:43
  • Next thing to do is reintroduce the code line by line / block by block until it breaks! I think you'll have this solved in a jiffy :) – Not a JD Mar 25 '19 at 23:45
  • @NotaJD, good advice on the debugging technique. Painful it will be. Do you know of any potential limitations of Camel's processing time? The one pattern I have noticed is that files that take longer to process are more likely to fail. – Chad Van De Hey Mar 25 '19 at 23:49
  • @ChadVanDeHey my (limited) understanding is that if your route is synchronous, Camel won't abort it until it is done or an exception is thrown... Anecdotally the larger files might be causing an issue if, say, an OOME is thrown under the hood. Actually - I wonder if an Error is being thrown vs. an exception, and it's getting swallowed? – Not a JD Mar 25 '19 at 23:58
  • @NotaJD, have you ever seen Camel time out within the `logEvent()` method in the CamelEventLogger class? – Chad Van De Hey Mar 26 '19 at 23:10
  • @ChadVanDeHey I have not :( But, double check that class name... I don't have it locally: CamelEventLogger – Not a JD Mar 26 '19 at 23:34

1 Answers1

0

I was initially surprised that this wasn't a common problem with Camel, but after fixing it, I've come to realize that it is definitely a user error (albeit, an interesting one).

Having said that the problem is uncommon, Ill give my solution (hopefully it has some value to someone).

As mentioned above, my camel route has hanging and failing to delete the locking file and the message (which was a file in this case). FOR THE PROBLEMATIC FILE MESSAGES (I want to highlight this because it was an intermittent issue), the debugger seemed to bring me down a road to where the thread started to hang. Here I realized that I was brought to a custom Camel Logging class. The logging class would fail based on different inputs.

How would my situation be helpful to anyone else? I'll describe a strategy for debugging when you get to situations like these.

If your camel route is behaving similarly, find a use case in which the route hangs, pause your debugger (after it has started hanging) and take a thread dump. It is within this thread dump and stage within the paused debugger where you can back track a bit. For me, I noticed that 10 or so frames/steps ago, I reached an odd class. I set a debug point there and could see that it was problematic.

Chad Van De Hey
  • 2,716
  • 3
  • 29
  • 46