2

i have a Camel process which pulls file from Azure container and process it in Azure environment.

I expect the process to run continuously, but it shuts down after random interval.

Logs:

CamelHangupInterceptor: INFO (MainSupport.java:87) - Received hang up - stopping the main instance.
CamelHangupInterceptor: DEBUG (Main.java:187) - Stopping Spring ApplicationContext: org.springframework.context.support.ClassPathXmlApplicationContext@47ef968d
2019-01-29 21:39:50,782: main: INFO (MainSupport.java:502) - MainSupport exiting code: 0

... closes all the routes after inflight exchange is compeleted since using DefaultShutdownStrategy.

Spring context route:

- Start with a scheduler for initial delay,
- then <dealy> component to randomly generate time (logic used for scalability to avoid race condition)
- invoking the custom class implementing Process class, which has Azure container url with credentials and fetch the file from container
- then using wireTap component to downloading the file
- finally invoking another class implementing Process class.

The Camel (v 2.20) process starts and executes as expected, but after a random interval process shuts down.

I see hangup signal received, but not sure how it happens. Logs shows graceful shutdown. Is there a way to send hangup signal to Camel process from external process?

in one of the route i am using to stop the exchange to stop the route forcibly.

 exchange.setProperty(Exchange.ROUTE_STOP, Boolean.TRUE);

More logs:

 2019-01-29 21:40:17,838: Camel Thread #0 - CamelHangupInterceptor: DEBUG (DefaultExecutorServiceManager.java:363) - Forcing shutdown of ExecutorService: org.apache.camel.util.concurrent.RejectableThreadPoolExecutor@647b9364[Running, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0][WireTap]
    ...
        2019-01-29 21:40:17,840: Camel Thread #0 - CamelHangupInterceptor: DEBUG (RouteService.java:289) - Shutting down services on route: route1
        2019-01-29 21:40:17,841: Camel Thread #0 - CamelHangupInterceptor: DEBUG (BeanComponent.java:72) - Clearing BeanInfo cache[size=1, hits=1, misses=1, evicted=0]
        2019-01-29 21:40:17,849: Camel Thread #0 - CamelHangupInterceptor: DEBUG (SimpleLanguage.java:136) - Clearing simple language predicate cache[size=0, hits=0, misses=0, evicted=0]
        2019-01-29 21:40:17,849: Camel Thread #0 - CamelHangupInterceptor: DEBUG (SimpleLanguage.java:142) - Clearing simple language expression cache[size=4, hits=1, misses=4, evicted=0]
        2019-01-29 21:40:17,849: Camel Thread #0 - CamelHangupInterceptor: DEBUG (DefaultManagementAgent.java:358) - Unregistered MBean with ObjectName: org.apache.camel:context=camelAzureBlobContext,type=routecontroller,name="camelAzureBlobContext"
        2019-01-29 21:40:17,849: Camel Thread #0 - CamelHangupInterceptor: DEBUG (DefaultManagementAgent.java:358) - Unregistered MBean with ObjectName: org.apache.camel:context=camelAzureBlobContext,type=health,name="camelAzureBlobContext"
        2019-01-29 21:40:17,850: Camel Thread #0 - CamelHangupInterceptor: DEBUG (TimerListenerManager.java:128) - Removed TimerListener: org.apache.camel.management.mbean.ManagedCamelContext@473692b
        2019-01-29 21:40:17,850: Camel Thread #0 - CamelHangupInterceptor: DEBUG (DefaultManagementAgent.java:358) - Unregistered MBean with ObjectName: org.apache.camel:context=camelAzureBlobContext,type=context,name="camelAzureBlobContext"
        2019-01-29 21:40:17,850: Camel Thread #0 - CamelHangupInterceptor: INFO (MainLifecycleStrategy.java:44) - CamelContext: camelAzureBlobContext has been shutdown, triggering shutdown of the JVM.
        2019-01-29 21:40:17,850: Camel Thread #0 - CamelHangupInterceptor: DEBUG (DefaultExecutorServiceManager.java:363) - Forcing shutdown of ExecutorService: org.apache.camel.util.concurrent.RejectableThreadPoolExecutor@72b68833[Running, pool size = 1, active threads = 0, queued tasks = 0, completed tasks = 1][ShutdownTask]
        2019-01-29 21:40:17,851: Camel Thread #0 - CamelHangupInterceptor: DEBUG (DefaultInflightRepository.java:183) - Shutting down with no inflight exchanges.
        2019-01-29 21:40:17,851: Camel Thread #0 - CamelHangupInterceptor: DEBUG (DefaultServicePool.java:110) - Stopping service pool: org.apache.camel.impl.SharedPollingConsumerServicePool@13ed066e
        2019-01-29 21:40:17,851: Camel Thread #0 - CamelHangupInterceptor: DEBUG (DefaultServicePool.java:110) - Stopping service pool: org.apache.camel.impl.SharedProducerServicePool@151ab2b9
    ...
    2019-01-29 21:40:17,856: Camel Thread #0 - CamelHangupInterceptor: DEBUG (MBeanInfoAssembler.java:79) - Clearing cache[size=30, hits=12, misses=30, evicted=0]
    2019-01-29 21:40:17,864: Camel Thread #0 - CamelHangupInterceptor: DEBUG (IntrospectionSupport.java:134) - Clearing cache[size=93, hits=192, misses=93, evicted=0]
    2019-01-29 21:40:17,869: Camel Thread #0 - CamelHangupInterceptor: INFO (DefaultCamelContext.java:3575) - Apache Camel 2.20.0 (CamelContext: camelAzureBlobContext) uptime 32 minutes

-- ROUTE INFORMATION--

<camel:camelContext id="camelAzureBlobContext" xmlns="http://camel.apache.org/schema/spring" typeConverterStatisticsEnabled="true" autoStartup="true">
    <endpoint id="listBlobendpoint"
        uri="azure-blob://storageaccount/containerName?credentials=containercredientiasl&amp;operation=listBlobs" /> <!-- changed the actual values -->

    <dataFormats>
        <json id="inputMsg" library="Jackson" unmarshalTypeName="pacakage.requiredinputpojojacksonclass" />  <!-- renamed the class name-->
    </dataFormats>
    <onException>
        <exception>com.fasterxml.jackson.core.JsonParseException</exception>
        <exception>com.fasterxml.jackson.databind.JsonMappingException</exception>
        <handled>
            <constant>true</constant>
        </handled>
        <process ref="parseExceptionResponse" />
    </onException>
    <route>
        <from uri="scheduler://tempScheduler?initialDelay=5000&amp;delay=50000" /> <!-- changed the actual values -->
        <setHeader headerName="BlobListingDetails">
            <simple resultType="com.microsoft.azure.storage.blob.BlobListingDetails">METADATA</simple>
        </setHeader>
        <delay>
            <method ref="blobcamelprocess" method="randomDelayToPoll"></method> <!-- Method which has some random number generation-->
        </delay>
        <to uri="ref:listBlobendpoint" /> <!-- bean which actually sets the metadata value-->
        <process ref="blobcamelprocess" />
        <!-- creating recipient list to update metadata of the container blob file -->
        <recipientList>
            <header>update_metadata</header>
        </recipientList>
        <log message="Message | $simple{in.header[filename]}" loggingLevel="INFO"></log>
        <wireTap uri="file://location?fileName=$simple{in.header[filename]}"/>
        <unmarshal ref="inputMsg" />
        <process ref="messageconversionprocess" /> <!-- bean which actually converts uses the parsed json to construct java object-->
        <process ref="deleteblobProcess" /> <!-- bean that will be used to delete the file from the blob store -->
        <recipientList>
            <header>delete_blob</header> <!-- endpoint details is set from the above bean and passed here -->
        </recipientList>
    </route>    
</camel:camelContext>
Tim
  • 1,321
  • 1
  • 22
  • 47

1 Answers1

1

As per the logs, your whole Camel context is shutting down. Its quite unlikely that the context shutdown is initiated by your ROUTE_STOP call.

Looking at MainSupport.java source, the INFO log that you see in your log files is executed as a part of hang up notification hook, notified by a JVM Shutdown in progress. When this happens, does your JVM die as well? This documentation says hooks are notified when a JVM Abnormal termination is initiated by SIGINT, SIGTERM or SIGHUP

Perhaps we can look into further details if you could provide answers to the following

  • Is Camel running in a web container or something else? I see MainSupport.java :)
  • What JVM version do you use?
  • What OS platform is it running on?
  • When Camel context shuts down, is the JVM still running or JVM exits too?
  • Do you see additional exceptions in the log files?
  • Can we correlate this CamelContext shutdown to a system resource exhaustion event? (like running short of memory or something like that). Quite unlikely to be the OOMKiller since its is a graceful shutdown.

On a side note, don't be misled by the name CamelHangupInterceptor it doesn't mean Camel received a SIGHUP.

ShellDragon
  • 1,712
  • 2
  • 12
  • 24
  • JRE (SE) version 1.8.0_131 (hotspot build 25.131-b11) 64 bit Camel process is running on Azure Vm (centos 7) Yes Camel context shuts down the JVM too. – Tim Jan 30 '19 at 22:06
  • So there are no traces of OutOfMemoryException or anything of that sort. How much heap do you allocate to the `jvm`? How much memory does the Azure VM has? When ever this event happens, do you see anything related in the OS event logs (inspect with `journalctl` or equivalents)? On " Camel context shuts down the JVM too", I have a strong feeling its the other way around. Perhaps its worth posting your RouteBuilder(s) as well. – ShellDragon Jan 31 '19 at 05:27
  • Its quite far fetched but is there even a remote possibility that your application receives a JMX shutdown from somewhere? Try starting the camel context with `context.disableJMX()`. How does this application start on the Centos box? Is it starting up as a `systemd` service? – ShellDragon Jan 31 '19 at 05:41
  • I referred this post on Camel project's gitter channel so that someone with deeper knowledge of the framework can have a look too – ShellDragon Jan 31 '19 at 07:50
  • One of the project members(@zregvart) confirmed on gitter channel that your context shut down is indeed caused by a JVM shutdown, affirming my initial analysis. You need to find out what is killing your JVM. Perhaps there is some trace in your OS logs. [Here is some documentation](https://docs.oracle.com/javase/8/docs/api/java/lang/Runtime.html#addShutdownHook-java.lang.Thread-) on the shutdown hook. Hope this helps you get ahead. – ShellDragon Jan 31 '19 at 08:34
  • i will investigate the JVM shutdown hook. – Tim Jan 31 '19 at 18:20
  • All the best with further investigations. Let us know your findings :) – ShellDragon Feb 01 '19 at 08:03
  • i used strace command with the pid, i see a SIGTERM signal from some process, not sure where it is being sent. 21:30:03.658710 futex(0x7f5f352249d0, FUTEX_WAIT, 73894, NULL) = ? ERESTARTSYS (To be restarted if SA_RESTART is set) 21:46:37.748158 --- SIGTERM {si_signo=SIGTERM, si_code=SI_USER, si_pid=1117, si_uid=30001} --- 21:46:37.748296 futex(0x7f5f345e2700, FUTEX_WAKE_PRIVATE, 1) = 1 21:46:37.748516 rt_sigreturn({mask=[]}) = 202 21:46:37.748690 futex(0x7f5f352249d0, FUTEX_WAIT, 73894, NULL 21:46:38.191516 +++ exited with 143 +++ – Tim Feb 19 '19 at 00:05
  • How does this process start and run? Is it started as a daemon by systemd or similar? Are there any shut down mechanisms in other parts of the code that could send a SIGTERM? – ShellDragon Feb 19 '19 at 12:13
  • The information from your log, as `siginfo` struct may be useful. `si_pid=1117, si_uid=30001`. 1117 is the process of that sent the signal and it was running under the authority of user 30001. – ShellDragon Feb 19 '19 at 13:11
  • The camel process is invoked using shell script set up, a cron job will check to see if the process is running or not, if not the camel process will be invoked. The strace output has the pid which is not active, have to check the /var/log/messages file if the history of the pid is stored. Also this VM node is setup using chef, need to check if any chef configuration is sending this signal. – Tim Feb 19 '19 at 15:33
  • Good to know. Isn't it a better idea to offload process life cycle management to `systemd` than to `cron`? Well you may have reasons not to do it. If you are looking at systemd, [This is a good start](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/system_administrators_guide/sect-managing_services_with_systemd-unit_files) – ShellDragon Feb 19 '19 at 15:48
  • 1
    In my case the process was killed by OS, since the VM (centos 7.2) is provisioned using Chef, the chef client which running as daemon thread often kills the running process since stop shell was invoked if chef tag is missing. Few of the debug steps, i used $ strace -tt -o outfilename.out -p & $ service --status-all $ jstat -l $ jmap => analysed dump using eclipse heap analyzer – Tim Feb 26 '19 at 20:09
  • Kudos. Finally you've nailed it. – ShellDragon Feb 27 '19 at 07:12