Find out the caller and the called service in a Linux machine

Question

I have a system which has multiple Java, C++ and Python services calling each other to get various tasks done. It is a nearly 20-year-old system that I have inherited.

Now, when I debug issues, I find it tough to identify which client is calling which service in which scenario. The problem is multiplied by the fact that this is a multi-layered system.

These services reside on the same Linux machine in the test environment but connect to each other using IPs and ports (since that is the way the multi-machine production environment is set up).

Is there a way for me to detect which service is being called, and by which client, in each scenario by using a tool such as a sniffer? If so, can someone help me understand whether any specific configuration needs to be done?

P.S.: I can look through the logs to find out this information. But trawling through 15-20 service logs to find these details and matching their various logging formats is not trivial. :(

Based on suggestions, I am adding more details: I have a use case where a user clicks on a button B1 and a Web service W1 is invoked. W1 might invoke one of the following services: a RESTful service R1, another RESTful service R2 or a SOAP service S1. Also, S1 can in turn invoke R1 in some of the use cases too.

Now, how do I find out which service was called and in which order?

1. Are you going to find out one or some specific services or all the services? 2. What kind of services are they, RESTful with HTTP, or some binary protocol? 3. What are you going to determine, just callers and callees, or everything including the service name, arguments, format, return values...? — nicky_zs, Jul 14 '14 at 02:13

nicky_zs · Answer 1 · 2014-07-14T16:25:47.777

I don't know if there is a "standard" way to do this. But if I were in this situation, I would do the following:

First of all, find out which processes are listening on TCP ports on the system:

netstat -tlnp
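# -t: TCP  -l: listening sockets only  -n: numeric addresses and ports  -p: owning PID and program name
# (run as root to see the PIDs of processes owned by other users)

You can also find out what they are, including their arguments, with ps -ef.

Then, monitor the network traffic of each of those processes: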
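To keep the port-to-process mapping in one place, the netstat output can be reduced to one line per listening port. This is only a minimal sketch: parse_listeners is a made-up helper name, and it assumes the usual net-tools column layout (local address in column 4, state in column 6, "PID/Program name" in column 7).

```shell
# parse_listeners (hypothetical helper): reads `netstat -tlnp` output on stdin
# and prints "port pid program" for every listening TCP socket.
# Assumes the net-tools column layout described above.
parse_listeners() {
  awk '$6 == "LISTEN" {
    n = split($4, addr, ":")   # the port is the last colon-separated piece
    split($7, owner, "/")      # "1234/java" -> pid 1234, program java
    print addr[n], owner[1], owner[2]
  }'
}

# Usage (as root, so that PIDs of other users are visible):
#   netstat -tlnp | parse_listeners
```

The PID in the second column can then be looked up in the ps -ef output to see the full command line of each service.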

tcpdump -i lo -Xn 'port 8888'
# suppose 8888 is the port on which some process is listening
# -i lo: since the services are on the same machine, all their traffic goes through the loopback interface
# -X: print packet contents in hex and ASCII  -n: do not resolve addresses to names

Then, click the button B1 and see which services the network data arrives at. That tells you which services W1 calls.

Note that the behavior may not be consistent when technologies such as load balancing or a reverse proxy are involved.
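With 15-20 services, watching one port at a time gets tedious. One possible approach is to watch all service ports in a single capture and print only the packets that open a connection, so that each new call shows up as one line in the order it happened. The sketch below is an assumption-laden example, not a recipe: build_filter is a hypothetical helper, ports 8888 and 8080 are placeholders, and the tcp[tcpflags] test assumes the services talk over IPv4.

```shell
# build_filter (hypothetical helper): turns a list of ports into a
# tcpdump filter expression such as "port 8888 or port 8080".
build_filter() {
  filter=""
  for port in "$@"; do
    if [ -z "$filter" ]; then
      filter="port $port"
    else
      filter="$filter or port $port"
    fi
  done
  printf '%s\n' "$filter"
}

# Usage (needs root; 8888 and 8080 are placeholder ports):
# show only SYN packets without ACK, i.e. one line per newly opened connection.
#   tcpdump -i lo -n "tcp[tcpflags] & (tcp-syn|tcp-ack) == tcp-syn and ($(build_filter 8888 8080))"
```

Each printed line has the form "client address.port > service address.port", so the destination port identifies the called service and the source port is what you need to find the caller. Clients that reuse pooled connections will not show up here, because they do not open a new connection for each call.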

It is also easy to find out which process is sending data to port 8888, because you can see the client's port in tcpdump. Say that port is 54321, then with:

netstat -tnp | grep 54321

you can find out which process is using the port 54321.
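Since both ends of the connection live on the same machine, the grep above matches two lines: the client's socket (54321 as the local port) and the service's side of the same connection (54321 as the foreign port). Here is a small sketch that keeps only the client side; owner_of_port is a hypothetical helper name and the column positions assume net-tools netstat.

```shell
# owner_of_port (hypothetical helper): reads `netstat -tnp` output on stdin
# and prints the "PID/Program name" column of the socket whose LOCAL port
# matches the argument, i.e. the process that opened the connection.
owner_of_port() {
  awk -v port="$1" '/^tcp/ {
    n = split($4, addr, ":")          # column 4 is the local address
    if (addr[n] == port) print $7     # column 7 is PID/Program name
  }'
}

# Usage (as root, so that PIDs of other users are visible):
#   netstat -tnp | owner_of_port 54321
```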

Of course, if your client is using a connection pool, the process will hold port 54321 for quite a long time, so the lookup is easy. If your client uses short-lived connections, you have to be quicker: after the connection is closed, the side that closed it first keeps the port in the TIME_WAIT state for about a minute, but netstat no longer shows a PID for a socket in that state, so run the netstat command while the connection is still open.
