
Environment
Oracle Solaris 11 for SPARC
Running in a Non-primary (Guest) Logical Domain (LDOM).
Logged in with root access.

Problem
My application uses libpcap to capture network traffic. When my application (myTestApp) calls `pcap_findalldevs()`, it sees only one network interface ("lo0"), yet `ifconfig -a` shows several more.

My application is statically linked to libpcap (version 1.3). The build machine is SunOS RS-T5120-01 5.10 Generic_141444-09 sun4v sparc SUNW,SPARC-Enterprise-T5120.

Any ideas why my application can't see all the network interfaces?

Command-line sample output

# tcpdump --version

tcpdump version 4.1.1
libpcap version 1.1.1

# uname -a
SunOS g99dnpi802-LD 5.11 11.1 sun4v sparc sun4v


# ./myTestApp -adapters

[Available Adapters]
name: "lo0", description: "", address: 127.0.0.1, mask: 255.0.0.0


# ifconfig -a

lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
net0: flags=100001000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,PHYSRUNNING> mtu 1500 index 2
        inet 10.99.220.15 netmask ffffff00 broadcast 10.99.220.255
        ether 0:14:4f:fa:e0:8d
net1: flags=100001000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,PHYSRUNNING> mtu 1500 index 3
        inet 10.99.193.210 netmask ffffff80 broadcast 10.99.193.255
        ether 0:14:4f:f9:d0:9c
lo0: flags=2002000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv6,VIRTUAL> mtu 8252 index 1
        inet6 ::1/128
net0: flags=120002000840<RUNNING,MULTICAST,IPv6,PHYSRUNNING> mtu 1500 index 2
        inet6 ::/0
        ether 0:14:4f:fa:e0:8d
net1: flags=120002000840<RUNNING,MULTICAST,IPv6,PHYSRUNNING> mtu 1500 index 3
        inet6 ::/0
        ether 0:14:4f:f9:d0:9c

# tcpdump -i net1

    tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
    listening on net1, link-type EN10MB (Ethernet), capture size 65535 bytes
    09:32:29.520815 IP g99dnpi802-LD.ssh > 10.99.8.102.65436: Flags [P.], seq 3397909586:3397909718, ack 1479093081, win 64240, length 132
    09:32:29.520860 IP g99dnpi802-LD.ssh > 10.99.8.102.65436: Flags [P.], seq 132:232, ack 1, win 64240, length 100
    09:32:29.521644 IP 10.99.8.102.65436 > g99dnpi802-LD.ssh: Flags [.], ack 132, win 16379, length 0
    09:32:29.680844 00:14:4f:f9:8d:84 (oui Unknown) > Broadcast, ethertype Unknown (0xcafe), length 90:
            0x0000:  0500 ad85 0939 ffff 0001 ffff 809c 7401  .....9........t.
            0x0010:  0000 004c 0000 0000 8070 00ab 0000 0000  ...L.....p......
            0x0020:  0000 0000 0000 0000 0043 ffff 2074 6167  .........C...tag
            0x0030:  6d61 7374 0672 0014 4ff9 8d84 5f31 3362  mast.r..O..._13b
            0x0040:  650a 0000 0000 0000 84f9 0aab            e...........  

[update]

Here is the (edited) output of running the following truss command on the build machine and the customer machine.

truss -f -a -vall -l -d -o truss.txt ./myTestApp -adapters

truss on build machine

14365/1:     0.0751 so_socket(PF_INET, SOCK_DGRAM, IPPROTO_IP, "", SOV_DEFAULT) = 3
14365/1:     0.0753 so_socket(PF_INET6, SOCK_DGRAM, IPPROTO_IP, "", SOV_DEFAULT) = 4
14365/1:     0.0755 ioctl(3, SIOCGLIFNUM, 0xFFBE9F50)       = 0
14365/1:     0.0757 ioctl(3, SIOCGLIFCONF, 0xFFBE9F40)      = 0
14365/1:     0.0804 ioctl(3, SIOCGLIFFLAGS, 0xFFBE9DC8)     = 0
14365/1:     0.0806 ioctl(3, SIOCGLIFNETMASK, 0xFFBE9C50)       = 0
14365/1:     0.0809 open64("/dev/lo", O_RDWR)           Err#2 ENOENT
14365/1:     0.0811 open64("/dev/lo0", O_RDWR)          Err#2 ENOENT
14365/1:     0.0813 ioctl(3, SIOCGLIFFLAGS, 0xFFBE9DC8)     = 0
14365/1:     0.0815 ioctl(3, SIOCGLIFNETMASK, 0xFFBE9C50)       = 0
14365/1:     0.0817 ioctl(3, SIOCGLIFBRDADDR, 0xFFBE9AD8)       = 0
14365/1:     0.0819 open64("/dev/e1000g", O_RDWR)           = 5

truss on customer machine

6346/1:          0.0315 so_socket(PF_INET, SOCK_DGRAM, IPPROTO_IP, 0, SOV_DEFAULT) = 3
6346/1:          0.0319 so_socket(PF_INET6, SOCK_DGRAM, IPPROTO_IP, 0, SOV_DEFAULT) = 5
6346/1:          0.0320 ioctl(3, SIOCGLIFNUM, 0xFFBEA830)               = 0
6346/1:          0.0321 ioctl(3, SIOCGLIFCONF, 0xFFBEA820)              = 0
6346/1:          0.0322 ioctl(3, SIOCGLIFFLAGS, 0xFFBEA6A8)             = 0
6346/1:          0.0323 ioctl(3, SIOCGLIFNETMASK, 0xFFBEA530)           = 0
6346/1:          0.0327 open64("/dev/lo", O_RDWR)                       Err#2 ENOENT
6346/1:          0.0328 open64("/dev/lo0", O_RDWR)                      = 6
6346/1:          0.0345 ioctl(3, SIOCGLIFFLAGS, 0xFFBEA6A8)             = 0
6346/1:          0.0346 ioctl(3, SIOCGLIFNETMASK, 0xFFBEA530)           = 0
6346/1:          0.0347 ioctl(3, SIOCGLIFBRDADDR, 0xFFBEA3B8)           = 0
6346/1:          0.0347 open64("/dev/net", O_RDWR)                      Err#21 EISDIR
6346/1:          0.0349 ioctl(3, SIOCGLIFFLAGS, 0xFFBEA6A8)             = 0
6346/1:          0.0349 ioctl(3, SIOCGLIFNETMASK, 0xFFBEA530)           = 0
6346/1:          0.0350 ioctl(3, SIOCGLIFBRDADDR, 0xFFBEA3B8)           = 0
6346/1:          0.0351 open64("/dev/net", O_RDWR)                      Err#21 EISDIR
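
One possible reading of the two traces, offered as an assumption rather than a confirmed diagnosis: the library appears to strip the trailing unit number from each interface name and open `/dev/<driver>`. That yields `/dev/e1000g` for `e1000g0` on the build machine, but `/dev/net` for `net0` and `net1` on the customer machine, where `/dev/net` is a directory (hence `EISDIR`). The C sketch below only illustrates that name-to-path derivation; `driver_dev_path` is a hypothetical helper written for this question, not a libpcap function.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (NOT part of libpcap): split an interface name such
 * as "e1000g0" into a driver name ("e1000g") and a unit number (0), and
 * build the "/dev/<driver>" path that shows up in the truss output.
 *
 * Returns 0 on success, -1 if the name has no trailing unit number, is
 * made up of digits only, or the path does not fit in the buffer.
 */
int driver_dev_path(const char *ifname, char *path, size_t pathlen, int *unit)
{
    assert(ifname != NULL && path != NULL && unit != NULL);

    size_t len = strlen(ifname);
    size_t prefix = len;

    /* Walk backwards over the trailing decimal digits (the unit number). */
    while (prefix > 0 && isdigit((unsigned char)ifname[prefix - 1]))
        prefix--;

    if (prefix == len || prefix == 0)
        return -1;

    /* Convert the trailing digits to an integer (no overflow check). */
    int u = 0;
    for (size_t i = prefix; i < len; i++)
        u = u * 10 + (ifname[i] - '0');
    *unit = u;

    /* Build "/dev/" followed by the name without its unit number. */
    int n = snprintf(path, pathlen, "/dev/%.*s", (int)prefix, ifname);
    if (n < 0 || (size_t)n >= pathlen)
        return -1;

    return 0;
}
```

Calling it with `"net1"` produces the path `/dev/net` and unit `1`, which matches the failing `open64` calls in the customer trace; with `"e1000g0"` it produces `/dev/e1000g`, which matches the successful call on the build machine.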
reggie
  • What happens if you download the latest version of libpcap, try building it, doing "make tests", and running the "findalldevstest" program in the libpcap tree? –  Mar 16 '16 at 18:28
  • Unfortunately, this is a customer's machine that we can't touch. I updated the question to show that tcpdump is able to capture on net1 interface. – reggie Mar 16 '16 at 20:48
  • Where are you building your application? Does it show all devices there? – Andrew Henle Mar 16 '16 at 20:56
  • Updated question with info about build machine. On the build machine, "ifconfig -a" shows interfaces lo0 (which is a LOOPBACK) and e1000g0. My app shows e1000g0. – reggie Mar 16 '16 at 21:36
  • "My application" ... "Unfortunately, this is a customer's machine that we can't touch." Apparently you're able to touch it to the extent of being able to put your application on it. Are you not able to touch it to the extent of being able to download a tarball, unpack it, run "configure", run "make", run "make tests", and run "./findalldevstest"? (Note that I did not say "make install" anywhere.) –  Mar 17 '16 at 06:13
  • @GuyHarris Building on a customer machine that differs from development and test machines produces a binary that is different from one that went through an actual development and test cycle. That's one reason why Linux installs tend to be less reliable than Solaris installs. Nevermind the fact that a customer *LDOM* probably doesn't have the tools required in the first place. – Andrew Henle Mar 17 '16 at 11:05
  • @reggie Run your process under truss on each machine: `truss -f -a -vall -l -d -o /path/to/output/file yourApp [your args]`. Look for differences in how your application receives data. I'd guess you're looking for `ioctl()` calls on a file descriptor to `/dev/e1000g` or something similar (or lack thereof on your customer's machine). You may also glean useful data trying to determine what's different between the machines from digging into the output from the `ndd`, `dladm` and `ipadm` utilities. – Andrew Henle Mar 17 '16 at 11:12
  • @AndrewHenle The `truss` command sounds interesting. I'll ask the customer to run it and send me the output. – reggie Mar 17 '16 at 14:15
  • So you can't even do the download/unpack/configure/make/make tests cycle on another machine, copy the binary over, and run it? This isn't some production binary, it's something you run to do a *test* to try to *debug a problem*, so it's not as if there *needs* to be a development and test cycle for findalldevstest. –  Mar 17 '16 at 17:47
  • @GuyHarris You are absolutely correct. Getting the latest `libpcap` bits onto our dev machine and building it there to generate the `findalldevstest` binary and having the customer run that binary is a good idea. We'll first get the results from the customer running `truss`, and if _that_ fails to illuminate the issue, we'll go with the `findalldevstest` suggestion. – reggie Mar 17 '16 at 18:27

0 Answers