
I am working on a very peculiar problem. I have code compiled by an old compiler (gcc 2.95 or older) on a Solaris 8/SPARC platform. It runs fine on Solaris 8/SPARC but crashes on Solaris 10/SPARC. (Solaris 10 is supposedly backward compatible with Solaris 8.)

On debugging, I see that the problem comes up when the app tries to convert a hostname into its corresponding IP address. It uses gethostbyname_r, followed by inet_ntoa, to obtain the IPv4 dotted-quad string. Stepping through in gdb brings me to the point where I see that the in_addr returned by gethostbyname_r holds the correct integer for the IP address, but the inet_ntoa call returns a malformed string. One difficulty in confirming that it is really inet_ntoa that fails is that the code is written like this:

strcpy(hostaddr, inet_ntoa(*((struct in_addr *) hostdata.h_addr)));

So technically I cannot see the value returned by inet_ntoa directly. But I can do a

print (char*)inet_ntoa(*((struct in_addr *) hostdata.h_addr_list[0]))

in gdb to see it (which is close enough, I presume), and it prints malformed IP addresses, for example "0.0.". (The hostname has a valid IP address and can be resolved from that machine, so an address starting with 0.0. is not the right value either.)

As you can see, feeding the result of inet_ntoa straight into the unchecked strcpy hides what inet_ntoa actually returned, and the combination ends in a segmentation fault.

It would be great to hear from someone who has experienced something similar and can say what might cause inet_ntoa to fail like this. Somehow the system is playing a role, but I cannot pin it down well enough to even think about a fix.
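One way to take strcpy out of the picture would be a small standalone check, built separately from the app, that compares inet_ntoa against a hand-formatted dotted quad. This is only a sketch: manual_ntoa and ntoa_matches_manual are hypothetical helper names, not part of the application, and it assumes snprintf is available on the target.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Formats addr by hand, without calling inet_ntoa. Returns out. */
char *manual_ntoa(struct in_addr addr, char *out, size_t outlen)
{
    /* s_addr is stored in network byte order, so byte 0 is the first octet. */
    const unsigned char *b = (const unsigned char *)&addr.s_addr;

    snprintf(out, outlen, "%u.%u.%u.%u",
             (unsigned)b[0], (unsigned)b[1], (unsigned)b[2], (unsigned)b[3]);
    return out;
}

/* Returns 1 when inet_ntoa agrees with the hand-formatted text, else 0. */
int ntoa_matches_manual(struct in_addr addr)
{
    char expected[16]; /* "255.255.255.255" plus the terminating NUL */

    manual_ntoa(addr, expected, sizeof expected);
    return strcmp(inet_ntoa(addr), expected) == 0;
}
```

If ntoa_matches_manual returns 0 for the address that gethostbyname_r produced, that would point at inet_ntoa (or whatever it resolves to at link time) rather than at the strcpy.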

All comments will be greatly appreciated.

Constraints: I cannot modify the code to make it work (otherwise this would be trivial to solve). So despite knowing that strcpy is a very unsafe function with respect to seg faults, and that inet_ntoa is deprecated, I am helpless on that front.

EDIT: I have a feeling that it is a parallel processing issue. I am not sure, but I don't think the app is multi-threaded. The new Solaris 10 machine, however, is a 64-core machine. The reason for this train of thought is that the only real issue with inet_ntoa is its static buffer, and the code does make this call in a loop.
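To make the static-buffer concern concrete: inet_ntoa may return a pointer to storage that the next call overwrites, so each result has to be copied out before the function is called again. A minimal sketch of a bounded copy follows; format_ipv4 is a hypothetical name used only for illustration and is not something the app contains.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Copies the dotted-quad text for addr into out (capacity outlen).
 * Returns 0 on success, -1 if out is too small to hold the text.
 */
int format_ipv4(struct in_addr addr, char *out, size_t outlen)
{
    const char *text = inet_ntoa(addr); /* may point at a shared static buffer */
    size_t need = strlen(text) + 1;     /* include the terminating NUL */

    if (need > outlen)
        return -1;                      /* refuse instead of overflowing like strcpy */
    memcpy(out, text, need);
    return 0;
}
```

Called in a loop, each iteration would get its own copy in a caller-owned buffer, so a later inet_ntoa call cannot change an earlier result.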

IDK
  • You wrote "I have code compiled by an old compiler (gcc 2.95 or older) on solaris 10/sparc platform.", I guess you really mean: "... on solaris 8/sparc platform". Otherwise, your code won't work on Solaris 8. – jlliagre Sep 12 '12 at 22:38

1 Answer

I found out that the linker (and bad code, of course) was the issue. Some super nice person thought that it was a great idea to have a function with the exact same name as a standard library function (inet_ntoa_r). When I tried to use the -static option while linking my code with the libraries, the linker started complaining about the presence of this symbol in the user library file. Once I got rid of that function from the user library, the app moved on from that crash (to some other issue which I am trying to fix. Expect another question :) ). Hope someone finds this useful.
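For anyone wondering why a name clash can corrupt the output, here is a toy simulation. It assumes, without my having confirmed it, that the system's inet_ntoa delegates to an internal helper and that the linker bound that call to the user library's function instead. Every name and behavior below is made up for illustration; a function pointer stands in for the symbol the linker resolved.

```c
#include <stdio.h>
#include <string.h>

/* Contract the "library" side assumes: fill buf with dotted-quad text, return buf. */
typedef char *(*ntoa_r_fn)(const unsigned char octets[4], char *buf);

/* Stand-in for the system's own helper. */
char *system_ntoa_r(const unsigned char octets[4], char *buf)
{
    sprintf(buf, "%u.%u.%u.%u",
            (unsigned)octets[0], (unsigned)octets[1],
            (unsigned)octets[2], (unsigned)octets[3]);
    return buf;
}

/* Stand-in for a clashing user function: same name in real life, but it
 * follows a different (invented) contract and writes only two octets. */
char *user_ntoa_r(const unsigned char octets[4], char *buf)
{
    sprintf(buf, "%u.%u.", (unsigned)octets[0], (unsigned)octets[1]);
    return buf;
}

/* Stand-in for inet_ntoa: trusts whichever helper was "resolved". */
const char *fake_inet_ntoa(ntoa_r_fn resolved, const unsigned char octets[4])
{
    static char buf[16];

    return resolved(octets, buf);
}
```

With system_ntoa_r passed in, fake_inet_ntoa yields the full dotted quad; with user_ntoa_r it yields a truncated string, even though the caller did nothing wrong.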

IDK