I surfed a lot of questions on the board, about tcp sockets, big-endian and little-endian format but to me nothing apllies to my case.
And I'm sorry for my bad English, I'm working on it :)
I'm loosing my mind on an unexpected behaviour in a simple client-server configuration. Here's the scenario:
Server (C++) <--- TCP socket ---> Client(Java).
Here's the client code:
package NetServ.apps.bigServer.NSLPClient;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.net.Socket;
import java.net.SocketException;
import java.net.UnknownHostException;
import java.nio.charset.Charset;
import java.util.ArrayList;
public class Communicator {
private Socket sock;
private final int port = 6666;
private final String address="127.0.0.1";
private DataOutputStream out;
private DataInputStream in;
public Communicator(){
System.out.println("Creating communicator. Trying to bind to the tcp socket");
try {
sock = new Socket(address, port);
out=new DataOutputStream(sock.getOutputStream());
in=new DataInputStream(sock.getInputStream());
} catch (UnknownHostException e) {
System.out.println("Unable to resolv host");
e.printStackTrace();
} catch (IOException e) {
System.out.println("Generic I/O exception");
e.printStackTrace();
}
System.out.println("Communicator created");
}
public void sendRequest(Request req) throws IOException{
int cmd=0;
if(req.getCmd().equals(CommandType.tg_setup_message))
cmd=0;
if(req.getCmd().equals(CommandType.tg_remove_message))
cmd=1;
if(req.getCmd().equals(CommandType.tg_trigger_message))
cmd=2;
if(req.getCmd().equals(CommandType.tg_probe_message))
cmd=3;
byte[] buff;
Charset charset = Charset.forName("ISO-8859-1");
out.writeInt(cmd);
//out.writeUTF(req.getDstAddr().toString().substring(1));
buff = req.getDstAddr().toString().substring(1).getBytes(charset);
out.writeShort((short)buff.length);
out.write(buff, 0, buff.length);
out.writeInt(req.getProtocol());
out.writeInt(req.getSecure());
//out.writeUTF(req.getDataId());
buff = req.getDataId().getBytes(charset);
out.writeShort((short)buff.length);
out.write(buff, 0, buff.length);
//out.writeUTF(req.getUser());
buff = req.getUser().getBytes(charset);
out.writeShort((short)buff.length);
out.write(buff, 0, buff.length);
out.flush();
out.writeInt(req.getOffpath_type());
if(req.getOffpath_type()!=-1){
out.writeInt(req.getMetric_type());
String tmp = "" + req.getMetric();
//out.writeUTF(tmp);
buff = tmp.getBytes(charset);
out.writeShort((short)buff.length);
out.write(buff, 0, buff.length);
}
switch (req.getCmd()){
case tg_setup_message:
out.writeUTF(req.getUrl());
out.writeInt(req.getLifetime());
out.writeUTF(req.getParameters().toString());
break;
case tg_remove_message:
//TODO
break;
case tg_trigger_message:
//TODO
break;
case tg_probe_message:
for (Short s : req.getProbes()){
//System.out.println("Writing probe code " + s.shortValue());
out.writeShort(s.shortValue());
}
break;
}
if(req.getSignature()!=null){
out.writeInt(1);
out.writeUTF(req.getSignature());
}else{
out.writeInt(0);
}
if(req.getDep()!=null){
out.writeInt(1);
out.writeUTF(req.getDep());
}else{
out.writeInt(0);
}
if(req.getNotif()!=null){
out.writeInt(1);
out.writeUTF(req.getNotif());
}else{
out.writeInt(0);
}
if(req.getNode()!=null){
out.writeInt(1);
out.writeUTF(req.getNode());
}else{
out.writeInt(0);
}
out.flush();
//out.close();
System.out.println("request sent");
}
public ArrayList<String> rcvProbeResponse() throws IOException, SocketException{
ArrayList<String> response= new ArrayList<String>();
System.out.println("Waiting for response...");
boolean timeout=false;
int responseCode=-1;
responseCode=in.readInt();
//responseCode = in.readInt();
//System.out.println("Response code "+responseCode);
if(responseCode==1){ //response is ready! !
System.out.println("Response arriving from NSLP (code 1 )");
int responseCmdCode = in.readInt();
if(responseCmdCode!=2)
return null;
//System.out.println("Response Command Code " + responseCmdCode );
int probeSize = in.readInt();
//System.out.println("Number of probes " + probeSize);
for(int i=0; i<probeSize; i++){
//System.out.println("i: "+i);
String out = in.readUTF();
response.add(out);
}
}
in.close();
if(timeout)
return null;
else
return response;
}
}
Nothing special about that: the protocol between the entities is simply an exchange of integers, shorts and strings, that triggers the server to execute some signaling tasks (the server is the daemon of a signaling protocol).
On the other side the server is legacy code that I modified to comunicate with java. Here's the relevant code:
[...]
// Set the current socket
communicator->setSocket(sockfd);
// FSM data structure
NetservNslpFsmData * data = new NetservNslpFsmData();
//give the address list of this node to all FSMs created by the client
data->nodeAddressList = &(param.addresses);
// Read from socket the parameters and use them
int ret;
NetservNslpCommunicator::command cmd;
ret = communicator->recvCommandFromJava(&cmd);
if (ret <= 0) {
logSocketError(sockfd, "Command");
// free up the memory allocated
delete data;
return;
}
switch(cmd){
case NetservNslpCommunicator::tg_setup_message:
DLog(param.name, "cmd set: setup");
break;
case NetservNslpCommunicator::tg_remove_message:
DLog(param.name, "cmd set: remove");
break;
case NetservNslpCommunicator::tg_probe_message:
DLog(param.name, "cmd set: probe");
break;
case NetservNslpCommunicator::tg_trigger_message:
DLog(param.name, "cmd set: trigger");
break;
}
ret = communicator->recvIPFromJava(&(data->destAddr));
DLog(param.name, "Dst Address set: "<< data->destAddr.get_ip_str());
if (ret <= 0) {
logSocketError(sockfd, "Destination IP");
// free up the memory allocated
delete data;
return;
}
[...]
int reliable = communicator->recvIntFromJava();
data->reliability = (reliable == NetservNslpCommunicator::TCP);
DLog(param.name, "Reliability set : "<< data->reliability);
int secure = communicator->recvIntFromJava();
data->security = (secure == NetservNslpCommunicator::TCP);
DLog(param.name, "Security set : "<< data->security);
data->dataId = communicator->recvStringFromJava();
DLog(param.name, "DataId : "<< data->dataId);
if (data->dataId == NULL) {
logSocketError(sockfd, "dataId");
// free up the memory allocated
delete data;
return;
}
data->user = communicator->recvStringFromJava();
DLog(param.name, "User : "<< data->user);
if (data->user == NULL) {
logSocketError(sockfd, "user");
// free up the memory allocated
delete data;
return;
}
//Receiving OffPath parameters
data->offpath_type=communicator->recvIntFromJava();
DLog(param.name, "OffType : "<< data->offpath_type);
if(data->offpath_type != -1){
data->metric_type=communicator->recvIntFromJava();
DLog(param.name, "MetricType : "<< data->metric_type);
if(data->metric_type>3|| data->metric_type<1){
logSocketError(sockfd, "metric type");
// free up the memory allocated
delete data;
return;
}
char * tmpStr = communicator->recvStringFromJava();
if (tmpStr == NULL) {
logSocketError(sockfd, "metric");
// free up the memory allocated
delete data;
return;
}
data->metric = tmpStr;
DLog(param.name, "MetricValue : "<< data->metric);
DLog(param.name, "MetricLength : "<< data->metric.length());
}
// check if socket is still alive or some errors occured
if (!communicator->isAlive(sockfd)) {
logSocketError(sockfd, "Socket not alive!");
// free up the memory allocated
delete data;
return;
}
DLog(param.name,"Reading command-specific configuration");
switch(cmd)
{
case NetservNslpCommunicator::tg_setup_message:
data->urlList.push_back(communicator->recvString());
//check if the service data is exchanged together with signaling messages
if (data->urlList.front() != NULL && (strncmp(data->urlList.front(), "file://", 7) == 0))
data->data_included = true;
data->lifetime = communicator->recvIntFromJava();
data->setupParams = communicator->recvStringFromJava();
break;
case NetservNslpCommunicator::tg_remove_message:
break;
case NetservNslpCommunicator::tg_probe_message:
{
DLog(param.name, "Reading probe codes list.");
short probe = 0;
do {
probe = communicator->recvShortFromJava();
DLog(param.name,"Probe Code " << probe);
data->probes.push_back(probe);
} while (probe != 0);
data->probes.pop_back(); //delete the last 0
if (data->probes.empty()) {
logSocketError(sockfd, "Probe list is empty!");
return;
}
break;
}
case NetservNslpCommunicator::tg_trigger_message:
data->triggerType = communicator->recvInt();
switch (data->triggerType){
case NETSERV_MESSAGETYPE_SETUP:
data->urlList.push_back(communicator->recvString());
data->lifetime = communicator->recvInt();
data->setupParams = communicator->recvString();
break;
case NETSERV_MESSAGETYPE_REMOVE:
break;
case NETSERV_MESSAGETYPE_PROBE:
{
short probe = 0;
do {
probe = communicator->recvShortFromJava();
data->probes.push_back(probe);
} while (probe != 0);
data->probes.pop_back(); //delete the last 0
break;
}
default:
ERRLog(param.name, "Trigger type not supported");
closeSocket(sockfd);
return;
}
break;
default:
logSocketError(sockfd, "Trigger type not supported!");
return;
}
DLog(param.name,"Reading optional parameters.");
// Optional parameters passing
bool addParam = 0;
addParam = communicator->recvIntFromJava();
if (addParam) {
data->signature = communicator->recvStringFromJava();
if (data->signature == NULL) {
logSocketError(sockfd, "signature");
// free up the memory allocated
delete data;
return;
}
DLog(param.name, "Message signature : "<< data->signature);
}
addParam = communicator->recvIntFromJava();
if (addParam) {
data->depList.push_back(communicator->recvStringFromJava());
if (data->depList.front() == NULL) {
logSocketError(sockfd, "dependency list");
// free up the memory allocated
delete data;
return;
}
DLog(param.name, "Message dependency list : "<< data->depList.front());
}
addParam = communicator->recvIntFromJava();
if (addParam) {
data->notification = communicator->recvStringFromJava();
if (data->notification == NULL) {
logSocketError(sockfd, "notification");
// free up the memory allocated
delete data;
return;
}
DLog(param.name, "Message notification : "<< data->notification);
}
addParam = communicator->recvIntFromJava();
if (addParam) {
data->node = communicator->recvStringFromJava();
if (data->node == NULL) {
logSocketError(sockfd, "node");
// free up the memory allocated
delete data;
return;
}
DLog(param.name, "Node destination : "<< data->node);
}
[...]
The communicator wraps the socket and uses standard calls to write and read types:
int NetservNslpCommunicator::recvCommandFromJava(NetservNslpCommunicator::command * cmd){
int code = recvIntFromJava();
cout<<"received int "<<code<<endl;
if(code>=0){
switch(code){
case 0:
*cmd=NetservNslpCommunicator::tg_setup_message;
break;
case 1:
*cmd=NetservNslpCommunicator::tg_remove_message;
break;
case 2:
*cmd=NetservNslpCommunicator::tg_trigger_message;
break;
case 3:
*cmd=NetservNslpCommunicator::tg_probe_message;
break;
}
}
return code;
}
int NetservNslpCommunicator::recvIPFromJava(protlib::hostaddress * addr){
cout<<"receiving an IP"<<endl;
char* str = recvStringFromJava();
cout<<"String received "<< str << endl;
addr->set_ipv4(str);
return 1;
}
char * NetservNslpCommunicator::recvStringFromJava(){
short length = recvShortFromJava();
cout<< "receiving a string..."<<endl<<"String length "<<length<<endl;
char * string = new char[length];
int r = 0;
int orLength=length;
while(length)
{
int r = recv(sock, string, length, 0);
if(r <= 0)
break; // Socket closed or an error occurred
length -= r;
}
string[orLength]='\0';
if(orLength==0)
return NULL;
else
return string;
}
int NetservNslpCommunicator::recvIntFromJava(){
int x = 0;
recvBuffer(sock, &x, 4);
return x;
}
short NetservNslpCommunicator::recvShortFromJava()
{
short x = 0;
recvBuffer(sock, &x, 2);
return x;
}
int NetservNslpCommunicator::recvBuffer(int sock, void * buf, size_t size)
{
int counter = 0;
// Create a pollfd struct for use in the mainloop
struct pollfd poll_fd;
poll_fd.fd = sock;
poll_fd.events = POLLIN | POLLPRI;
poll_fd.revents = 0;
int r;
while (size && !stop)
{
/* Non-blocking behavior */
// wait on number_poll_sockets for the events specified above for sleep_time (in ms)
int poll_status = poll(&poll_fd, 1/*Number of poll socket*/, 100);
if (poll_fd.revents & POLLERR) // Error condition
{
if (errno != EINTR)
cout << "NetservNslpCommunicator : " << "Poll caused error " << strerror(errno) << " - indicated by revents" << endl;
else
cout << "NetservNslpCommunicator : " << "poll(): " << strerror(errno) << endl;
}
//ignore hangups when reading from a socket
if (poll_fd.revents & POLLHUP) // Hung up
{
cout << "NetservNslpCommunicator : " << "Poll hung up" << endl;
// return -1;
}
if (poll_fd.revents & POLLNVAL) // Invalid request: fd not open
{
cout << "NetservNslpCommunicator : " << "Poll Invalid request: fd not open" << endl;
return -1;
}
switch (poll_status)
{
case -1:
if (errno != EINTR)
cout << "NetservNslpCommunicator : " << "Poll status indicates error: " << strerror(errno) << endl;
else
cout << "NetservNslpCommunicator : " << "Poll status: " << str error(errno) << endl;
break;
case 0:
if (isTriggerTimerEnabled){
counter++;
if (counter == triggerTimerValue){
isTriggerTimerEnabled = false;
return -1;
}
}
continue;
break;
default:
r = recv(sock, buf, size, 0);
if (r <= 0)
{
if (r == 0) { // connection closed
r = -1; // return an error if socket closes
cout << "NetservNslpCommunicator : " << "No data received from socket!" << endl;
stop=true;
break;
}
if (r == -1 && errno == EINTR) // received interrupt during recv, continuing
continue;
if (r == -1 && errno != EINTR) // socket error, raise exception
break;
}
if (r != -1)
size -= r;
break;
}
}
counter = 0;
isTriggerTimerEnabled = false;
return r;
}
I ask you to focus only on the tg_probe_message part. The other messages are still to implement.
The strange behaviour is: the first time the client sends a request to the server everything goes well, all values are read perfectly. Hence the server answers sending back some integer and a sequences of strings. This is a trace (application layer only. One TCP packet per line) of what i capture on the socket:
00 //
00 //
00 //
03 // First integer
00 //
0a // Short representing string length
31:37:32:2e:31:36:2e:33:2e:32 //the string: "172.16.3.2"
00
00
00
01
00
00
00
00
00
1b
4e:65:74:53:65:72:76:2e:61:70:70:73:2e:4c:6f:69:62:46:61:6b:65:5f:30:2e:30:2e:31 //The string "NetServ.apps.LoibFake_0.0.1"
00
03
6a:61:65 //the string "jae"
00
00
00
03
00
00
00
01
00
01
31 //the string "1"
00
02
00
00
00 //
00 //
00 // 4 times
00 //
The server answers:
00:00:00:01 //response code
00:00:00:02 //response type
00:00:00:04 //number of strings to read
00:12 //str length
31:30:2e:31:30:2e:30:2e:35:20:41:43:54:49:56:45:20:31
00:12 //str length
31:30:2e:31:30:2e:30:2e:34:20:41:43:54:49:56:45:20:31
00:12 //str length
31:30:2e:31:30:2e:30:2e:33:20:41:43:54:49:56:45:20:32
00:12 //str length
31:30:2e:31:30:2e:30:2e:36:20:41:43:54:49:56:45:20:32
The second time the client sends a request (the same request) something weird occurs. This is what i captured with tcpdump during the second connection:
00 //
00 //
00 //
03 // First integer
00 //
0a // Short representing string length
31:37:32:2e:31:36:2e:33:2e:32 //the string: "172.16.3.2"
00
00
00
01
00
00
00
00
00:1b:4e:65:74:53:65:72:76:2e:61:70:70:73:2e:4c:6f:69:62:46:61:6b:65:5f:30:2e:30:2e:31:00:03:6a:61:65:00:00:00:03:00:00:00:01:00:01:31:00:02:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00
With a little of patience you can recognise that the last packet contains ALL the bits of the request (same bits of the first request).
With some debuggin I can see that the command communicator->recvCommandFromJava(&cmd)
returns the number 50331648 (03:00:00:00) instead of 3 (00:00:00:03) and when the command communicator->recvIPFromJava(&(data->destAddr))
is executed, which in turn calls the recvStringFromJava()
, which uses the recvShortFromJava()
, the short representing the string length 00:0a (10) is swapped into the little-endian 0a:00 (2560). I believe this causes the tcp to put all the data available in the next packet and to spoil the subsequent calls.
As you can see from the code I didn't adopted conversion from host-order to net-order in the server (and that is because it works fine for the first request), but it seems that conversion is required during the second request. The documentation on DataOutputStream specifies that int and short are written in big-endian. The server does not apply conversion.
Hence, in the end, this is the question: Is it possible that C++ could change the Host-Format during execution? How could this possibly happen? What I can do to have predicible behaviour on the byte ordering between java client and c++ server?