I'm trying to implement a gossip-style membership protocol for an online cloud computing course assignment, and I've run into a strange issue. When members enter the network, they report to an 'introducer node', which sends them the network membership table. The first step of the assignment is to implement the JOINREQ and JOINREP messages that do this.
New node comes in -> sends JOINREQ to introducer. Introducer receives JOINREQ -> replies with JOINREP containing membership table.
I've implemented these messages, but seemingly at random, some of the nodes get garbage data in their membership table when they receive it (example logs are at the end of this post).
Here is the code:
bool MP1Node::recvCallBack(void *env, char *data, int size ) {
    MessageHdr *msg;
    msg = (MessageHdr*) data;
    if(msg->msgType == JOINREQ)
    {
        long *newMemberHeartBeat = (long*)malloc(sizeof(long));
        Address *replyAddr = (Address*)malloc(sizeof(Address));
        memcpy(replyAddr, (char *)(msg+1), sizeof(Address));
        memcpy(newMemberHeartBeat, (char *)(msg+1) + 1 + sizeof(Address), sizeof(long));
        //Add this member to member list
        int id = replyAddr->addr[0];
        short port = replyAddr->addr[4];
        memberNode->memberList.emplace_back(id, port, *newMemberHeartBeat, (long)par->getcurrtime());
        // create JOINREP message
        size_t msgsize = sizeof(MessageHdr) + sizeof(memberNode->memberList) + sizeof(long) + 1;
        msg = (MessageHdr *) malloc(msgsize * sizeof(char));
        msg->msgType = JOINREP;
        memcpy((char *)(msg+1), &memberNode->memberList, sizeof(memberNode->memberList));
        emulNet->ENsend(&memberNode->addr, replyAddr, (char *)msg, msgsize);
        //Printing to debuglog
#ifdef DEBUGLOG
        string t = "Sending Table to " + replyAddr->getAddress();
        log->LOG(&memberNode->addr, t.c_str());
        for(auto it = memberNode->memberList.begin(); it != memberNode->memberList.end(); it++)
        {
            string s = "ID: " + to_string(it->getid()) + " Heartbeat: " + to_string(it->getheartbeat()) + " Timestamp: " + to_string(it->gettimestamp());
            log->LOG(&memberNode->addr, s.c_str());
        }
        log->LOG(&memberNode->addr, "-----------------------");
#endif
        free(msg);
        free(newMemberHeartBeat);
        free(replyAddr);
    }
    else if(msg->msgType == JOINREP)
    {
        memcpy(&(memberNode->memberList), (char *)(msg+1), size - sizeof(msg+1));
        //Printing to debuglog
#ifdef DEBUGLOG
        log->LOG(&memberNode->addr, "Received Table:");
        for(auto it = memberNode->memberList.begin(); it != memberNode->memberList.end(); it++)
        {
            string s = "ID: " + to_string(it->getid()) + " Heartbeat: " + to_string(it->getheartbeat()) + " Timestamp: " + to_string(it->gettimestamp());
            log->LOG(&memberNode->addr, s.c_str());
        }
        log->LOG(&memberNode->addr, "-----------------------");
#endif
    }
}
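One thing that may be worth checking in the handler above: assuming memberList is a std::vector, sizeof(memberNode->memberList) is the size of the vector object itself (usually a few internal pointers), not of its elements, so the memcpy calls would copy the vector's bookkeeping rather than the entries. Separately, sizeof(msg+1) is the size of a pointer, not of MessageHdr. The sketch below shows one possible alternative, copying an element count followed by the elements themselves. The Entry struct, the function names serializeTable and deserializeTable, and the wire layout are all hypothetical and not part of the assignment's code; it also assumes the entry type is trivially copyable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for the assignment's member list entry type.
struct Entry {
    int id;
    short port;
    long heartbeat;
    long timestamp;
};

// memcpy-based serialization is only valid for trivially copyable types.
static_assert(std::is_trivially_copyable<Entry>::value,
              "Entry must be trivially copyable to be sent with memcpy");

// Assumed wire layout: [size_t count][count * Entry].
// Copies the elements, not the std::vector object that owns them.
std::vector<char> serializeTable(const std::vector<Entry>& table) {
    const std::size_t count = table.size();
    std::vector<char> buf(sizeof(count) + count * sizeof(Entry));
    std::memcpy(buf.data(), &count, sizeof(count));
    if (count > 0) {
        std::memcpy(buf.data() + sizeof(count), table.data(),
                    count * sizeof(Entry));
    }
    return buf;
}

// Rebuilds the table from a received payload.
// Returns false if the buffer is too short for the count it claims.
bool deserializeTable(const char* data, std::size_t size,
                      std::vector<Entry>& out) {
    std::size_t count = 0;
    if (size < sizeof(count)) {
        return false;
    }
    std::memcpy(&count, data, sizeof(count));
    if ((size - sizeof(count)) / sizeof(Entry) < count) {
        return false;
    }
    out.resize(count);  // the receiver allocates its own storage
    if (count > 0) {
        std::memcpy(out.data(), data + sizeof(count), count * sizeof(Entry));
    }
    return true;
}
```

In a handler like the one above, the bytes returned by serializeTable would be placed after the MessageHdr on the sending side, and the receiver would pass the payload that follows the header (with length size - sizeof(MessageHdr)) to deserializeTable. This way the receiving node never interprets pointers that belong to the sender's vector.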
The course instructors are very inactive on the discussion forums of this course, and I don't know where to go for help. Here is the link to my GitHub repo with more of the code and files, as well as the assignment spec: GitHub
My main question is whether there's something wrong with the way I'm constructing the message, or whether this is a result of node failure. It's a virtual network, so it's possible that it's supposed to fail at these points. For further information, it's always the same nodes that report garbage values, but the values are different each time:
1st run:
9.0.0.0:0 [4] Received Table:
9.0.0.0:0 [4] ID: 0 Heartbeat: 140736297693200 Timestamp: 1
9.0.0.0:0 [4] ID: 3 Heartbeat: 0 Timestamp: 1
9.0.0.0:0 [4] ID: 4 Heartbeat: 0 Timestamp: 1
9.0.0.0:0 [4] ID: 5 Heartbeat: 0 Timestamp: 2
9.0.0.0:0 [4] ID: 6 Heartbeat: 0 Timestamp: 2
9.0.0.0:0 [4] ID: 7 Heartbeat: 0 Timestamp: 2
9.0.0.0:0 [4] ID: 8 Heartbeat: 0 Timestamp: 2
9.0.0.0:0 [4] ID: 9 Heartbeat: 0 Timestamp: 3
2nd run:
9.0.0.0:0 [4] Received Table:
9.0.0.0:0 [4] ID: 0 Heartbeat: 140736544485392 Timestamp: 1
9.0.0.0:0 [4] ID: 3 Heartbeat: 0 Timestamp: 1
9.0.0.0:0 [4] ID: 4 Heartbeat: 0 Timestamp: 1
9.0.0.0:0 [4] ID: 5 Heartbeat: 0 Timestamp: 2
9.0.0.0:0 [4] ID: 6 Heartbeat: 0 Timestamp: 2
9.0.0.0:0 [4] ID: 7 Heartbeat: 0 Timestamp: 2
9.0.0.0:0 [4] ID: 8 Heartbeat: 0 Timestamp: 2
9.0.0.0:0 [4] ID: 9 Heartbeat: 0 Timestamp: 3
3rd run:
9.0.0.0:0 [4] Received Table:
9.0.0.0:0 [4] ID: 0 Heartbeat: 140736815972368 Timestamp: 1
9.0.0.0:0 [4] ID: 3 Heartbeat: 0 Timestamp: 1
9.0.0.0:0 [4] ID: 4 Heartbeat: 0 Timestamp: 1
9.0.0.0:0 [4] ID: 5 Heartbeat: 0 Timestamp: 2
9.0.0.0:0 [4] ID: 6 Heartbeat: 0 Timestamp: 2
9.0.0.0:0 [4] ID: 7 Heartbeat: 0 Timestamp: 2
9.0.0.0:0 [4] ID: 8 Heartbeat: 0 Timestamp: 2
9.0.0.0:0 [4] ID: 9 Heartbeat: 0 Timestamp: 3