
As part of our end-of-semester project, we are required to implement a distributed chat system. The system needs to be scalable and robust. With these criteria in mind, I am unsure how to send a vector object over a socket.

Since a vector allocates its elements dynamically, sending the object itself would not work, because the memory it points to is not copied. Serialization seems to be the best option. However, our project rules do not allow third-party libraries such as Boost or Google Protocol Buffers.

I cannot find a starting guide that explains how to serialize the vector and send it over the network without such libraries. Are there any alternatives we could use?

The vector would contain strings (IP address:port), one for each member of the chat group.

Any help would be great. Thank You.

NOTE: We are required to run the chat client on a cluster, and I believe that to make the system robust and scalable we also need to take endianness into account.

bitmask
Dhruv Arya
  • Nothing wrong with storing data in a vector, copying the contents should be quick as the contents should be aligned in memory – EdChum Apr 08 '12 at 16:48
  • What type does the vector contain? If it's a POD type, and you can guarantee that the machines at both ends use the same endianness, packing and alignment, then you can get away with just sending the raw binary content. If not, then you will need something more sophisticated. – Oliver Charlesworth Apr 08 '12 at 16:51
  • The vector will hold strings(IP Address:Port). This is a member table indicating which are the current active participants in the chat. So it has to be multicast to the group. – Dhruv Arya Apr 08 '12 at 16:55
  • er, knowing the meaning of the data in your vector doesn't actually help us with this one; we need to know what the actual types are; please provide *code* – SingleNegationElimination Apr 08 '12 at 16:57
  • Sounds like you just need a simple container stream. Start with the vector size (count of strings), followed by each vector element string (length followed by character array). You can end the container stream with a CRC or other checksum to add some error detection. On the receiving side just read that back into a string vector. – Amardeep AC9MF Apr 08 '12 at 17:00
  • @TokenMacGuy - I cannot post the code as the University has a strict academic integrity policy. – Dhruv Arya Apr 08 '12 at 17:03
  • @Amardeep How would I send it over the socket? After I read the vector into a stringstream, can I just write the values to the socket and read them at the other end? And since it is a UDP socket, would that be an issue? – Dhruv Arya Apr 08 '12 at 17:05
  • @DhruvArya: Then construct some new code that is representative of your actual code. – Oliver Charlesworth Apr 08 '12 at 17:08
  • If the strings are just IPv4 and a port you can send one string per datagram and they will always fit. So come up with a protocol that has a datagram indicating start of transmission and includes the number of strings to follow. Then send one datagram per string. – Amardeep AC9MF Apr 08 '12 at 17:09
  • @Amardeep sending one datagram per string to all members from the leader notifying them would flood the network. If there are n members then there would be (n-1)*(n-1) datagrams that would be sent out. – Dhruv Arya Apr 08 '12 at 17:17
  • @OliCharlesworth I would do so. – Dhruv Arya Apr 08 '12 at 17:17
  • Backing up a bit, it seems you are trying to figure out a way to design your protocol based on your data structure implementation. I don't think your problem is how to serialize/deserialize a vector but rather how to design an efficient protocol to communicate host/port pairs over UDP. – Amardeep AC9MF Apr 08 '12 at 17:35
  • @Amardeep That is right. The design of our protocol requires the leader to maintain a member table of the members currently present in the chat and to periodically inform the group of any updates to this table. This is required to handle failure of the leader. The member addresses are stored in the vector, which has to be multicast to the members so that they have a copy of the table. That is why we are trying to send a vector. – Dhruv Arya Apr 08 '12 at 17:43

1 Answer


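If you want binary serialization in this case, you need to implement serialization for two types: integer and string. An integer can easily be written byte by byte by shifting it and casting each result to unsigned char. Because the byte order is fixed by the code rather than by the host, this also takes care of endianness: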

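// assuming 32 bit ints and 8 bit bytes
unsigned int integer = 1337;  // unsigned, so the right shift is well defined for every value
unsigned char data[4];
// data[0] receives the least significant byte, data[3] the most significant
for(int i = 0; i < 4; ++i)
    data[i] = (unsigned char) (integer >> 8*i);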

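Deserialize by shifting and combining, starting from the most significant byte:

// data[] holds the four bytes produced above (data[0] is the least significant)
unsigned int integer = 0;
for(int i = 3; i >= 0; --i)
{
    // shift first, then add the next byte, so the last byte is not shifted away
    integer <<= 8;
    integer |= data[i];
}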

(I didn't test the code, so trace through it in a debugger and make sure it does what I think it does :))

A serialized string is its size, serialized as an integer as above, followed by its characters.

A vector is then a combination of those two: the size of the vector, then the strings one by one.
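The layout described above can be sketched roughly as follows. This is a minimal, untested-in-production sketch rather than a definitive implementation. The names `Buffer`, `putU32`, `getU32`, `serialize` and `deserialize` are made up for illustration, and it assumes sizes fit in 32 bits and are written least significant byte first, matching the integer code above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

typedef std::vector<unsigned char> Buffer;  // hypothetical name for the byte stream

// Append a 32-bit value, least significant byte first.
void putU32(Buffer& out, std::uint32_t v) {
    for (int i = 0; i < 4; ++i)
        out.push_back(static_cast<unsigned char>(v >> (8 * i)));
}

// Read a 32-bit value at pos and advance pos; returns false if the data runs out.
bool getU32(const Buffer& in, std::size_t& pos, std::uint32_t& v) {
    if (in.size() - pos < 4) return false;  // callers keep pos <= in.size()
    v = 0;
    for (int i = 3; i >= 0; --i)
        v = (v << 8) | in[pos + i];
    pos += 4;
    return true;
}

// Layout: [count][len0][bytes0][len1][bytes1]...
Buffer serialize(const std::vector<std::string>& members) {
    Buffer out;
    putU32(out, static_cast<std::uint32_t>(members.size()));
    for (std::size_t i = 0; i < members.size(); ++i) {
        assert(members[i].size() <= 0xFFFFFFFFu);  // length must fit in 32 bits
        putU32(out, static_cast<std::uint32_t>(members[i].size()));
        out.insert(out.end(), members[i].begin(), members[i].end());
    }
    return out;
}

// Rebuild the vector; returns false on truncated or trailing data.
bool deserialize(const Buffer& in, std::vector<std::string>& members) {
    std::size_t pos = 0;
    std::uint32_t count = 0;
    if (!getU32(in, pos, count)) return false;
    members.clear();
    for (std::uint32_t i = 0; i < count; ++i) {
        std::uint32_t len = 0;
        if (!getU32(in, pos, len)) return false;
        if (in.size() - pos < len) return false;
        members.push_back(std::string(in.begin() + pos, in.begin() + pos + len));
        pos += len;
    }
    return pos == in.size();
}
```

The resulting buffer can be handed to the socket send call as one block (`&buf[0]`, `buf.size()`), provided it fits in a single datagram. The receiver should treat a `false` return as a malformed packet and drop it.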

You might want to add a magic word and a checksum so that clients know what to expect and can verify the data. If you want to get really fancy, implement your own backing for ASN.1 or something. :)
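One possible way to wrap a payload with a magic word and a checksum is sketched below. Everything here is an assumption made for illustration: the magic value `kMagic`, the names `frame` and `unframe`, and the checksum itself, which is a plain byte sum chosen for brevity. A byte sum catches far fewer errors than a real CRC, so treat it as a placeholder.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

typedef std::vector<unsigned char> Buffer;  // hypothetical name for the byte stream

const std::uint32_t kMagic = 0x43484154u;   // hypothetical tag identifying the protocol

// Append a 32-bit value, least significant byte first.
void putU32(Buffer& out, std::uint32_t v) {
    for (int i = 0; i < 4; ++i)
        out.push_back(static_cast<unsigned char>(v >> (8 * i)));
}

// Read a 32-bit value starting at pos; the caller guarantees 4 bytes are available.
std::uint32_t getU32(const Buffer& in, std::size_t pos) {
    std::uint32_t v = 0;
    for (int i = 3; i >= 0; --i)
        v = (v << 8) | in[pos + i];
    return v;
}

// Placeholder checksum: sum of the bytes in [begin, end), wrapping at 32 bits.
std::uint32_t byteSum(const Buffer& b, std::size_t begin, std::size_t end) {
    assert(begin <= end && end <= b.size());
    std::uint32_t sum = 0;
    for (std::size_t i = begin; i < end; ++i)
        sum += b[i];
    return sum;
}

// Packet layout: [magic][payload][checksum of payload]
Buffer frame(const Buffer& payload) {
    Buffer out;
    putU32(out, kMagic);
    out.insert(out.end(), payload.begin(), payload.end());
    putU32(out, byteSum(payload, 0, payload.size()));
    return out;
}

// Returns false if the packet is too short, has the wrong magic, or fails the checksum.
bool unframe(const Buffer& packet, Buffer& payload) {
    if (packet.size() < 8) return false;
    if (getU32(packet, 0) != kMagic) return false;
    std::size_t end = packet.size() - 4;
    if (getU32(packet, end) != byteSum(packet, 4, end)) return false;
    payload.assign(packet.begin() + 4, packet.begin() + end);
    return true;
}
```

The receiver would call `unframe` first and only try to decode the vector from the payload if it returns `true`.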

Eugene
  • Thanks, I think this is the only option I have. I will have to come up with some sort of format for this and then send the messages over the network using UDP. – Dhruv Arya Apr 09 '12 at 00:31