2

I'm porting an application from 32-bit to 64-bit.
It is a legacy product written in C style, although it is compiled as C++. My problem is with a combination of a union and a struct that is used to store values. A custom datatype called "Any" is meant to hold data of any basic datatype. The implementation of Any is as follows:

typedef struct typedvalue
{
    long data;   // to hold all other types of 4 bytes or less
    short id;    // this tells what type "data" is holding
    short sign;  // this differentiates the double value from the rest
} typedvalue;

typedef union Any
{
    double any_any;
    double any_double;  // to hold double value
    typedvalue any_typedvalue;
} Any;

The union is 8 bytes in size. A union was used so that only one value is held at any given time, and the struct is used to identify the type. You can store a double, long, string, char, float or int value at any given time. That's the idea. If it's a double value, the value is stored in any_double. If it's any other type, it is stored in "data" and the type of the value is stored in "id". The "sign" field tells whether "Any" is holding a double or another type. any_any is used liberally in the code to copy the value in the address space irrespective of the type. (This is our biggest problem, since we do not know at any given time what it will hold!)

If "Any" is supposed to hold a string or pointer, it is stored in "data" (which is of type long). This is where the problem lies in 64-bit: pointers are 8 bytes, so we need to change the "long" to an equivalent 8-byte type (long long). But that would increase the size of the union to 16 bytes, and the liberal usage of "any_any" would cause problems. There are too many uses of "any_any", and you are never sure what it holds.
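To make the size problem concrete, here is a minimal sketch of the two layouts. The names `TypedValue32`, `TypedValue64`, `Any32` and `Any64` are hypothetical, fixed-width types stand in for `long` and `long long`, and the sizes assume a typical 64-bit ABI in which 8-byte integers are 8-byte aligned.

```cpp
#include <cstdint>

// The existing layout: on 64-bit Windows, long is still 4 bytes.
struct TypedValue32 {
    std::int32_t data;  // values of 4 bytes or less
    std::int16_t id;    // type of the value held in data
    std::int16_t sign;  // distinguishes a double from the rest
};

// The same struct with data widened so it can hold a 64-bit pointer.
struct TypedValue64 {
    std::int64_t data;
    std::int16_t id;
    std::int16_t sign;
};

union Any32 {
    double any_any;
    double any_double;
    TypedValue32 any_typedvalue;
};

union Any64 {
    double any_any;
    double any_double;
    TypedValue64 any_typedvalue;
};

// 4 + 2 + 2 = 8 bytes: the struct overlays the double exactly.
static_assert(sizeof(Any32) == sizeof(double), "Any32 should match double");

// 8 + 2 + 2 = 12 bytes, padded to 16 for 8-byte alignment (assumed ABI),
// so copying through any_any only moves the first 8 bytes.
static_assert(sizeof(Any64) > sizeof(double), "Any64 no longer fits in a double");
```

In the widened layout a copy through `any_any` would carry `data` but silently drop `id` and `sign`.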

I have already tried these approaches, without success:
1. Changed "long data" to "long long data" in the struct, which makes the size of the union 16 bytes. - This does not allow the data to be passed as "any_any" (8 bytes).
2. Declared the struct as a pointer inside the union, and changed "long data" to "long long data" inside the struct. - The issue here is that, since it's a pointer, we need to allocate memory for the struct. The liberal use of "any_any" makes it difficult to know where to allocate memory. Sometimes we might overwrite the memory and erase the value.
3. Created a separate collection that holds the value for "data" (as key-value pairs). - This will not work because this implementation is at the core of the application, and the collection would grow to millions of entries.

Can anybody help me with this?

Reji
  • So you have a 64 bit platform where `sizeof(long) != 8` ? – Alnitak Jun 15 '11 at 09:00
  • @Alnitak nope, this is on Windows and sizeof(long) = 4. I will need to change this to long long, and that's where my problem lies – Reji Jun 15 '11 at 09:28

3 Answers

1

"Can anybody help me" this sounds like a cry of desperation, and I totally understand it.

Whoever wrote this code had absolutely no respect for future-proofing or portability, and now you're paying the price.

(Let this be a lesson to anyone who says "but our platform is 32bit! we will never use 64bit!")

I know you're going to say "but the codebase is too big", but you are better off rewriting the product. And do it properly this time!

Lightness Races in Orbit
1

Ignoring the fact that the original design is insane, you could use <stdint.h> (or soon <cstdint>) to get a little bit of predictability:

#include <stdint.h>

struct typedvalue
{
  uint16_t id;
  uint16_t sign;
  uint32_t data;
};

union any
{
  char any_raw[8];
  double any_double;
  typedvalue any_typedvalue;
};

You're still not guaranteed that typedvalue will be tightly packed, since the compiler is free to insert padding between non-char members. You could make a struct Foo { char x[8]; }; and, given an instance foo, type-pun your way around with *(uint32_t*)(&foo.x[0]) and *(uint16_t*)(&foo.x[4]) if you must, but that too would be extremely ugly.
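If you do go the raw-bytes route, one way to avoid the pointer casts is to copy through `std::memcpy`, which sidesteps the aliasing and alignment concerns of the cast form. This is a swapped-in technique, not the cast shown above. A minimal sketch, assuming the same byte offsets (data at 0, id at 4); `Raw8` and the helper names are hypothetical:

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical 8-byte raw container, like Foo above.
struct Raw8 {
    unsigned char x[8];
};

// Assumed layout: bytes 0-3 hold data, bytes 4-5 hold id.
inline void store_data(Raw8& r, std::uint32_t v) {
    std::memcpy(&r.x[0], &v, sizeof v);
}

inline std::uint32_t load_data(const Raw8& r) {
    std::uint32_t v;
    std::memcpy(&v, &r.x[0], sizeof v);
    return v;
}

inline void store_id(Raw8& r, std::uint16_t v) {
    std::memcpy(&r.x[4], &v, sizeof v);
}

inline std::uint16_t load_id(const Raw8& r) {
    std::uint16_t v;
    std::memcpy(&v, &r.x[4], sizeof v);
    return v;
}
```

Compilers generally turn these small fixed-size copies into plain loads and stores, so this should not cost more than the cast version.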

If you are in C++0x, I would definitely throw in a static assertion somewhere for sizeof(typedvalue) == sizeof(double).
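A minimal sketch of such an assertion, reusing the struct and union from this answer (the message strings are just illustrative):

```cpp
#include <cstdint>

struct typedvalue {
    std::uint16_t id;
    std::uint16_t sign;
    std::uint32_t data;
};

union any {
    char any_raw[8];
    double any_double;
    typedvalue any_typedvalue;
};

// Compilation fails here if the compiler pads typedvalue past 8 bytes.
static_assert(sizeof(typedvalue) == sizeof(double),
              "typedvalue must be exactly as large as double");

// Guards the union itself, since the rest of the code copies it as a double.
static_assert(sizeof(any) == sizeof(double),
              "any must stay the size of a double");
```

If the layout ever drifts, the build breaks at this point instead of the program corrupting values at run time.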

Kerrek SB
  • In the former solution, `data` can't hold a 64 bit pointer, so it should be a `uint64_t`. Also, the raw access could be changed to `char any_raw[sizeof(typedvalue)]` to cope with possible packing issues – king_nak Jun 15 '11 at 09:17
  • Good point. I somehow thought we need to preserve the 8-byte structure. Hm, I don't really know why you'd want the nested struct like that... I guess you could roll it all up into a single unit: `union { char any_raw[16]; double any_double; struct { /* other specialised stuff */ }; };`. I guess long doubles and `int128_t`s are possible common types, too, so might as well go all the way to 128 bits. – Kerrek SB Jun 15 '11 at 09:36
  • And now all data takes 32 bytes each instead of 8. I can see why we need a 64-bit address space to handle all that! :-( – Bo Persson Jun 15 '11 at 09:53
  • @Bo: Once the LHC discovers wide electrons, we'll be fine :-) But how do you get to 32 rather than just 16? Alignment? – Kerrek SB Jun 15 '11 at 10:05
  • @Kerrek SB: I need to keep this to 8 bytes because this address space is accessed as a double (8 bytes) across the code base, so if I have to grow this union beyond 8 bytes I also need to figure out a way to pass this value around – Reji Jun 15 '11 at 11:02
  • @Kerrek - Perhaps I just can't count properly? Dividing 128 by 8 I got 32. You mean it is just 16? :-) – Bo Persson Jun 15 '11 at 11:24
  • @Bo: It was before coffee, but now you're making me unsure. Wait. 128 bit divided by 8 bit / byte ... 16 byte? x86-ints have 4 byte, x64-ints have 8 bytes, doubles have 8 bytes... and if we go for 16 bytes, we can accommodate both int128_t's and 10-byte long doubles? I would not bet 128 reputation on it, but I'm fairly sure that works out. Am I missing something? :-S – Kerrek SB Jun 15 '11 at 12:15
0

If you need to store both an 8 byte pointer and a "type" field then you have no choice but to use at least 9 bytes, and on a 64-bit system alignment will likely pad that out to 16 bytes.

Your data structure should look something like:

typedef struct {
    union {
        void   *any_pointer;
        double  any_double;
        long    any_long;
        int     any_int;
    } value;                /* the payload; only one member is valid at a time */
    char        my_type;    /* tells which member of value is valid */
} any;

If using C++0x consider using a strongly typed enumeration for the my_type field. In earlier versions the storage required for an enum is implementation dependent and likely to be more than one byte.
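As a rough sketch of that suggestion, the tag can be an enumeration with a one-byte underlying type. The names `ValueType`, `TaggedAny`, `make_double` and `make_pointer` are hypothetical, and only two constructors are shown:

```cpp
#include <cstdint>

// Strongly typed tag with an explicit one-byte underlying type (C++0x/C++11).
enum class ValueType : std::uint8_t { Pointer, Double, Long, Int };

struct TaggedAny {
    union {
        void*  any_pointer;
        double any_double;
        long   any_long;
        int    any_int;
    } value;
    ValueType my_type;  // says which member of value is currently valid
};

static_assert(sizeof(ValueType) == 1, "tag should occupy a single byte");

// Helpers that set the payload and the tag together.
inline TaggedAny make_double(double d) {
    TaggedAny a;
    a.value.any_double = d;
    a.my_type = ValueType::Double;
    return a;
}

inline TaggedAny make_pointer(void* p) {
    TaggedAny a;
    a.value.any_pointer = p;
    a.my_type = ValueType::Pointer;
    return a;
}
```

Because the tag is always written together with the payload, a reader can check `my_type` before touching `value`, which the 8-byte `any_any` copy never allowed.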

To save memory you could use (compiler specific) directives to request optimal packing of the data structure, but the resulting mis-aligned memory accesses may cause performance issues.
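For illustration, here is a sketch using `#pragma pack`, which MSVC, GCC and Clang all accept; other compilers may need a different directive. The struct names are hypothetical, and the sizes in the comments assume a 64-bit target.

```cpp
// Default layout: the one-byte tag is followed by padding so that the
// struct size stays a multiple of the union's alignment (typically 16 bytes).
struct Unpacked {
    union {
        void*  any_pointer;
        double any_double;
    } value;
    char my_type;
};

// Packed layout: alignment is reduced to 1, so no padding is added
// and the struct is 8 + 1 = 9 bytes.
#pragma pack(push, 1)
struct Packed {
    union {
        void*  any_pointer;
        double any_double;
    } value;
    char my_type;
};
#pragma pack(pop)

static_assert(sizeof(Packed) < sizeof(Unpacked), "packing should remove padding");
```

In an array of `Packed`, most elements start at addresses that are not multiples of 8, which is where the misaligned accesses come from.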

Alnitak