I have an array defined as

int data[k];

where k is the size of the array. Each element of the array is either 0 or 1. I want to pack the binary data into another array defined as

uint8_t new_data[k/8];

(k is usually a multiple of 8).
How can I do this in C?

Thanks in advance

zahra
    "Each element of the vector is binary." you mean that each element is either 0 or 1? Is k guaranteed to be multiple of 8? – Matteo Italia Oct 13 '18 at 13:11
  • 1
    array, not vector. – Shawn Oct 13 '18 at 13:12
  • 1
    Also: what order should the bits be packed in? – Matteo Italia Oct 13 '18 at 13:13
  • 1
    Unclear what you're asking... of all of the other questions that are related to serialisation in C, **what have you tried** before writing this answer? Did you try using modulo/division or binary and/left-shift in conjunction with assignment? – autistic Oct 13 '18 at 13:16
  • If you didn't think of any of those ideas, why not? Which book are you reading? You are reading a book, right? Because it's dangerous to learn C as a "mystery black box"; what you end up learning is not C but some subset of C which misbehaves when you migrate it to a different system configuration, or for some other trivial reason like an OS update... There are other demons with this question... you need to spend time editing your question, drafting it like you're submitting it to a team for review and will be scrutinized... because as it currently stands, it's confusing to say the least. – autistic Oct 13 '18 at 13:28
  • Generally you can't store something larger in something smaller without cutting it – 0___________ Oct 13 '18 at 13:43
  • There are two different readings of your question: (1) you have a bunch of 8-bit numbers `4a 27 e5 2e 73 bf 39 8f` and you want to group them into 32-bit numbers `4a27e52e 73bf398f` (2) you have a bunch of binary bits `0 0 1 1 0 1 0 1 0 1 1 0 1 0 1 1 1 0 0 1 0 1 0 0` and you want to group them into 8-bit numbers `11100000 00000100 00011110` or `e0 04 1e`. Which do you mean? (The two answers that have been posted so far try to answer (2).) – Steve Summit Oct 13 '18 at 14:02
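
For reference, a minimal sketch of what reading (1) above would look like, i.e. packing four 8-bit values into one 32-bit value with the most significant byte first, using the sample bytes from that comment (the variable names are just for illustration; the answers below address reading (2)):

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* sample bytes taken from the comment above */
    uint8_t bytes[8] = {0x4a, 0x27, 0xe5, 0x2e, 0x73, 0xbf, 0x39, 0x8f};
    uint32_t words[2];

    for (int i = 0; i < 2; ++i) {
        /* the first byte of each group of four goes into the top bits */
        words[i] = ((uint32_t)bytes[4*i]     << 24)
                 | ((uint32_t)bytes[4*i + 1] << 16)
                 | ((uint32_t)bytes[4*i + 2] <<  8)
                 |  (uint32_t)bytes[4*i + 3];
    }

    printf("%08" PRIx32 " %08" PRIx32 "\n", words[0], words[1]);  /* 4a27e52e 73bf398f */
    return 0;
}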

2 Answers

Assuming new_data starts out zero-initialized, that data[i] contains only zeroes and ones, and that you want to fill the lowest bits of each byte first:

for(unsigned i = 0; i < k; ++i) {
    new_data[i/8] |= data[i]<<(i%8);  /* set bit i%8 of byte i/8 */
}

A possibly faster implementation [1] may be:

for(int i = 0; i < k/8; ++i) {
    uint8_t o = 0;
    for(int j = 0; j < 8; ++j) {
        o |= data[i*8+j]<<j;  /* bit j of byte i comes from data[i*8+j] */
    }
    new_data[i] = o;
}

(notice that this essentially assumes that k is a multiple of 8)
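
For completeness, a minimal, self-contained sketch of how either loop above might be exercised; the bit pattern and k = 16 are made up for illustration:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    enum { k = 16 };                        /* made-up size, a multiple of 8 */
    int data[k] = {1,0,1,1,0,0,0,1,  0,1,1,1,1,0,1,0};
    uint8_t new_data[k/8] = {0};            /* must start zeroed for the |= loop */

    for (unsigned i = 0; i < k; ++i)
        new_data[i/8] |= data[i] << (i%8);  /* lowest bit of each byte is filled first */

    printf("%02x %02x\n", new_data[0], new_data[1]);  /* prints 8d 5e */
    return 0;
}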


  [1] It's generally easier to optimize, as the inner loop has small, known bounds and writes to a variable with just that small scope; this is easier for optimizers to handle, and you can see, for example, that with gcc the inner loop gets completely unrolled.
Matteo Italia
  • @autistic: OP said "Each element of the vector is binary", so I'm assuming it's either 0 or 1; if that's not the case, adding a `!!` in front of `data[i]` will fix this (but probably slow it down). – Matteo Italia Oct 13 '18 at 13:20
  • Heh... fair enough... another reason this question ... needs more time. That word "binary"; it doesn't have the same single meaning, anymore. I wonder why OP has also tagged `[hex]`... – autistic Oct 13 '18 at 13:24
  • I notice you asked about this (and OP has not yet answered) prior to answering based on the assumption that you're certain... so why ask? I suppose it makes sense that people will abuse StackOverflow as it's now a competitive network. Still, it seems to me as though answering when you're not entirely certain has always gone against the spirit of the network... – autistic Oct 13 '18 at 13:35
  • Also, the question is tagged `[C]` and not `[gcc]`. "The semantic descriptions in this International Standard describe the behavior of an abstract machine in which issues of optimization are irrelevant." -- [C11/5.1.2.3p1](https://port70.net/~nsz/c/c11/n1570.html#5.1.2.3p1); "In the abstract machine, all expressions are evaluated as specified by the semantics." -- [C11/5.1.2.3p4](https://port70.net/~nsz/c/c11/n1570.html#5.1.2.3p4). C doesn't mandate that optimisation, and besides, it might not even be an optimisation; it's irrelevant and invalid; there's no point mentioning it in this answer. – autistic Oct 13 '18 at 13:42
  • @autistic: that's just being pedantic; one writes code for compilers, not in the abstract. The second proposal is not just tailored towards gcc (it's just an example), it's more compiler-friendly in general, as compilers can easily see through the inner loop (it has known, smaller bounds and works on a local variable with a small scope); try it with other compilers and you'll see that in pretty much all the cases they'll manage to churn out smarter code. That being said, I reworded that part a bit to incorporate this explanation. – Matteo Italia Oct 13 '18 at 13:48
  • Matteo, to the contrary, one should always seek to write code that's **maintainable**. Once we have a solution which is functional yet too slow, we can utilise our profilers to measure the parts of runtime that are too slow. This way, we don't end up guessing which loops to unroll, potentially mistargeting an optimisation and thrashing the cache and branch prediction aspects of the processor. You should profile and optimise per configuration, not per program. Finally, if it's possible that this might be false in the future, do you want to own that possible false-hood? – autistic Oct 13 '18 at 13:53
  • @autistic: that's why I provided first a simple solution, and then a possible optimization (which, by the way, doesn't seem to me particularly less maintainable). Also, you'll notice that I didn't unroll the loop manually - I left it to the compiler. I just provided code that does pretty much the same thing but _is easier to understand for the compiler_, so that it can apply its heuristics and generate better code, be it in this toy test case or when inlined inside a bigger function, targeting today's x86 or whatever may come out in the next 10 years. – Matteo Italia Oct 13 '18 at 13:58
  • Okay. I'm just saying, I think the answer would be better without the potentially invalid and entirely irrelevant advice to focus on implementation-specific optimisation guesswork. To me, that seems to be an answer to a different question... – autistic Oct 13 '18 at 14:07
  • You spoke of local variables... are you aware that both of your pieces of code use registers for that loop (which are equivalent in that sense), and that the simpler is more optimal (owing to less instructions)? The simpler code is also the more maintainable code... If you want to eliminate some register usage (which is a pretty good optimisation), consider using `while (k--) {` instead of `for(unsigned i = 0; i < k; ++i) {`... micro-optimisation, I know... but at least it actually reduces the instruction count. – autistic Oct 13 '18 at 14:23
  • "-funroll-loops Unroll loops whose number of iterations can be determined at compile time or upon entry to the loop. -funroll-loops implies -frerun-cse-after-loop. This option makes code larger, and *may or may not make it run faster*." ... you should focus on space optimisation (i.e. reducing code size and complexity) and personal time optimisation (because a few clock cycles mean nothing compared to one of your heart beats) and on that note, peace out... – autistic Oct 13 '18 at 14:27
  • @autistic: the "local variables" thing is about the target variable. In the first version, the target of OR-ing is an array element, while in the second is a variable that is born and dies with the outer loop iteration. This is generally better because you do one eighth of memory accesses (read+write) to the target array, and because OR-ing to a local allows the optimizer to be more aggressive in its assumptions. Less instructions = more optimal is blatantly false on any superscalar architecture, and on recent x86 particularly so. – Matteo Italia Oct 13 '18 at 14:44
  • Reducing register usage is useful when you are out of registers, so the compiler has to spill them on the stack, which is generally problematic only when it happens in your innermost loops (and even then, architectures with register renaming mitigate it even more). That being said, it's a valid optimization that I would consider after profiling. Of course unrolling may not make your code faster, and one may be interested in optimizing for code size - that's why, again, I didn't unroll manually, but I just provided code that is easier for the optimizer to understand. – Matteo Italia Oct 13 '18 at 14:48
  • That being said, again, I see no need to make such a fuss about this. You have a trivial implementation and one slightly more complicated that will generally yield better performance. Whoever comes across this answer in the future will be free to use one or the other, based on their needs / profiling results. – Matteo Italia Oct 13 '18 at 14:52
  • Strange that you would suggest loop unrolling prior to profiling, but not register spilling... They're both micro-optimisations which could be hit and miss. One hinges upon your CPUs register selection, the other on your CPUs L1i cache. One can't "generally" thrash the cache and perform better. Nonetheless, I hope you do follow the procedure of building a prototype (with *human efficiency* in mind, as opposed to *CPU efficiency*) before you optimise it (per platform, or better yet per hardware configuration), and I hope you use a profiler to validate your assumptions regarding optimisation. – autistic Oct 13 '18 at 21:26
  • @autistic: for the umpteenth time, I didn't unroll; I gave the compiler better tools to work with it if it thinks it's advantageous, based on heuristics or profile-guided optimization (and BTW, this kind of unrolling is generally advantageous, compiler writers aren't idiots). If you cannot understand it, that's your problem. This conversation is leading to nothing, I'm not going to reply anymore. – Matteo Italia Oct 14 '18 at 08:42
  • What's advantageous on one hand may be disadvantageous on another... and I've already mentioned that, so it seems like we've come full circle and it's unfortunate for both of us that StackOverflow doesn't have an "ignore" functionality... huh? I hardly see it as *my problem* that you're not able to show a sensible argument for polluting your code with *premature optimisations* and *micro-optimisations*. That's your nose you're cutting off... peace! – autistic Oct 15 '18 at 11:34

Assuming k is a multiple of 8, assuming that by "each element is binary" you mean "each int is either 0 or 1", and also assuming that the bits in data are packed into each byte of new_data from most significant to least significant, with the bytes of new_data filled in order (all reasonable assumptions), this is how you do it:

for (int i = 0; i < k/8; ++i)
{
    new_data[i] = (data[8*i  ] << 7) | (data[8*i+1] << 6)
                | (data[8*i+2] << 5) | (data[8*i+3] << 4)
                | (data[8*i+4] << 3) | (data[8*i+5] << 2)
                | (data[8*i+6] << 1) | data[8*i+7];
}
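
As a quick sanity check of the bit order, a minimal, self-contained example; the sample bits and k = 8 are made up for illustration:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    enum { k = 8 };                            /* made-up size, a multiple of 8 */
    int data[k] = {0, 1, 1, 0, 0, 1, 0, 1};    /* most significant bit first */
    uint8_t new_data[k/8];

    for (int i = 0; i < k/8; ++i)
    {
        new_data[i] = (data[8*i  ] << 7) | (data[8*i+1] << 6)
                    | (data[8*i+2] << 5) | (data[8*i+3] << 4)
                    | (data[8*i+4] << 3) | (data[8*i+5] << 2)
                    | (data[8*i+6] << 1) | data[8*i+7];
    }

    printf("%02x\n", new_data[0]);             /* prints 65 */
    return 0;
}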
autistic
Cyber
  • There are a lot of assumptions built into this answer. In the future, please ask for clarification, as it's possible that this might not even answer the question that was intended. – autistic Oct 13 '18 at 13:50