
INSIDE THE CUDA KERNEL

Suppose I have a byte that can have a binary value between 0 and 255.

I have a character array (char *) length three:

char * c = (char *) malloc(300000000*sizeof(char)); // 300,000,000 bytes, roughly 300 MB

Short of the following (as in, I would like to rule out “solutions” that involve a manual byte to char representation):

switch(my_byte){
    case 0:
       c[0] = '0';
       break;
    case 1:
       c[0] = '1';
       break;
    ...
    case 255:
       c[0] = '2';
       c[1] = '5';
       c[2] = '5';
       break;
}

How do I convert the byte to a char * style string in a CUDA kernel?

Chris
  • Use `std::to_string` and then copy the string into the buffer? – NathanOliver Sep 28 '18 at 21:56
  • @NathanOliver I'm not sure this would work in the cuda kernel... Is there an NVCC function `to_string`? All `std::` operations are not valid on the kernel. It is basically C with namespaces and structs with functions, but not much else. – Chris Sep 28 '18 at 21:57
  • Is my_byte a byte array or just a single byte? – Yucel_K Sep 28 '18 at 22:11
  • @Yucel_K It is a single byte in an array of bytes. Just byte -> c style string number for each of a set of bytes in cuda device code. I will do the giant switch statement, would be ten min in vi but there must be a better way... – Chris Sep 28 '18 at 22:18
  • https://stackoverflow.com/q/36878640/681865 – talonmies Sep 29 '18 at 05:12
  • Whoever is downvoting an answer, please leave a reason for the downvote so I can understand what was done wrong. – Yucel_K Sep 29 '18 at 06:17

2 Answers

1

This is my solution for now, written to avoid divergent control flow in the kernel.

#include <stdint.h>

/*! \brief byte to raw chars; this is not a string (no terminating '\0')! */
__device__ void byte_to_chars(uint8_t b, char * str_arr_ptr){
  uint8_t buf[4];

  buf[0] = b / 100;        // hundreds digit
  buf[1] = (b / 10) % 10;  // tens digit
  buf[2] = b % 10;         // ones digit

  // size: 3, minus one for each leading zero digit
  buf[3] = 3 - !buf[0] - !buf[0]*!buf[1];

  // buf[3] = sz
  // 3 - buf[3] = missing digits; i.e., 1 for 023, 2 for 003
  for(int i = 0; i < buf[3]; i++) str_arr_ptr[i] = buf[ i + 3 - buf[3] ]+'0';

  // modify function signature as needed -- i.e., return
  // useful info
}

However, a solution based on library calls would be best.

Chris
  • Most of the C++ (or even C) standard library functionality is not available, at the moment, in CUDA code; so you're unlikely to get the solution you're interested in. Also, you're using magic numbers a lot; and this solution will be difficult to adapt to larger numeric types. – einpoklum Sep 30 '18 at 16:21
  • @einpoklum yes, that is established. however, there remained the question of whether there existed good, maintained solutions for converting bytes to strings *edit:* list of chars. A simple example for which you would want a maintained solution is cross machine communication, which may be arbitrarily different in terms of endianness, nibble order, etc., and for which maintained code exists on the host side -- and which I happen to be dealing with. – Chris Sep 30 '18 at 16:23
  • @einpoklum In the meantime, this solution is far more serviceable than the non-working and more complex code you have posted below, after a pedantic lecture on out-of-scope material. – Chris Sep 30 '18 at 16:25
-1

First, don't use malloc() for a small, fixed amount of space; use an array. Second, don't switch, and in general, in kernel code, try to avoid diverging control paths. Finally, if it's supposed to be a C-style string, it needs to end with '\0'.

So consider something like:

#include <cstdint>

enum { max_digits = 3, modulus = 10 };
struct stringized_byte_t {
    char buffer[max_digits + 1];  // up to three digits plus '\0'
};

// __host__ __device__ makes the function callable from kernel code as well
__host__ __device__ stringized_byte_t stringize_a_byte(uint8_t my_byte)
{
    uint8_t digits[max_digits];
    uint8_t num_digits = 1;

    // peel off digits, least significant first
    uint8_t remainder = my_byte;
    while(remainder >= modulus) {
       uint8_t quotient = remainder / modulus;
       digits[num_digits - 1] = remainder - quotient * modulus;
       num_digits++;
       remainder = quotient;
    }

    // at this point we have one digit left (although it might be 0),
    // and we know the overall number of digits, so:

    digits[num_digits - 1] = remainder;

    // Now we need to flip the digit direction to fit the printing order,
    // and terminate the string

    stringized_byte_t sb;
    for(int i = 0; i < num_digits; i++) {
       sb.buffer[i] = '0' + digits[num_digits - i - 1];
    }
    sb.buffer[num_digits] = '\0';
    return sb;
}

Note I used C-style coding rather than "pimping up" the class, so you can very easily convert this code into proper C.

einpoklum
  • The pointer is correct for this use case; I will not be using an array. Why are you starting at position 1? Use “digit index” and start at zero for consistency. – Chris Sep 30 '18 at 10:33
  • Either way, thanks for the heads up with respect to the flow control. – Chris Sep 30 '18 at 10:36
  • Also, you’ve got an infinite while loop — your first while. – Chris Sep 30 '18 at 11:35
  • @bordeo: You're right about the infinite loop, fixed that. About position 1 - That's because if `my_byte` is 0, you want to say it has 1 digit - for stringification, anyway. – einpoklum Sep 30 '18 at 16:16
  • @bordeo: Also, you state using a pointer "is correct" for your use-case, but you've not said why that is. Your self-answer suggests memory is allocated elsewhere and you need to print into it - but your question didn't say that. – einpoklum Sep 30 '18 at 16:18
  • Actually, the question was looking for a library solution rather than a "roll your own" byte to string representation (which I state in the question), so everything related to strings was out of scope. In fact the entire answer was out of scope. – Chris Sep 30 '18 at 16:19