2

In main, I pass a string to a different function that is supposed to separate the string, then work with each substring. In this case, I need to take a 30 character string and separate it into substrings of length 7, 5, 5, 7, and 6 to be manipulated later. This is what I started trying:

void breakString(const char *lineStr) {
        char a[7] = " "; //I tried with them all initialized empty and without doing so.
        char b[5];       //Didn't seem to make a difference.
        char c[5];
        char d[7];
        char e[6];

        //sscanf(lineStr, "%7s", &a);     //tried sscanf at first, but didn't know how to 
        strncpy(a, lineStr, 7);           //scan the middle so i switched to strncpy
        strncpy(b, lineStr + 7, 5);
        //continue this pattern for c,d,e

        (rest of function here, where each substring is manipulated accordingly.)

I tested the first bit by printing substrings a and b (and also by strcmp() them to the correct output), but it doesn't fully work. I keep getting extra gibberish. For example, if the full string passed is "abcdefghijklmnopqrstuvwxyz1234", then a should be "abcdefg", b should be "hijkl", and so on. However, when I print a, it comes out as "abcdefg^#@%^&" with some random assortment of characters following each substring.

What am I doing wrong? Or are there better ways to implement this differently?

J...S
  • 5,079
  • 1
  • 20
  • 35
Reol
  • 21
  • 1
  • 1
    strncpy does not null terminate, so what you are seeing is overflow. See https://randomascii.wordpress.com/2013/04/03/stop-using-strncpy-already/ for explanation – FJT May 01 '18 at 06:32
  • Strings in C have a zero byte at the end. If you have an array like `a[7]`, and you `strncpy(a, lineStr, 7)`, what you get is an array of characters. It is not a string because it doesn't have the zero byte at the end. So you can't pass it to string functions like `strcmp`, and you can't print it with a simple `%s` format specifier. You could print it with `%.7s` – user3386109 May 01 '18 at 06:34
  • 1
    `strncpy` is evil. Always use `strlcpy`. If it's not available on your system, find the source yourself and make it available. `strncpy` is not and never was intended for copying strings, contrary to what its name implies. – Roflcopter4 May 01 '18 at 06:43

3 Answers3

0

1) sscanf()

With sscanf(), you could do

sscanf(lineStr, "%7c%5c%5c%7c%6c", a, b, c, d, e);
a[7]=b[5]=c[5]=d[7]=e[6]='\0';

%c can be used to read more than 1 byte. %7c will read upto 7 bytes. But \0 won't be automatically added.

Thanks goes to chqrlie for this method.

or just

sscanf(lineStr, "%7s%5s%5s%7s%6s", a, b, c, d, e);

if lineStr doesn't have white spaces in it.

Or maybe

sscanf(lineStr, "%7[^\n]%5[^\n]%5[^\n]%7[^\n]%6[^\n]", a, b, c, d, e);

if lineStr doesn't have \n characters.

where the numbers in the format string denote the width of the substrings to be copied.

In this way, you needn't \0 terminate the strings manually. sscanf() will take care of it.


2) strncpy()

If you must use `strncpy(), you are on the right track. You could do

void breakString(const char *lineStr) {
    char a[8];
    char b[6];      
    char c[6];
    char d[8];
    char e[7];

    strncpy(a, lineStr, 7);
    a[7]='\0';
    lineStr+=7;

    strncpy(b, lineStr, 5);
    b[5]='\0';
    lineStr+=5;

    strncpy(c, lineStr, 5);
    c[5]='\0';
    lineStr+=5;

    strncpy(d, lineStr, 7);
    d[7]='\0';
    lineStr+=7;

    strncpy(e, lineStr, 6);
    e[6]='\0';
    //lineStr+=6;
}

Note that an extra one byte is needed to store the \0 character for the strings. So the sizes of the arrays are changed accordingly.

Community
  • 1
  • 1
J...S
  • 5,079
  • 1
  • 20
  • 35
  • The `sscanf` format is incorrect. The string could contain white space that would cause unexpected behavior. You should use `sscanf(lineStr, "%7c%5c%5c%7c%6c", a, b, c, d, e);` – chqrlie May 01 '18 at 06:57
  • Also the safest way to fix code with `strncpy` is to remove all such calls. – chqrlie May 01 '18 at 07:07
  • @chqrlie Why did you suggest `%c` in the `sscanf()`? `a` and `b` must have strings, right? Also, I tried it and got unexpected output that looks like garbage. – J...S May 01 '18 at 10:02
  • `sscanf(s, "%7c", a);` will store up to 7 bytes from `s` into `b`, which must point to a sufficiently large array of `char`. A feature so seldom used, I have never seen it in action, despite having implemented `sscanf()` myself a few different times! The gotcha is `sscanf()` will not store a null terminator so the destination array must be initialized already. The default width of `1` is more commonly used, and a pointer to a single `char` is appropriate for that. – chqrlie May 01 '18 at 10:35
  • @chqrlie Whoa! I didn't know that. It does work! Just need to append a `0` at the end of the `char` array. Thanks for telling this. This is a rare piece of knowledge. – J...S May 02 '18 at 06:56
  • 1
    `scanf` is such a piece of work, so many quirks and shortcomings and so widely misused. – chqrlie May 02 '18 at 15:11
0

I keep getting extra gibberish...

This is because strncpy() does not append null-character implicitly at the end of destination if the source is longer than the size passed. A string, in C language, is a null-terminated array of character.

Hence, after this:

strncpy(a, lineStr, 7);

if the source is longer than the size passed then you need to add the null-character at the end, like this:

a[7] = '\0';

The buffer size should be +1 to accommodate the null-character at the end of the buffer:

char a[8];
char b[6];      
char c[6];
char d[8];
char e[7];

You should try to avoid using strncpy() because you need to manually take care of appending null-character. Instead, use something which guarantees to null-terminate the destination always, like snprintf(). You can do:

char a[8];
snprintf(a, 8, "%s", lineStr);

You don't need to append terminating null-character, it's automatically appended after the content written. Read more about snprintf() here.


Additional:

The way you are trying to initialize the empty array is not correct:

char a[7] = " "; 

This is not the empty array but this will actually initialize the first element of array (a[0]) with the space character and rest of the elements will be initialized with 0. To initialize the empty array, you can do:

char a[8] = {0};

This will initialize all the elements of the array with 0.

H.S.
  • 11,654
  • 2
  • 15
  • 32
0

Your problem can be solved with strncpy, but you should never use this function as its precise semantics are widely misunderstood and exceedingly error prone.

Read https://randomascii.wordpress.com/2013/04/03/stop-using-strncpy-already/ for information.

Furthermore, you should make the arrays one byte longer than the number of characters you plan to store into them for the null terminator.

Here is simple solution for your case:

#include <stdio.h>

void breakString(const char *lineStr) {
    char a[7+1] = ""; /* destination strings must be initialized */
    char b[5+1] = ""; /* because the %c conversion specifier */
    char c[5+1] = ""; /* will set a null terminator. */
    char d[7+1] = "";
    char e[6+1] = "";

    if (strlen(lineStr) >= 7+5+5+7+6 &&
        sscanf(lineStr, "%7c%5c%5c%7c%6c", a, b, c, d, e) == 5) {
        /* string was long enough, fields correctly initialized */
        printf("a: %s\nb: %s\nc: %s\nd: %s\ne: %s\n", a, b, c, d, e);
    }
}

int main() {
    breakString("abcdefghijklmnopqrstuvwxyz0123456789");
    return 0;
}

Output:

a: abcdefg
b: hijkl
c: mnopq
d: rstuvwx
e: yz0123

While this solution is simple and concise, I would advise you to take a different approach, with a utility function. Indeed the sscanf solution uses a very unusual set of conversion specifiers that will make most programmers raise eyebrows and reject it. Furthermore, it does not lend itself to extracting variable numbers of characters into appropriately sized subarrays.

Here is a different approach:

#include <stdio.h>

size_t getchunk(char *dest, size_t n, const char *str) {
    size_t i;
    for (i = 0; i < n && *str; i++) {
        dest[i] = *str++;
    }
    dest[i] = '\0';
    return i;
}

void breakString(const char *lineStr) {
    char a[7+1];
    char b[5+1];
    char c[5+1];
    char d[7+1];
    char e[6+1];
    size_t pos = 0;

    pos += getchunk(a, 7, lineStr + pos);
    pos += getchunk(b, 5, lineStr + pos);
    pos += getchunk(c, 5, lineStr + pos);
    pos += getchunk(d, 7, lineStr + pos);
    pos += getchunk(e, 6, lineStr + pos);

    if (e[0] != '\0') {
        /* string was long enough, fields correctly initialized */
        printf("a: %s\nb: %s\nc: %s\nd: %s\ne: %s\n", a, b, c, d, e);
    }
}

int main() {
    breakString("abcdefghijklmnopqrstuvwxyz0123456789");
    return 0;
}
chqrlie
  • 131,814
  • 10
  • 121
  • 189