1

I have a function that takes as its input a string containing a hyperlink and is attempting to output that same hyperlink except that if it contains a question mark, that character and any characters that follow it are purged.

First, I open a text file and read in a line containing a link and only a link like so:

FILE * ifp = fopen(raw_links,"r");
char link_to_filter[200];

if(ifp == NULL)
{
    printf("Could not open %s for writing\n", raw_links);
    exit(0);
}

while(fscanf(ifp,"%s", link_to_filter) == 1)
{
    add_link(a,link_to_filter, link_under_wget);
}; 

fclose(ifp);

Part of what add_link does is strip the unnecessary parts of the link after a question mark (like with xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/online-giving/step1.php?new=1) which are causing an issue with my calls to wget. It does this by feeding link_to_filter through this function remove_extra, seen below.

char * remove_extra(char * url)
{
    char * test = url;
    int total_size;

    test = strchr(url,'?');

    if (test != NULL)
    {
        total_size = test-url;
        url[total_size] = '\0';
    }

    return url;
}

at the end of remove_extra, upon returning from remove_extra and immediately prior to using strcpy a call to printf like so

printf("%s",url);

will print out what I expect to see (e.g. xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/online-giving/step1.php without the '?' or trailing characters), but immediately after this block of code runs

struct node * temp = (struct node *)malloc(sizeof(struct node));
char * new_link = remove_extra(url);
temp->hyperlink =(char *)malloc(strlen(new_link) * sizeof(char));
strncpy(temp->hyperlink, new_link, strlen(new_link));

the result of a printf on member hyperlink occasionally has a single, junk character at the end (sometimes 'A' or 'Q' or '!', but always the same character corresponding to the same string). If this were happening with every link or with specific types of links, I could figure something out, but it's only for maybe every 20th link and it happens to links both short and long.

e.g. xxxxxxxxxxxxxxxxxxxx/hr/ --> xxxxxxxxxxxxxxxxxxxx/hr/!

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/windows-to-the-past/ --> xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/windows-to-the-past/Q

This is happening both with strcpy and homemade string copy loops so I'm inclined to believe that it's not strcpy's fault, but I can't think of where else the error would lie.

cancub
  • 23
  • 2
  • 5
  • Besides the unsafe string usage in remove_extra you should also use a safer form of fscanf with %as (and a string pointer address) or %199s. – eckes Mar 24 '15 at 21:29
  • Just for my own edification, what is unsafe about the string usage in remove_extra? – cancub Mar 25 '15 at 19:23
  • Well yes its not remove_extra, its the caller. It will search behind the allocated 200 bytes of the argument if there was no terminating \0 until then. – eckes Mar 25 '15 at 22:00

2 Answers2

5

If you compute len as the length of src, then the naive use of strncpy -- tempting though it might be -- is incorrect:

 size_t len = strlen(src);
 dest = malloc(len);       /* DON'T DO THIS */
 strncpy(dest, src, len);  /* DON'T DO THIS, EITHER */

strncpy copies exactly len bytes; it does not guarantee to put a NUL byte at the end. Since len is precisely the length of src, there is no NUL byte within the first len bytes of src, and no NUL byte will be inserted by strncpy.

If you had used strcpy instead (the supposedly "unsafe" interface):

strcpy(dest, src);

it would have been fine, except for the fact that dest is not big enough. What you really need to do is this:

dest = malloc(strlen(src) + 1);  /* Note: include space for the NUL */
strcpy(dest, src);

or, if you have the useful strdup function:

dest = strdup(src);
rici
  • 234,347
  • 28
  • 237
  • 341
  • That worked like a charm. And here I thought I was safe with the addition of the NUL byte in remove_extra. Newbie hubris, I suppose. In any case, thanks very much for the reponse. – cancub Mar 24 '15 at 22:08
2

Mostly likely You forgot to copy a null terminator at the end of string, remember that strlen(str) gives the number of visible characters, You also need '\0' at the end. You need to do

temp->hyperlink =(char *)malloc(strlen(new_link)+1);//char is one byte, sizeof(char)=1

and strcpy(temp->hyperlink, new_link); Should work fine.

Why not use strdup?

riodoro1
  • 1,246
  • 7
  • 14