The code below reads a file containing many URLs and passes each URL to "evhttp_uri_parse" to get its host and path. The problem is that evhttp_uri_parse fails to parse the URL and returns NULL. A possible reason is a stack overflow.

FILE *fp=fopen(argv[1],"rb");
if(NULL==fp)
{
    printf("open url_file is error %d::%s\n",errno,strerror(errno));
    return 0;
}
char url_buf[2048];
memset(url_buf,'\0',sizeof(url_buf));

fgets(url_buf,sizeof(url_buf),fp);
while(!feof(fp))
{
    if(strlen(url_buf)>1)
    {
        printf("url_buf::%s",url_buf);
        #if 1 
        struct evhttp_uri *ev_uri=NULL;
        ev_uri=evhttp_uri_parse(url_buf);
        if(ev_uri==NULL)
        {
            printf("parse uri  error::%d,%s\n",errno,strerror(errno));
        }
        const char *host=evhttp_uri_get_host(ev_uri);
        const char *path=evhttp_uri_get_path(ev_uri);
        printf("query host::%s,path::%s\n",host,path);
        evhttp_uri_free(ev_uri);
        #endif
    }
    memset(url_buf,'\0',sizeof(url_buf));
    fgets(url_buf,sizeof(url_buf),fp);
}
fclose(fp);
sanwuhai

2 Answers

0
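  1. fgets(url_buf,sizeof(url_buf)+1,fp) should be changed to fgets(url_buf,sizeof(url_buf),fp). The size argument must not be larger than the buffer.

  2. fgets keeps the trailing '\n' at the end of the string, so it is passed on to evhttp_uri_parse. Try removing it and see if that helps.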
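A minimal sketch of the second point, in plain C with no libevent calls. The helper name `strip_line_ending` is hypothetical and not part of any library; it assumes the buffer holds a NUL-terminated string as returned by fgets.

```c
#include <string.h>

/* Hypothetical helper: cut the string at the first '\r' or '\n' that
 * fgets() left in the buffer. Returns the length of the stripped string. */
static size_t strip_line_ending(char *line)
{
    /* strcspn() returns the index of the first character found in the
     * reject set, or the string length if none of them is present. */
    size_t len = strcspn(line, "\r\n");
    line[len] = '\0';
    return len;
}
```

Calling this on url_buf right after each fgets, and before evhttp_uri_parse, would pass the URL without its line terminator. It also handles files with Windows-style "\r\n" line endings.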

Matt
0

If a URL is, for any reason, longer than the 2048-byte buffer, fgets will not return the complete URL. It returns only the first part (at most 2047 characters), followed by a terminating null character.

That is also why passing sizeof(url_buf)+1 is a bad idea: it leads to undefined behavior, because fgets may then write to a location that is out of bounds for the url_buf array.

So check whether the string you got ends with a newline character and, if it does, replace the newline with a null character. If there is no newline in the string, you may want to keep reading until you reach one in order to get the complete URL.

This applies only if your URLs are delimited by newlines.
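The check described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the helper name `read_url_line` and its return codes are hypothetical, it assumes one URL per line, and it discards an overlong line instead of growing the buffer.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line into buf (capacity `size` bytes).
 * Returns  1 if a complete line was read (line terminator stripped),
 *          0 if nothing could be read (end of file or read error),
 *         -1 if the line did not fit; the rest of that line is discarded. */
static int read_url_line(FILE *fp, char *buf, size_t size)
{
    if (fgets(buf, (int)size, fp) == NULL)
        return 0;

    size_t len = strlen(buf);
    if (len == 0 || buf[len - 1] != '\n') {
        /* No '\n' in the buffer: either the input ended without a final
         * newline, the line filled the buffer exactly, or it is too long. */
        int c = fgetc(fp);
        if (c != EOF && c != '\n') {
            /* Too long: skip the remainder so the next call starts
             * at the beginning of the following line. */
            while (c != '\n' && c != EOF)
                c = fgetc(fp);
            return -1;
        }
    }

    /* Strip "\n" or "\r\n" so only the URL itself remains. */
    buf[strcspn(buf, "\r\n")] = '\0';
    return 1;
}
```

A caller would loop while the result is non-zero, hand the buffer to evhttp_uri_parse only when the result is 1, and report or skip the line when it is -1.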

Sridhar Nagarajan
  • Thanks for your advice, but that is not the main cause. – sanwuhai Feb 12 '15 at 01:00
  • @sanwuhai You should post the error and the output you are getting, so we can see what problem you are facing. We are posting solutions on the assumption that the function `evhttp_uri_parse(url_buf)` is working fine and that the only way it can go wrong is because of wrong input. – Sridhar Nagarajan Feb 12 '15 at 08:05