The program is written in C++ and indexes web pages, so the domains are random domain names from the web. The strange part is that the DNS fail/not-found percentage is small (<5%).

Here is the pmp (poor man's profiler) stack trace:

   3886 __GI___poll,send_dg,buf=0xADDRESS,__libc_res_nquery,__libc_res_nquerydomain,__libc_res_nsearch,_nss_dns_gethostbyname3_r,gaih_inet,__GI_getaddrinfo,Curl_getaddrinfo_ex
    601 __GI___poll,Curl_socket_check,waitconnect,singleipconnect,Curl_connecthost,ConnectPlease,protocol_done=protocol_done@entry=0xADDRESS),Curl_connect,connect_host,at
    534 __GI___poll,Curl_socket_check,Transfer,at,getweb,athread,start_thread,clone,??
    498 nanosleep,__sleep,athread,start_thread,clone,??
     50 __GI___poll,Curl_socket_check,Transfer,at,getweb,getweb,athread,start_thread,clone,??
     15 __GI___poll,Curl_socket_check,Transfer,at,getweb,getweb,getweb,athread,start_thread,clone
      7 nanosleep,usleep,main

Why are there so many threads at _nss_dns_gethostbyname3_r? What could I do to speed it up?

Could it be because I'm using curl's default synchronous DNS resolver with CURLOPT_NOSIGNAL?

The program is running on an Intel i7 (8 cores HT), 16 GB RAM, Ubuntu 12.10.

The bandwidth varies between 6 MB/s (the ISP limit) and 2 MB/s at irregular intervals, and it sometimes even drops to a few hundred KB/s.

Stefan Rogin

2 Answers

The threads you are seeing are probably waiting for DNS answers. One way to speed that up would be to do the lookups beforehand, so that the answers get cached in your neighbor recursive DNS server. Also make sure nobody is asking for authoritative answers; those are always slow.
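
As a rough illustration of the "look up beforehand" idea, here is a minimal sketch that resolves a batch of hostnames once, IPv4 only, before the crawler threads need them. The helper name `prewarm_dns` is hypothetical and not part of curl or glibc; it assumes a POSIX system with `getaddrinfo`, and it only helps if the recursive resolver actually caches the answers for long enough. The calls are blocking, so it would normally run in a separate thread or process ahead of the fetchers.

```cpp
#include <netdb.h>
#include <sys/socket.h>
#include <sys/types.h>

#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper (an assumption, not taken from the question's code):
// resolve each host once so the recursive resolver can cache the answer.
// Returns the number of hosts that resolved successfully.
std::size_t prewarm_dns(const std::vector<std::string>& hosts) {
    addrinfo hints;
    std::memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;        // ask for IPv4 addresses only
    hints.ai_socktype = SOCK_STREAM;  // one result per address is enough

    std::size_t resolved = 0;
    for (const std::string& host : hosts) {
        addrinfo* result = nullptr;
        if (getaddrinfo(host.c_str(), nullptr, &hints, &result) == 0) {
            ++resolved;
            freeaddrinfo(result);     // only the cache side effect is wanted
        }
    }
    return resolved;
}
```

The return value is only a success count; the addresses themselves are discarded, since the point is to populate the resolver's cache rather than to keep the results.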

vonbrand
  • All settings are default; I've just added some DNS servers and set them to rotate. I've tested the system at up to 2000 DNS queries/s and all is good, but it seems that curl might be handling DNS differently. How can I tell if curl is asking for authoritative answers? Also, could I tell it not to ask for IPv6 (AAAA) records, as my network doesn't support IPv6 yet? – Stefan Rogin Apr 15 '13 at 16:39
  • What is a `neighbor recursive DNS server`? Also, could you expand on the lookup-beforehand method? – Stefan Rogin Apr 15 '13 at 16:41

I found that the solution was to switch curl's default DNS resolver to c-ares and to explicitly request IPv4, as IPv6 is not yet supported by my network.

Switching to c-ares also allowed me to set more DNS servers and to cycle through them in order to increase the number of DNS queries/s.

The outcome:

// Resolve names to IPv4 addresses only
curl_easy_setopt(curl, CURLOPT_IPRESOLVE, CURL_IPRESOLVE_V4);

// Cycle through the DNS servers: read and advance the shared index under the lock
pthread_mutex_lock(&running_mutex);
dns_index=DNS_SERVER_I;
DNS_SERVER_I=(DNS_SERVER_I+1)%DNS_SERVERS.size();
pthread_mutex_unlock(&running_mutex);

// Build a comma-separated list of three consecutive servers, wrapping around
string dns_servers_string=DNS_SERVERS.at(dns_index%DNS_SERVERS.size())+","+DNS_SERVERS.at((dns_index+1)%DNS_SERVERS.size())+","+DNS_SERVERS.at((dns_index+2)%DNS_SERVERS.size());

// Set curl's DNS servers (option available only when curl is built with c-ares)
curl_easy_setopt(curl, CURLOPT_DNS_SERVERS, dns_servers_string.c_str());
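
The rotation above can also be packaged as a small self-contained class, which makes it easier to test without curl. This is only a sketch under assumptions: the class name `DnsRotator` and its `next` method are hypothetical, `std::mutex` stands in for the pthread mutex, and the addresses in the usage note are documentation placeholders rather than real resolvers.

```cpp
#include <cstddef>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, not part of libcurl: hands out a sliding window of
// DNS servers as a comma-separated string, advancing by one on every call.
class DnsRotator {
public:
    explicit DnsRotator(std::vector<std::string> servers)
        : servers_(std::move(servers)), next_(0) {}

    // Returns up to `count` consecutive servers, wrapping around the list.
    std::string next(std::size_t count = 3) {
        if (servers_.empty()) return std::string();

        std::size_t start;
        {
            // Read and advance the shared index atomically.
            std::lock_guard<std::mutex> lock(mutex_);
            start = next_;
            next_ = (next_ + 1) % servers_.size();
        }

        // Never repeat a server within one list.
        if (count > servers_.size()) count = servers_.size();

        std::string out;
        for (std::size_t i = 0; i < count; ++i) {
            if (i != 0) out += ',';
            out += servers_[(start + i) % servers_.size()];
        }
        return out;
    }

private:
    std::vector<std::string> servers_;
    std::mutex mutex_;
    std::size_t next_;
};
```

A thread would then do something like `std::string list = rotator.next();` followed by `curl_easy_setopt(curl, CURLOPT_DNS_SERVERS, list.c_str());`, where `rotator` is a shared instance built from placeholder addresses such as 192.0.2.1 through 192.0.2.4.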
Stefan Rogin