
Note: This is a homework project. I will try to write the remaining code myself, but I cannot figure out why this program is unable to connect to an input URL.

I was given skeleton code that I modified a bit to receive an input URL. Expected usage could be: ./a.out http://google.com

For whatever reason it never succeeds in connecting. The error message "could not connect" is always printed. Later I will need to fetch a file from the URL and save it to the local directory, but I will try to figure out how to do that myself (my guess is that it has to do with recv() in the code below). In the case of "http://google.com" I would be expected to fetch "index.html".

The skeleton code uses connect(), but the example in the man page for getaddrinfo() uses bind(), which seems to be much faster but also does not work. With connect() the program never seems to leave the for loop (Edit: it never leaves because it seems to be stuck trying to connect):

#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char** argv) {
        // Alex: Input usage (expecting one URL)
        if (argc != 2) {
                printf("Usage: ./hw1 URL\n");
                exit(1);
        }

        // Alex: Set noHttp as argv[1] and remove "http://"
        char* noHttp = argv[1];
        char http[] = "http://";
        if (strlen(noHttp) > 7 && !strncmp(noHttp, http, 7)) noHttp += 7;
        else {
                printf("Invalid URL, expecting http://host/path\n");
                exit(1);
        }
        printf("%s\n", noHttp);

        struct addrinfo hints;
        struct addrinfo* result, * rp;
        int sock_fd, s;

        // Alex: I moved assigning hints.ai_socktype after memset()
        memset(&hints, 0, sizeof(struct addrinfo));
        //hints.ai_socktype = SOCK_STREAM;

        s = getaddrinfo(noHttp, "8080", &hints, &result); // To Stack Overflow: This changed to "80", I am leaving it here because there are comments about it
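        if (0 != s) {
                // getaddrinfo() does not set errno; translate its return value with gai_strerror()
                fprintf(stderr, "Error populating address structure: %s\n", gai_strerror(s));
                exit(1);
        }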

        int i = 0;
        for (rp = result; rp != NULL; rp = rp->ai_next) {
                printf("i = %d\n", i);
                i++;

                //printf("rp->ai_flags = %d\n", rp->ai_flags);
                printf("rp->ai_family = %d\n", rp->ai_family);
                printf("rp->ai_socktype = %d\n", rp->ai_socktype);
                printf("rp->ai_protocol = %d\n", rp->ai_protocol);

                sock_fd = socket(rp->ai_family, rp->ai_socktype, rp->ai_protocol);
                printf("sock_fd = %d\n", sock_fd);
                if (sock_fd == -1) continue;

                // Success
                if (connect(sock_fd, rp->ai_addr, rp->ai_addrlen) != -1) break;
                close(sock_fd);
        }

        if (rp == NULL) {
                fprintf(stderr, "could not connect\n");
                exit(1);
        }

        freeaddrinfo(result);

        char buf[255];
        memset(&buf, 0, sizeof(buf));

        // Leave room for the terminating '\0' so printf("%s") below stays in bounds
        int recv_count = recv(sock_fd, buf, sizeof(buf) - 1, 0);
        if (recv_count < 0) {
                perror("Receive failed");
                exit(1);
        }

        printf("%s",buf);
        shutdown(sock_fd, SHUT_RDWR);
        return 0;
}

Edit: I replaced "8080" with "80" as Uku Loskit recommended.
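
One more thing to watch in the code above: after "http://" is stripped, the whole remainder (for example "google.com/index.html") is passed to getaddrinfo(), which expects only a host name. Below is a minimal sketch of separating the host from the path. `split_url` is a hypothetical helper, not part of the skeleton code, and it assumes the URL carries no explicit port (no "host:8080" form).

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: split "http://host/path" into host and path.
 * Returns 0 on success, -1 if the URL does not start with "http://",
 * the host is empty, or one of the output buffers is too small. */
int split_url(const char *url, char *host, size_t host_len,
              char *path, size_t path_len)
{
    static const char prefix[] = "http://";
    const size_t prefix_len = sizeof(prefix) - 1;

    assert(url != NULL && host != NULL && path != NULL);

    if (strncmp(url, prefix, prefix_len) != 0)
        return -1;
    url += prefix_len;

    /* The host ends at the first '/', or at the end of the string. */
    const char *slash = strchr(url, '/');
    size_t n = slash ? (size_t)(slash - url) : strlen(url);
    if (n == 0 || n >= host_len)
        return -1;
    memcpy(host, url, n);
    host[n] = '\0';

    /* No path at all means the root document "/". */
    const char *p = slash ? slash : "/";
    if (strlen(p) >= path_len)
        return -1;
    strcpy(path, p);
    return 0;
}
```

The host part would then go to getaddrinfo() and the path part into the request line that is sent to the server.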

asimes
  • Works fine for me when connecting to a local service. When any of your system calls returns an error, you should access `errno` and print out an error message, otherwise you'll never know where it's going wrong. – Crowman Sep 01 '13 at 20:51
  • Why do you expect to find port 8080 open? The default for HTTP is 80. – tripleee Sep 01 '13 at 20:51
  • I don't know, I am brand new to this. I tried to look up differences between 8080 and 80 online but did not understand. It had `"::1` and `"8080"` originally – asimes Sep 01 '13 at 20:53
  • `::1` is just the IPv6 loopback address, the equivalent of `localhost`. It doesn't matter what it said "originally"; if you're trying to connect to a service that you don't know is available, then no wonder it isn't working. – Crowman Sep 01 '13 at 20:54
  • Thank you, I changed this but it still does not work – asimes Sep 01 '13 at 20:59
  • @asimes: You say `bind()` "fails after many attempts in the for loop", but `bind()` isn't anywhere in your code, and shouldn't be, since `bind()` is for creating listening sockets. And again, have you changed your code to fetch the error from `errno` and `strerror()` so you can figure out where the problem is? Should be the very first thing you do, just looking at it and wondering why it's not working is not usually a productive approach to debugging. – Crowman Sep 01 '13 at 21:01
  • Ok, then I won't use `bind()`. I just saw it in the man page with the same arguments and saw it was faster than connect, I did not realize it served a different purpose – asimes Sep 01 '13 at 21:03
  • When I change your code to port 80, and run it on `http://google.com`, it works just fine, but hangs on the `recv()` call as you'd expect. Are you sure that it's stalled in the loop, and not on the `recv()` call? – Crowman Sep 01 '13 at 21:05
  • asimes, you need to send a GET request like in Sp's post before expecting to receive anything from the server. – Uku Loskit Sep 01 '13 at 21:07
  • @Paul Griffiths, you are right, I did not have a print after the loop, it was getting stuck at `recv()`... my bad. – asimes Sep 01 '13 at 21:09
  • @asimes: Glad you got it figured out. I added an answer, choose whichever one best answers your question. – Crowman Sep 01 '13 at 21:14
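
The advice in the comments about printing an error message at each failing call could look roughly like the sketch below. Note that getaddrinfo() reports failure through its return value and gai_strerror(), not through errno, while socket(), connect() and recv() set errno. The names `checked_getaddrinfo` and `checked_socket` are hypothetical wrappers made up for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical wrapper: getaddrinfo() does not set errno; it returns its
 * own error code, which gai_strerror() turns into a readable message. */
int checked_getaddrinfo(const char *host, const char *port,
                        const struct addrinfo *hints, struct addrinfo **res)
{
    assert(host != NULL && port != NULL && res != NULL);
    int rc = getaddrinfo(host, port, hints, res);
    if (rc != 0)
        fprintf(stderr, "getaddrinfo(%s, %s): %s\n",
                host, port, gai_strerror(rc));
    return rc;
}

/* Hypothetical wrapper: socket() returns -1 and sets errno on failure.
 * Read errno immediately, before another call can overwrite it.
 * The same pattern applies to connect(), send() and recv(). */
int checked_socket(int family, int type, int protocol)
{
    int fd = socket(family, type, protocol);
    if (fd == -1)
        fprintf(stderr, "socket: %s\n", strerror(errno));
    return fd;
}
```

With a message printed after every call, and one more after the loop, it becomes obvious which call the program is actually stuck in.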

3 Answers

Your program looks OK to me. Run netcat listening on port 8080 and connect to that host:

$ echo "Hello" | ncat -l 8080

Running your program against it then prints:

$ gcc -Wall sample.c 
$ ./a.out http://127.0.0.1
127.0.0.1
i = 0
rp->ai_family = 2
rp->ai_socktype = 1
rp->ai_protocol = 6
sock_fd = 3
Hello
$ 

To talk to an HTTP server, you need to send an HTTP request first, otherwise recv() will block. Add a send() call right after freeaddrinfo(result):

    freeaddrinfo(result);

    send(sock_fd, "GET / HTTP/1.1\n\n", 16, 0); // HTTP request

    char buf[255];
    memset(&buf, 0, sizeof(buf));

This sends the request:

GET / HTTP/1.1

Then change the port to 80 and it should work:

$ ./a.out http://google.com
google.com
i = 0
rp->ai_family = 2
rp->ai_socktype = 1
rp->ai_protocol = 6
sock_fd = 3
HTTP/1.1 200 OK
Date: Sun, 01 Sep 2013 21:05:16 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151
$ 
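
The one-line request above is enough to get a reply from Google, but HTTP specifies CRLF line endings and HTTP/1.1 requires a Host header, so stricter servers may reject it. Here is a hedged sketch of a more complete request plus loops around send() and recv(). The helper names `build_get_request`, `send_all` and `recv_all` are hypothetical, and the sketch ignores EINTR and does not parse the response headers.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: format a GET request with CRLF line endings, a Host
 * header, and "Connection: close" so the server closes the socket once the
 * response is complete. Returns the length, or -1 if buf is too small. */
int build_get_request(char *buf, size_t buf_len,
                      const char *host, const char *path)
{
    assert(buf != NULL && host != NULL && path != NULL);
    int n = snprintf(buf, buf_len,
                     "GET %s HTTP/1.1\r\n"
                     "Host: %s\r\n"
                     "Connection: close\r\n"
                     "\r\n",
                     path, host);
    if (n < 0 || (size_t)n >= buf_len)
        return -1;
    return n;
}

/* send() may accept fewer bytes than requested, so loop until all are sent.
 * Returns 0 on success, -1 on error (errno is set by send()). */
int send_all(int fd, const char *data, size_t len)
{
    while (len > 0) {
        ssize_t sent = send(fd, data, len, 0);
        if (sent < 0)
            return -1;
        data += sent;
        len -= (size_t)sent;
    }
    return 0;
}

/* Read until the peer closes the connection, copying everything to out.
 * Returns the total number of bytes read, or -1 on error. */
long recv_all(int fd, FILE *out)
{
    char buf[4096];
    long total = 0;
    for (;;) {
        ssize_t n = recv(fd, buf, sizeof buf, 0);
        if (n < 0)
            return -1;
        if (n == 0)
            break; /* orderly shutdown by the peer */
        fwrite(buf, 1, (size_t)n, out);
        total += (long)n;
    }
    return total;
}
```

In the program above this would replace the single send() and the single recv(): build the request for the host and path, call send_all() on sock_fd, then recv_all() into stdout or into a file opened with fopen().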
  • Typing: `echo "Hello" | ncat -l 8080` seems to never finish. – asimes Sep 01 '13 at 21:10
  • @asimes Run `netcat` first and then connect. –  Sep 01 '13 at 21:12
  • It says, "This is nc from the netcat-openbsd package. An alternative nc is available in the netcat-traditional package. usage: nc [-46DdhklnrStUuvzC] [-i interval] [-P proxy_username] [-p source_port] [-s source_ip_address] [-T ToS] [-w timeout] [-X proxy_protocol] [-x proxy_address[:port]] [hostname] [port[s]]" – asimes Sep 01 '13 at 21:16
  • @asimes I use the `netcat`, `ncat` version from [`nmap`](http://nmap.org/), you have the classic one, `nc`. I guess the command should be `echo "Hello" | nc -l -p 8080`. –  Sep 01 '13 at 21:18
  • It just gives me the same message for `echo "Hello" | nc -l -p 8080`. I don't really understand what this is supposed to do – asimes Sep 01 '13 at 21:21
  • @asimes It means your program works. An HTTP server will never send you anything back until you send the HTTP request header first, `GET / HTTP/1.1\n\n`. –  Sep 01 '13 at 21:24

You should be connecting on port 80, not 8080. Port 80 is the default for HTTP.
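
As a rough illustration, the lookup for port 80 could be written as below. The sketch also sets `ai_socktype` to `SOCK_STREAM` (that line is commented out in the question), which restricts the results to TCP entries instead of returning datagram and raw entries as well. `resolve_tcp` and `port_of` are hypothetical helper names used only for this example.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: look up TCP addresses for host on the given port.
 * extra_flags is stored in ai_flags (pass 0 for a normal DNS lookup).
 * Returns 0 on success or a getaddrinfo() error code. */
int resolve_tcp(const char *host, const char *port, int extra_flags,
                struct addrinfo **out)
{
    assert(host != NULL && port != NULL && out != NULL);
    struct addrinfo hints;
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;     /* accept IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM; /* TCP only */
    hints.ai_flags = extra_flags;
    return getaddrinfo(host, port, &hints, out);
}

/* Port stored in one result entry, in host byte order (0 if unknown family). */
int port_of(const struct addrinfo *ai)
{
    if (ai->ai_family == AF_INET)
        return ntohs(((const struct sockaddr_in *)ai->ai_addr)->sin_port);
    if (ai->ai_family == AF_INET6)
        return ntohs(((const struct sockaddr_in6 *)ai->ai_addr)->sin6_port);
    return 0;
}
```

A call such as `resolve_tcp("google.com", "80", 0, &result)` would then feed the same connect() loop as in the question.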

Uku Loskit

When you change your port number to 80 and connect to http://google.com, the connection succeeds as expected, but the program then hangs on the recv() call because an HTTP server won't send anything to you until you ask it for something. Sp.'s answer gives you an example of how to do this by adding a send() call before your recv() call.

What's happening right now is you're connecting to it, and it's waiting for you to tell it what you want. What you are doing is just waiting for it to send you something with your recv() call, so you're both going to just wait until it times out.
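
If you would rather have recv() give up after a while than wait indefinitely, one option is a receive timeout on the socket. This is only a sketch of that idea, not something the skeleton code does; `set_recv_timeout_ms` and `recv_timed_out` are hypothetical helper names.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/* Hypothetical helper: make recv() on fd fail after roughly ms milliseconds
 * without data instead of blocking indefinitely.
 * Returns 0 on success, -1 on error (errno is set by setsockopt()). */
int set_recv_timeout_ms(int fd, long ms)
{
    assert(ms >= 0);
    struct timeval tv;
    tv.tv_sec = ms / 1000;
    tv.tv_usec = (ms % 1000) * 1000;
    return setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv);
}

/* Call right after recv() returned -1: reports whether the failure was the
 * timeout expiring (EAGAIN or EWOULDBLOCK) rather than a real error. */
int recv_timed_out(void)
{
    return errno == EAGAIN || errno == EWOULDBLOCK;
}
```

Calling `set_recv_timeout_ms(sock_fd, 5000)` before recv() would turn the silent hang into a recv() failure after about five seconds, which makes this kind of mistake much easier to spot.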

Crowman
  • Still making sense of what his answer means, ha ha. Thanks again – asimes Sep 01 '13 at 21:19
  • You just need to add his `send()` call in. You need to send a request to the server before it'll give you anything. What's happening right now is you're connecting to it, and it's waiting for you to tell it what you want. What *you* are doing is just waiting for it to send you something with your `recv()` call, so you're both going to just wait until it times out. – Crowman Sep 01 '13 at 21:21