How do I simply open a URL and read the data from a webpage with D? (I prefer Phobos over Tango if standard library functionality is needed.)
2 Answers
4
curl is in the standard library. You can fetch a url pretty easily like this:
import std.net.curl;
// get returns char[], so .idup makes the immutable copy a string needs
string content = get("d-lang.appspot.com/testUrl2").idup;
http://dlang.org/phobos/std_net_curl.html#get
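For reference, here is a minimal sketch of the same fetch as a complete program. It assumes a Phobos version that ships std.net.curl and that libcurl is installed. The error handling and the file name fetch.d are illustrative additions, not part of the snippet above.

```d
import std.net.curl : get, CurlException;
import std.stdio : writeln, stderr;

int main()
{
    try
    {
        // get returns char[]; .idup copies it into an immutable string
        string content = get("d-lang.appspot.com/testUrl2").idup;
        writeln(content);
        return 0;
    }
    catch (CurlException e)
    {
        // raised by std.net.curl when the transfer itself fails
        stderr.writeln("fetch failed: ", e.msg);
        return 1;
    }
}
```

On some Linux setups the libraries have to be named explicitly at link time, as noted in the comments below: dmd fetch.d -L-lphobos2 -L-lcurl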
If you need to parse HTML, I wrote a DOM library that is pretty good at it: https://github.com/adamdruppe/misc-stuff-including-D-programming-language-web-stuff
Grab dom.d and characterencodings.d, then you can:
import arsd.dom;
import std.stdio; // for writeln
auto document = new Document();
document.parseGarbage(content); // content is from above, the html string
writeln(document.title); // the <title> contents
auto paragraph = document.querySelector("p");
if(paragraph is null)
    writeln("no paragraphs in this document");
else
    writeln("the first paragraph is: ", paragraph.innerText);
And so on. If you've used the JavaScript DOM API, this is pretty similar (though expanded in a lot of ways too).

Adam D. Ruppe
-
Unfortunately, for the pure "std.net.curl" code example above, I run into the problem with the linker order generated by dmd, for linux, discussed here: http://forum.dlang.org/thread/mailman.1605.1334108859.4860.digitalmars-d@puremagic.com?page=2 and here: http://forum.dlang.org/thread/cwgxdvkvsnbwvbgrdivp@forum.dlang.org ... but obviously not yet fixed in 2.0.61 :( – Samuel Lampa Jan 01 '13 at 17:48
-
Ok, so I got around this by compiling with: "dmd [filename] -L-lphobos2 -L-lcurl". (Oh, and well, I needed to add a cast as well, that was not mentioned in the docs: "string s = cast(string) get("[url]");" ... since get returns a char[] rather than a string.) – Samuel Lampa Jan 01 '13 at 18:00
-
string s = get("[url]").idup;? – 0b1100110 Jan 01 '13 at 22:08
3
I think the std.net.curl bindings are your best bet, specifically the get/post functions (an example is in the docs): http://dlang.org/phobos/std_net_curl.html#get
After all, curl is designed specifically for this kind of task, and the bindings are part of Phobos.

Mihails Strasuns
-
Ah, thx! (Though, unfortunately not part of the phobos version in Ubuntu/LinuxMint repos yet :/ ) – Samuel Lampa Jan 01 '13 at 15:52
-
If the Phobos one isn't available to you, in my GitHub (link in the other answer here) you can grab my curl.d and use "string content = curl("http://example.com/foo.html");" If you don't have libcurl at all, I also have an http.d in there with a simple get() function. – Adam D. Ruppe Jan 01 '13 at 16:10
-
For some reason, I get "Error: function expected before (), not module curl of type void", with my little program: "import curl; string s = curl("example.org");", and placing the curl.d in the same folder as my other .d files ... Any hints? – Samuel Lampa Jan 01 '13 at 16:52
-
use "import arsd.curl;" rather than just import curl;. Also when compiling, put all the files on the dmd command line: "dmd yourfile.d curl.d" to avoid linker errors. – Adam D. Ruppe Jan 01 '13 at 17:06
-
Samuel, ah, then you are most likely using gdc from the base repos, the 4.4 one? It must ship with a rather old version of Phobos, and with the current bug-fixing tempo for D/Phobos I'd really recommend getting the latest releases straight from the devs unless there are specific requirements. – Mihails Strasuns Jan 01 '13 at 18:15