I am using the SWI-Prolog `library(http/http_open)`. According to the docs, "After [`http_open(Url, Stream, [])`] succeeds the data can be read from Stream." Thus, I thought maybe I could rig up a simple, declarative predicate to parse phrases from URLs by using `phrase_from_stream/2` in `library(pure_input)`:

    :- use_module(library(http/http_open)).
    :- use_module(library(pure_input)).

    phrase_from_url(Url, Phrase) :-
        http_open(Url, In, []),
        phrase_from_stream(Phrase, In),
        close(In).

But I suspect there is some nuance to the kinds of stream provided by `http_open/3`; I receive the following error:

    ERROR: set_stream_position/2: stream `<stream>(0x7feebbf5c810)' does not exist (Device not configured)

(I have tested the same URL against the example provided in the `library(http/http_open)` docs, which uses `copy_stream_data/2` to pipe the output to `user_output`, and it works. So I know the URL is not at fault.)
I have learned that I can download the data from the URL into a string, code list, or text file, and then use `phrase/n`, or one of its cousins, on that. But I'm hoping someone can help inform me about...
- ...an elegant/standard solution to parsing data from a url with DCGs
- ...maybe some insight into why we cannot use `phrase_from_stream/2` on some streams, as one might naively hope.
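
For concreteness, the download-then-parse workaround mentioned above might look roughly like the following. This is only a sketch, not a claim about how the libraries are meant to be used: the predicate names `phrase_from_stream_eager/2` and `phrase_from_url_eager/2` are made up for illustration, and it assumes the whole response body fits comfortably in memory.

```prolog
:- use_module(library(http/http_open)).
:- use_module(library(readutil)).

% Hypothetical names, introduced only for this sketch.
:- meta_predicate
    phrase_from_stream_eager(//, +),
    phrase_from_url_eager(+, //).

% Read everything left on In into a code list, then parse that list.
% The stream is only read forward, so it never has to be repositioned.
phrase_from_stream_eager(Phrase, In) :-
    read_stream_to_codes(In, Codes),
    phrase(Phrase, Codes).

% Fetch Url and parse the body. once/1 commits to the first parse,
% so the stream is closed as soon as the parse succeeds, fails, or throws.
phrase_from_url_eager(Url, Phrase) :-
    setup_call_cleanup(
        http_open(Url, In, []),
        once(phrase_from_stream_eager(Phrase, In)),
        close(In)).
```

A call would then look like `?- phrase_from_url_eager('http://example.org/', my_grammar).`, where `my_grammar//0` stands for whatever DCG you want to run. The obvious cost is that nothing is parsed lazily, which is exactly what I was hoping `phrase_from_stream/2` would give me.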
s and such, parsing the HTML followed by XPath does the job quite well. See for example here: https://github.com/StanfordOSAcademySWIProlog/contentteam/blob/master/scrape.pl The relevant code is in `scrape/3` and its helper predicates.
– Feb 18 '15 at 06:11