Recently I have been trying to learn about the Semantic Web. For a project I need to retrieve data from a given DBpedia link, e.g. http://dbpedia.org/page/Berlin. But when I retrieve the data using java.net.URLConnection, I get the HTML. How can I get the XML from the same link? I know that every DBpedia page has a link to download the XML, but that is not what I want to do. Thanks in advance.
-
Why do you expect to get another format from exactly this URL? Wouldn’t it be possible for you to request a different URL (which could be automatically converted given `http://dbpedia.org/page/Berlin`)? – unor May 16 '15 at 23:59
-
Maybe I am wrong, but what if I need to get data from a predicate URI, which could be on a different site than DBpedia? Then the conversion explained in one of the answers (changing 'page' to 'data' and appending .rdf at the end) may not work. – sajid May 17 '15 at 00:27
-
@user3708999 It's not clear what you mean in your last comment. What do you mean by "need to get data from the predicate URI"? – Joshua Taylor May 19 '15 at 19:53
2 Answers
Note that the URI of the resource is actually http://dbpedia.org/resource/Berlin (with resource, not page). Ideally, you could request that URI with an Accept header of application/rdf+xml and get the RDF/XML representation of the resource. That's how the BBC publishes their data (e.g., see this answer), but DBpedia doesn't do that. Even if you request application/rdf+xml, you end up getting a redirect. You can see this if you try with an HTTP client. E.g., using Advanced Rest Client in Chrome, the response is a 303 redirect.
In a web browser, you get redirected to the page version by a 303 See Other response code. Ideally, you could request the resource URI with the Accept header set to application/rdf+xml and get the data, but DBpedia doesn't play quite so nicely.
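For reference, a minimal sketch of what "requesting the resource URI with an Accept header" looks like with java.net.HttpURLConnection. The class and method names are made up for illustration, and whether DBpedia answers such a request with RDF/XML or with a redirect to the HTML page is exactly what is in question in this thread, so treat this as a way to test the behaviour rather than a guaranteed solution.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper class; the name is not from any library.
class AcceptHeaderSketch {

    /** Prepares (but does not yet send) a GET request that asks for RDF/XML. */
    static HttpURLConnection openForRdfXml(String resourceUri) {
        try {
            // openConnection() only creates the object; no network traffic yet.
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(resourceUri).toURL().openConnection();
            conn.setRequestMethod("GET");
            conn.setRequestProperty("Accept", "application/rdf+xml");
            // Following redirects is the default; set here only to make it explicit.
            // Note: HttpURLConnection does not follow redirects that switch
            // between http and https.
            conn.setInstanceFollowRedirects(true);
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        String uri = args.length > 0 ? args[0] : "http://dbpedia.org/resource/Berlin";
        HttpURLConnection conn = openForRdfXml(uri);
        System.out.println("Accept: " + conn.getRequestProperty("Accept"));
        if (args.length > 0) {
            // The request is sent only when a URI is passed on the command line.
            System.out.println(conn.getResponseCode() + " " + conn.getContentType());
            System.out.println("Final URL: " + conn.getURL());
            conn.disconnect();
        }
    }
}
```

Running it with a URI argument prints the final status code, content type, and URL, which shows where any redirects ended up.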
So, that means that the easiest way is to note that at the bottom of http://dbpedia.org/page/Berlin, there's some text with download links.
The URL of the last link is http://dbpedia.org/data/Berlin.rdf. Thus, you can get the RDF/XML by changing page or resource to data, and appending .rdf to the end of the URL. It's not the most ReSTful solution, but it seems to be what's available.
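The rewrite described above can be sketched in Java as follows. The class name, method name, and the regular expression are assumptions made for this example; it only recognises dbpedia.org URLs and rejects anything else, since the page/resource to data convention is specific to DBpedia.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is not from any library.
class DbpediaDataUrl {

    // Matches http(s)://dbpedia.org/page/<name> and .../resource/<name>.
    private static final Pattern PAGE_OR_RESOURCE =
            Pattern.compile("^(https?://dbpedia\\.org)/(?:page|resource)/(.+)$");

    /** Replaces page or resource with data and appends .rdf. */
    static String toRdfXmlUrl(String url) {
        Matcher m = PAGE_OR_RESOURCE.matcher(url);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a DBpedia page/resource URL: " + url);
        }
        return m.group(1) + "/data/" + m.group(2) + ".rdf";
    }

    public static void main(String[] args) {
        // prints http://dbpedia.org/data/Berlin.rdf
        System.out.println(toRdfXmlUrl("http://dbpedia.org/page/Berlin"));
    }
}
```

The resulting URL can then be opened with java.net.URLConnection in the usual way.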

-
Thanks for your suggestion. But this is not the solution I am looking for; it is just a workaround. I want something that can set the HTTP header `Accept: text/html;q=0.5, application/rdf+xml` via URLConnection, so that I am automatically redirected to the RDF resource instead of the HTML. – sajid May 17 '15 at 00:10
-
@user3708999 I understand that, but there are two problems. 1. The URL would actually need to be dbpedia.org/resource/Berlin, not .../page/Berlin, since that's the actual resource. 2. While that would be the most ReSTful solution, and it's what some providers do (e.g., the BBC, see update to answer), DBpedia doesn't do that. Just changing the Accept header won't get you the data you want. – Joshua Taylor May 17 '15 at 00:12
-
@user3708999 I've updated my answer to show that even if you change the accept header, you **won't** get the data you want. – Joshua Taylor May 17 '15 at 00:17
-
@user3708999 Believe me, I really wish that DBpedia had better access to some of its data. It's frustrating to see so many Semantic Web principles in action, and then so many completely ignored. – Joshua Taylor May 17 '15 at 00:21
-
@user3708999 If this ends up being the approach that you take, do consider [accepting the answer](http://meta.stackexchange.com/questions/5234/how-does-accepting-an-answer-work). – Joshua Taylor May 17 '15 at 00:30
-
@JoshuaTaylor could you update your answer? DBpedia seems to support this now: `curl -H "Accept: application/rdf+xml" -L http://dbpedia.org/resource/Brooklyn_Bridge` – Nilesh Sep 29 '15 at 16:33
A good way to access data from DBpedia is through SPARQL. You can use Apache Jena to run SPARQL queries against http://dbpedia.org/sparql
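Jena takes care of the HTTP details, but to illustrate what such a request looks like on the wire, here is a rough sketch that builds the GET URL by hand using only the JDK. It assumes the endpoint accepts the standard SPARQL Protocol `query` parameter; the class name, the method name, and the choice of a DESCRIBE query are illustrative assumptions, not something taken from the answer.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the name is not from any library.
class SparqlGetUrl {

    /** Builds a GET URL that asks the endpoint to DESCRIBE one resource. */
    static String describeUrl(String endpoint, String resourceUri) {
        String query = "DESCRIBE <" + resourceUri + ">";
        // URLEncoder percent-encodes reserved characters and turns spaces into '+'.
        // The Charset overload of encode requires Java 10 or later.
        return endpoint + "?query=" + URLEncoder.encode(query, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(
                describeUrl("http://dbpedia.org/sparql", "http://dbpedia.org/resource/Berlin"));
    }
}
```

The URL can then be fetched with URLConnection, typically with an Accept header such as application/rdf+xml to pick the result format. For anything beyond a single query, a library such as Jena is likely the more robust choice.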
