I downloaded an RDF file in Turtle (.ttl) format. I am new to RDF and would like to get the data into a simple text or CSV format of some sort. Does anyone know how to do this?

Henri Wathieu

1 Answer

RDF has a very simple data model: every statement is just a subject, a predicate, and an object. You can see this by converting your file to N-Triples:

 $ rdfcopy myfile.ttl # apache jena

 $ rapper -i turtle myfile.ttl  # rapper (part of librdf)

But this is limited. Suppose you start with this nice-looking Turtle file:

 @prefix ex: <http://example.com/> .

 <Brian> ex:age 34 ;
         ex:name "Brian Smith" ;
         ex:homepage <http://my.name.org/Brian> .

 <Delia> ex:age 45 ;
         ex:name "Delia Jones" ;
         ex:email <mailto:delia@deliajones.com> .

The result is:

<file:///tmp/Delia> <http://example.com/email> <mailto:delia@deliajones.com> .
<file:///tmp/Delia> <http://example.com/name> "Delia Jones" .
<file:///tmp/Delia> <http://example.com/age> "45"^^<http://www.w3.org/2001/XMLSchema#integer> .
<file:///tmp/Brian> <http://example.com/homepage> <http://my.name.org/Brian> .
<file:///tmp/Brian> <http://example.com/name> "Brian Smith" .
<file:///tmp/Brian> <http://example.com/age> "34"^^<http://www.w3.org/2001/XMLSchema#integer> .

In other words, everything is reduced to three columns.
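
If those three columns are all you need as CSV, the N-Triples output is regular enough to convert with a short script. What follows is a minimal sketch in plain Python, not something that ships with Jena or librdf. The names `ntriples_to_csv` and `strip_term` are made up for illustration, and the regular expression only covers the common cases (IRIs, blank nodes, and literals with an optional datatype or language tag). Escape sequences inside literals are copied through undecoded.

```python
import csv
import io
import re
import sys

# One N-Triples statement: subject, predicate, object, closing dot.
# Assumption: simplified grammar, good enough for typical tool output.
TRIPLE = re.compile(
    r'^\s*(<[^>]*>|_:\S+)\s+'    # subject: IRI or blank node
    r'(<[^>]*>)\s+'              # predicate: always an IRI
    r'(<[^>]*>|_:\S+|"(?:[^"\\]|\\.)*"(?:\^\^<[^>]*>|@[A-Za-z0-9-]+)?)'  # object
    r'\s*\.\s*$'
)


def strip_term(term):
    """Drop N-Triples punctuation: <iri> -> iri, "text"^^<type> -> text."""
    if term.startswith("<"):
        return term[1:-1]
    if term.startswith('"'):
        # The closing quote is the last '"' in the term, because neither
        # a datatype IRI nor a language tag can contain one.
        return term[1:term.rindex('"')]
    return term  # blank node label, kept as-is


def ntriples_to_csv(lines, out):
    """Write one CSV row per triple; skip blank lines and # comments."""
    writer = csv.writer(out)
    writer.writerow(["subject", "predicate", "object"])
    for number, line in enumerate(lines, start=1):
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        match = TRIPLE.match(line)
        if match is None:
            raise ValueError(f"line {number} is not a triple this sketch understands")
        writer.writerow([strip_term(term) for term in match.groups()])


# Small demo on two of the triples shown above.
sample = io.StringIO(
    '<file:///tmp/Delia> <http://example.com/name> "Delia Jones" .\n'
    '<file:///tmp/Delia> <http://example.com/age> '
    '"45"^^<http://www.w3.org/2001/XMLSchema#integer> .\n'
)
ntriples_to_csv(sample, sys.stdout)
```

To try it on real data, something like `rapper -i turtle myfile.ttl > myfile.nt` followed by `ntriples_to_csv(open("myfile.nt"), sys.stdout)` should work, though for large or unusual files a proper RDF library is the safer choice.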

You might prefer running a simple SPARQL query instead. It will give you tabular results of a more useful kind:

prefix ex: <http://example.com/>

select ?person ?age ?name
where {
    ?person ex:age ?age ;
            ex:name ?name .
}

Running that using Apache Jena's arq:

$ arq --data myfile.ttl --query query.rq 
---------------------------------
| person  | age | name          |
=================================
| <Delia> | 45  | "Delia Jones" |
| <Brian> | 34  | "Brian Smith" |
---------------------------------

which is probably more useful. (You can specify CSV output too by adding --results csv).

(The librdf equivalent is roqet query.rq --data myfile.ttl -r csv)
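
For intuition about what that query does, here is a rough Python sketch of the same join using only the standard library. `select_rows` is a hypothetical helper, not a Jena, librdf, or rdflib API. It assumes the triples are already parsed into plain (subject, predicate, object) tuples, and it handles only this one pattern: a single subject that must have every requested predicate. A real SPARQL engine remains the better tool for anything more involved.

```python
import csv
import sys
from collections import defaultdict
from itertools import product

EX = "http://example.com/"

# The six triples from the example, with N-Triples punctuation removed.
triples = [
    ("file:///tmp/Delia", EX + "email", "mailto:delia@deliajones.com"),
    ("file:///tmp/Delia", EX + "name", "Delia Jones"),
    ("file:///tmp/Delia", EX + "age", "45"),
    ("file:///tmp/Brian", EX + "homepage", "http://my.name.org/Brian"),
    ("file:///tmp/Brian", EX + "name", "Brian Smith"),
    ("file:///tmp/Brian", EX + "age", "34"),
]


def select_rows(triples, columns):
    """Yield [subject, value, ...] for subjects that have every predicate.

    `columns` maps a column name to a predicate IRI. A subject with several
    values for one predicate produces one row per combination, as a SPARQL
    join would.
    """
    values = defaultdict(lambda: defaultdict(list))
    for subject, predicate, obj in triples:
        values[subject][predicate].append(obj)
    for subject, by_predicate in values.items():
        per_column = [by_predicate.get(iri, []) for iri in columns.values()]
        # product() yields nothing if any column is empty, which drops
        # subjects that lack one of the requested predicates.
        for combination in product(*per_column):
            yield [subject, *combination]


columns = {"age": EX + "age", "name": EX + "name"}
writer = csv.writer(sys.stdout)
writer.writerow(["person", *columns])
writer.writerows(select_rows(triples, columns))
```

The email and homepage triples simply fall away because no column asks for them, which is the main difference from the three-column N-Triples dump.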

user205512
  • This is very helpful, however I am stuck at the first step of using Apache Jena - how do I go from downloading the binary release stuff and utilizing this code in terminal? – Henri Wathieu Mar 09 '15 at 22:06
  • Unpack it somewhere and add apache-jena-VERSION/bin to your path (if you're on Windows I'm not sure). But there are equivalent libraries for all major languages, so you might want to pick one for the language you like best. – user205512 Mar 10 '15 at 00:15
  • I am getting the following: `Users-MacBook-Pro:~ User$ arq --data /Users/User/Downloads/chembl_20.0_assay.ttl --query query.rq --results csv File not found: query.rq`. I regret that I am totally new to all things bash & rdf - any help? – Henri Wathieu Mar 10 '15 at 14:19
  • Oh, sorry that wasn't clear: save that query above (the bit after 'useful kind:'... ) as 'query.rq'. – user205512 Mar 10 '15 at 14:38
  • I don't know how kosher this is on Stack Overflow, but the following link to an FTP server contains the file in question (`chembl_20.0_assay.ttl.gz`) [link](ftp://ftp.ebi.ac.uk/pub/databases/chembl/ChEMBL-RDF/20.0/) - if you could briefly look at it and let me know how best to set up the query file for this Turtle data, that would be much appreciated! – Henri Wathieu Mar 10 '15 at 15:22
  • Use a simple text editor, e.g. `nano query.rq`, paste the text, ctrl-o, ctrl-x. – user205512 Mar 10 '15 at 15:25
  • Start a new question for that :-) – user205512 Mar 10 '15 at 15:29
  • Ok. but regarding the first part where you utilize `rdfcopy`, is there a simple way to go from there to csv output? – Henri Wathieu Mar 10 '15 at 15:40
  • Afraid not, although it's a very simple line-oriented format. – user205512 Mar 10 '15 at 15:59