What RDF patterns can be used to represent components and the percentage they make up?

Question

I'd like to inventory my wine collection using RDF but am not sure how to specify that a wine can contain percentages of several grape varietals. Below is an attempt to do so in Turtle syntax using rdf:Bag.

@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix vin: <http://example.org/wine#> .

<http://example.org/wine/id#1001>
  a <http://example.org/wine/ns#red> ;
  vin:name "Quilceda Creek CVR" ;
  vin:vintage "2014"^^xsd:gYear ;
  vin:winery "Quilceda Creek"@en ;
  vin:alcoholContent "0.15"^^xsd:decimal  ;
  vin:agedIn "French Oak"@en ;      

  vin:varietals rdf:_1, rdf:_2, rdf:_3, rdf:_4, [
    a rdf:Bag ;
    rdf:_1 "Cabernet Sauvignon"@en ;
    rdf:_1 "0.76"^^xsd:decimal ;
    rdf:_2 "Merlot"@en ;
    rdf:_2 "0.20"^^xsd:decimal ;
    rdf:_3 "Petit Verdot"@en ;
    rdf:_3 "0.03"^^xsd:decimal ;
    rdf:_4 "Malbec"@en ;
    rdf:_4 "0.01"^^xsd:decimal ;
  ] .

When I convert this to RDF/XML, the triples with percentages get dropped. This makes me think you shouldn't or can't use the bag item predicates (e.g. rdf:_1) more than once.

I've also considered making a bag of bags, with a bag for each varietal containing the name and percentage. This would involve creating even more blank nodes, which doesn't seem right to me. Eventually I would like to be able to retrieve all wines containing at least a certain percentage of a particular varietal. I'm not sure I'll be able to do that if varietal name and percentage pairs have no relationship defined other than being in the same bag.

I'm new to this but have a feeling I need to look to RDF Schemas and ontologies for this problem. That said, I also don't want to jump ship to that until I totally understand why I need to.

If possible, how can RDF be used to represent that a wine has certain percentages of different varietals?

You have to use a blank node or URI for each item in the bag, then you can attach the corresponding information to the item, e.g. `rdf:_1 vin:entry1` and `vin:entry1 rdfs:label "Cabernet Sauvignon"@en ; vin:grapePortion "0.76"^^xsd:decimal .` — UninformedUser, May 27 '17 at 10:42
This works great. An additional change I made was to replace `rdf:Bag` with `rdf:Seq` so that as entered, varietals will be ordered from greatest to least percentage. — Kelly, May 27 '17 at 17:08
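
The suggestion in the two comments above could be sketched in Turtle roughly as follows. This is a minimal sketch rather than the commenters' actual data: `rdfs:label` and `vin:grapePortion` come from the first comment, while using blank nodes for the entries (instead of URIs such as `vin:entry1`) and keeping the `vin:varietals` property from the question are assumptions.

```turtle
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix vin:  <http://example.org/wine#> .

<http://example.org/wine/id#1001>
  vin:varietals [
    a rdf:Seq ;
    # Each membership property is used once and points at one entry node,
    # so the name and the portion stay paired on that node.
    rdf:_1 [ rdfs:label "Cabernet Sauvignon"@en ; vin:grapePortion "0.76"^^xsd:decimal ] ;
    rdf:_2 [ rdfs:label "Merlot"@en ;             vin:grapePortion "0.20"^^xsd:decimal ] ;
    rdf:_3 [ rdfs:label "Petit Verdot"@en ;       vin:grapePortion "0.03"^^xsd:decimal ] ;
    rdf:_4 [ rdfs:label "Malbec"@en ;             vin:grapePortion "0.01"^^xsd:decimal ]
  ] .
```

The entries are listed from greatest to least portion, matching the ordering described in the second comment.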

Stanislav Kralin · Accepted Answer · 2017-05-29T07:57:01.547

I’d prefer to use this simple pattern:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix wine: <http://www.w3.org/TR/2003/PR-owl-guide-20031209/wine#> .
@prefix vin: <http://example.org/wine#> .

vin:id1001 vin:varietal [ vin:grape  wine:CabernetSauvignonGrape; 
                          vin:percentage  "0.76"^^xsd:decimal ] ;
           vin:varietal [ vin:grape  wine:MerlotGrape ;
                          vin:percentage  "0.20"^^xsd:decimal ] .

Example SPARQL queries against the pattern above (each assumes the vin: and xsd: prefixes from the Turtle snippet are declared with PREFIX) would be:

SELECT DISTINCT ?sophistique
WHERE {
    ?sophistique vin:varietal/vin:percentage ?percentage .
    FILTER (?percentage <= "0.05"^^xsd:decimal)
}

SELECT DISTINCT ?coupage
WHERE {
    ?coupage vin:varietal/vin:grape ?grape1.
    ?coupage vin:varietal/vin:grape ?grape2.
    FILTER (?grape1 != ?grape2)
}

SELECT ?id (("1.0"^^xsd:decimal - SUM(?percentage)) AS ?part_des_anges)   
WHERE {
    ?id vin:varietal/vin:percentage ?percentage .
} GROUP BY ?id HAVING ( SUM(?percentage) < "1.0"^^xsd:decimal )
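
The question's stated goal, retrieving wines that contain at least a given share of a particular grape, is not covered by the three queries above. A sketch against the same pattern follows; Merlot and the 0.15 threshold are illustrative values only, not taken from the original posts.

```sparql
PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#>
PREFIX wine: <http://www.w3.org/TR/2003/PR-owl-guide-20031209/wine#>
PREFIX vin:  <http://example.org/wine#>

SELECT DISTINCT ?wine ?percentage
WHERE {
    # Bind the varietal node once so that the grape and the percentage
    # are read from the same entry.
    ?wine vin:varietal ?varietal .
    ?varietal vin:grape wine:MerlotGrape ;
              vin:percentage ?percentage .
    FILTER (?percentage >= "0.15"^^xsd:decimal)
}
```

Writing this with two independent property paths (`vin:varietal/vin:grape` and `vin:varietal/vin:percentage`) would not be equivalent, because each path could match a different varietal node of the same wine.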

Some remarks:

It is more ideologically correct to use, wherever possible, things instead of strings in RDF.
The W3C’s example wine ontology could provide URIs for many of these things.
Why not just use multiple occurrences of the vin:varietal property instead of rdf:Seq? Instances of rdfs:Container are harder to deal with in SPARQL, and especially in OWL.
I don’t think these varietals (grape varieties with percentages) need strong identification with URIs; their “ontological status” is not sufficiently solid. Thus, I use blank nodes.
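
To illustrate the third remark, here is roughly what listing the varietals might look like if they were stored in an rdf:Seq of entry nodes. The names `vin:varietals`, `rdfs:label`, and `vin:grapePortion` follow the comments under the question and are assumptions here. Without RDFS inference there is no single membership property to match, so the query has to inspect the predicate IRI itself:

```sparql
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX vin:  <http://example.org/wine#>

SELECT ?wine ?label ?portion
WHERE {
    ?wine vin:varietals ?seq .
    ?seq ?membership ?entry .
    ?entry rdfs:label ?label ;
           vin:grapePortion ?portion .
    # Keep only the membership properties rdf:_1, rdf:_2, ...
    FILTER (STRSTARTS(STR(?membership), CONCAT(STR(rdf:), "_")))
}
```

On a store with RDFS entailment enabled, `rdfs:member` could probably replace the variable predicate and the filter, but that depends on the engine's configuration.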

edited May 29 '17 at 07:57

answered May 28 '17 at 13:39

Stanislav Kralin


The W3C's wine ontology has been a good inspiration. I will take the advice to use things in place of strings, where possible. Thank you very much for the sample queries. These demonstrate the association between grape variety and percentage as being sufficiently strong. I believe I understand why the use of easily read and understood property paths, as given, would be preferred over what `rdfs:Container` instances would require. – Kelly May 29 '17 at 19:10
@Kelly, I think `rdf:Seq` will be appropriate when describing cocktails etc. – Stanislav Kralin May 29 '17 at 19:18
Interesting idea but I'm not sure I understand. Do you mean that the components of a wine, its grape varietal(s), are that much different from the components of a cocktail? Cocktail recipes do not usually list ingredients with percentages but could just as well. – Kelly May 29 '17 at 19:29
@Kelly, I mean layered cocktails, where the exact order of layers is important. E.g., Bloody Mary: vodka first, tomato juice second (as far as I remember). Though, probably, one can deduce the proper order of ingredients from their densities, i.e. dispense with `rdf:Seq`. – Stanislav Kralin May 29 '17 at 19:44
I think I understand now. Wine is made only of grapes whereas cocktails (and beer for that matter) are composed of more than one kind of thing. I will also consider that similar to how `vin:varietal` is used, `beer:yeast`, `beer:grain`, `beer:hops`, and `beer:otherAdditives` properties could be used. – Kelly May 29 '17 at 19:45
