
I use the following command to get the (latest) version of a given piece of software from Wikidata on the command line, using jq as the JSON parser.

curl -sL "http://www.wikidata.org/w/api.php?action=wbgetentities&ids=$QID&languages=en&format=json" |
jq ".entities.$QID.claims.$PID""[0].mainsnak.datavalue.value" 

where $QID is the ID of a Wikidata entry, and $PID the ID of the property we want to print (in this case software version "P348").

That usually works, since the first claim ([0]) for P348 is usually the newest version. But for "Q13166" (WordPress), for example, there are several claims about the software version. How do I get the newest stable one instead of the first claim?

When there is more than one claim, I should probably find the i-th claim whose version type "P548" equals stable version "Q12355314", or find the claim that has preferred rank. How can I do that with jq? Is there an easier way, for example sending a SPARQL query to query.wikidata.org?

logi-kal
smihael
  • It is unsafe to interpolate values into jq programs directly; instead, use `--arg` to pass them as jq variables: `jq --arg qid "$QID" --arg pid "$PID" '.entities[$qid].claims[$pid][0].mainsnak.datavalue.value'` –  Jun 18 '16 at 16:31
  • Thanks for the tip, I didn't know that command. The original question remains though. – smihael Jun 18 '16 at 16:39

1 Answer


The following collects all the relevant version numbers, and determines the "maximum" value using the filter defined here as "lexmax":

jq --arg QID "$QID" --arg PID "$PID" '
  def lexmax:
    map( split(".")
         | map(if test("^[0-9]+$") then tonumber else . end) )
  | max | map(tostring) | join(".");

  .entities | .[$QID] | .claims | .[$PID]
  | map(.mainsnak.datavalue.value)
  | lexmax'

The result with QID=Q13166 PID=P348 is

"4.5.2"
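To see why the components are converted to numbers before taking the maximum, here is a minimal sketch that runs the same filter on a hand-written array instead of a Wikidata response. The three version strings are made up for illustration, and the pattern uses `+` rather than `*` so that an empty component is left as a string.

```shell
# Made-up sample versions, not taken from Wikidata.
result=$(jq -r -n '
  def lexmax:
    map( split(".")
         | map(if test("^[0-9]+$") then tonumber else . end) )
    | max | map(tostring) | join(".");

  ["4.9", "4.10", "4.5.2"] | lexmax')

echo "$result"
```

This prints 4.10, whereas a plain string `max` over the same array would pick "4.9", because "9" sorts after "1" character by character.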

If you want to use .rank == "preferred" as a selection criterion, you could use the following:

def lexmax:
  map( split(".")
       | map(if test("^[0-9]+$") then tonumber else . end) )
  | max | map(tostring) | join(".");

def when(condition; action):
  if condition? // null then action else . end;

.entities | .[$QID] | .claims | .[ $PID ]
| map( select(when(has("rank"); .rank == "preferred"))
       | .mainsnak.datavalue.value)
| lexmax

Or perhaps you won't need to use lexmax ...
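If rank alone is enough, a minimal sketch of that selection could look like the following: take the claims ranked "preferred", and if there are none, fall back to every claim that is not "deprecated". The JSON is a hypothetical, heavily trimmed stand-in for a `wbgetentities` response, and the entity ID "Q0" and the version values are placeholders, not real data.

```shell
# Hypothetical, trimmed-down stand-in for a wbgetentities response.
json='{"entities":{"Q0":{"claims":{"P348":[
  {"rank":"normal","mainsnak":{"datavalue":{"value":"48.0"}}},
  {"rank":"preferred","mainsnak":{"datavalue":{"value":"50.0"}}},
  {"rank":"deprecated","mainsnak":{"datavalue":{"value":"51.0"}}}
]}}}}'

version=$(printf '%s' "$json" | jq -r --arg QID Q0 --arg PID P348 '
  # Treat a missing property as an empty list of claims.
  (.entities[$QID].claims[$PID] // []) as $claims
  | [ $claims[] | select(.rank == "preferred") ] as $preferred
  # Use the preferred claims if any exist, else all non-deprecated ones.
  | (if ($preferred | length) > 0
     then $preferred
     else [ $claims[] | select(.rank != "deprecated") ]
     end)
  | map(.mainsnak.datavalue.value)
  | .[]')

echo "$version"
```

With this sample the output is 50.0. If the fallback branch returns several values, they could be piped through lexmax instead of `.[]` to keep only the largest.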

peak
  • I voted up, but won't accept it yet, as it would fail for Chrome (https://www.wikidata.org/wiki/Q777), which has claims about unstable (beta) versions too. Currently there are three claims: 48.0.2564.109, 50.0.2661.102 m (preferred), 51.0.2704.47 dev-m (beta). The relevant entry has an additional field "rank" set to "preferred". See: http://pastebin.com/Mkb06U86 I think one should really look for the entry where "rank" is set to "preferred"; then, if there are several entries without a rank (or with the same rank), find the maximum; and if there is only one claim, return it directly. – smihael Jun 18 '16 at 19:20
  • The other thing is that it won't work for strings where the version is not numeric, e.g. for LimeSurvey (wd:Q1514978): jq: error (at :0): Invalid numeric literal at line 1, column 4 (while parsing '50+ (build 160526)'). In a standard Unix shell there is the command `sort --version-sort`, which would do the right thing, but I do not know if `jq` supports this, too? – smihael Jun 18 '16 at 19:34
  • I've updated the answer to indicate how `.rank == "preferred"` can be used as a selection criterion, and to show how "lexmax" can be made more robust. You might wish to write your own filter to reflect the selection logic you have in mind. – peak Jun 18 '16 at 20:14
  • I actually tried using the prefilter match("[0-9]+(.[0-9]*)?";"g") | .string and then finding the maximum on it, but yours seems more straightforward. – smihael Jun 18 '16 at 20:19
  • As regards the `map( select ( ... ) )`, it throws the error `jq: error (at :0): Cannot iterate over null (null)` when no object has a rank, even with the condition. Examples to look at are Drupal (Q170855) and LimeSurvey (Q1514978). I tried a different ordering: `when(has("rank"); map( select( .rank == "preferred")))` but it throws a syntax error. Is there any other identity operator than `.`? – smihael Jun 19 '16 at 07:31
  • The code that worked for me in the end was: `if map(has("rank")) then if map(select(.rank == "preferred")) | length > 0 then map(select(.rank == "preferred").mainsnak.datavalue.value) else map(select(.rank == "normal").mainsnak.datavalue.value) end else . end` Note that sometimes, the normal rank is preferred when all others are marked as deprecated. – smihael Jun 19 '16 at 08:16