
I am doing a small project in Python. I have a table with the columns "author", "title", and "text".

What I need: for a given author name, I want to get a table with the columns "property" and "value", containing rows for the properties "occupation" and "sex or gender" from Wikidata.

*EDIT: The type of the table doesn't really matter. A dataframe would be great, but any other type that works is fine!

Example: for the author name David David, I want to get the following table:

  • row1: property = "occupation" ; value = David's occupation.
  • row2: property = "sex or gender" ; value = David's gender.

Thanks :-)

anavarro
  • When you say table, what do you mean? A dataframe, a dictionary? Data format matters. – powerPixie Oct 19 '19 at 22:17
  • It doesn't really matter. A dataframe would be great, but I can handle the other types too. I edited the post to mention it, sorry for not specifying all the details :-) – anavarro Oct 19 '19 at 23:55
  • Actually, it matters. A dataframe has its own methods, and they often differ from those of dictionaries, tuples, or any other way your data is presented. Your example looks more like a dictionary (Person = {occupation: 'dentist', gender: 'male'}). Have you tried writing a solution yourself? Stack Overflow is not a free-code club. We try to support each other, but we don't do each other's jobs. – powerPixie Oct 20 '19 at 10:26
  • I'm not asking somebody to "do my job". I posted because I am inexperienced in this area, and I hope that, while I keep looking myself, I can get some tips or directions here. For now, the type of structure I hold the data in doesn't matter, because converting between the types is the easy part. – anavarro Oct 21 '19 at 09:18

1 Answer


OK, here is one way I have seen to do this, using requests and JSON.

For example, to get the birthday, occupation, and gender of a person identified by their English Wikipedia page, first import the requests package and define a SPARQL query (Wikidata can be queried with the SPARQL query language):

import requests

sparql_query = """
    PREFIX schema: <http://schema.org/>
    SELECT ?item ?occupation ?genderLabel ?bdayLabel
    WHERE {
        # Resolve the English Wikipedia article to its Wikidata item.
        <https://en.wikipedia.org/wiki/Eric_P._Schmitt> schema:about ?item .
        ?item wdt:P106 ?occupation .   # P106 = occupation
        ?item wdt:P21 ?gender .        # P21 = sex or gender
        ?item wdt:P569 ?bday .         # P569 = date of birth
        SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
    }
"""
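The query above hard-codes one Wikipedia URL. To run it for any author in the table, the URL can be built from the name. The following is only a sketch: `build_query` is a hypothetical helper, and it assumes the author has an English Wikipedia article whose title is the name with spaces replaced by underscores (names with unusual punctuation or ambiguous titles may need different handling). It selects `?occupationLabel` so the occupation comes back as readable text, and wraps both properties in `OPTIONAL` so an author missing one of them still returns a row.

```python
from urllib.parse import quote


def build_query(author_name, lang="en"):
    """Return a SPARQL query for the occupation and gender of one author.

    Hypothetical helper. Assumption: the Wikipedia article title is the
    author name with spaces turned into underscores.
    """
    title = quote(author_name.strip().replace(" ", "_"), safe="_.,()'-")
    article_url = f"https://{lang}.wikipedia.org/wiki/{title}"
    # Doubled braces are literal braces inside an f-string.
    return f"""
        PREFIX schema: <http://schema.org/>
        SELECT ?item ?occupationLabel ?genderLabel
        WHERE {{
            <{article_url}> schema:about ?item .
            OPTIONAL {{ ?item wdt:P106 ?occupation . }}
            OPTIONAL {{ ?item wdt:P21 ?gender . }}
            SERVICE wikibase:label {{ bd:serviceParam wikibase:language "{lang}" }}
        }}
    """


query = build_query("Eric P. Schmitt")
```

The returned string can be passed as the `query` parameter in place of a hand-written query.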

(It doesn't matter who Eric P. Schmitt really is.) Then make the request, passing the query to the requests.get method:

url = 'https://query.wikidata.org/sparql'

# sleep(2)
r = requests.get(url, params={'format': 'json', 'query': sparql_query})
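A plain `requests.get` call does not check for HTTP errors, and Wikimedia's User-Agent policy asks scripts to identify themselves with a descriptive `User-Agent` header. A slightly more defensive version could look like the sketch below. The name `fetch_bindings` and the `USER_AGENT` string are hypothetical; the function takes the HTTP getter as an argument (normally `requests.get`) so that it can be tried without network access.

```python
WDQS_URL = "https://query.wikidata.org/sparql"

# Hypothetical contact string; replace it with details identifying your script.
USER_AGENT = "author-info-script/0.1 (you@example.org)"


def fetch_bindings(sparql_query, http_get, timeout=30):
    """Run a SPARQL query and return the list of result rows.

    `http_get` is any callable that accepts the same arguments as
    `requests.get` and returns a response-like object.
    """
    response = http_get(
        WDQS_URL,
        params={"format": "json", "query": sparql_query},
        headers={"User-Agent": USER_AGENT},
        timeout=timeout,
    )
    # Raise an exception on 4xx/5xx instead of trying to parse an error page.
    response.raise_for_status()
    return response.json()["results"]["bindings"]
```

With requests installed, the call would be `fetch_bindings(sparql_query, requests.get)`.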

As the final step, get the results as JSON and pull the wanted information out of its structure:

data = r.json()

print(data['results']['bindings'])
# Output (wrapped for readability):
# [{'item': {'type': 'uri', 'value': 'http://www.wikidata.org/entity/Q5387230'},
#   'genderLabel': {'xml:lang': 'en', 'type': 'literal', 'value': 'male'},
#   'bdayLabel': {'type': 'literal', 'value': '1959-11-02T00:00:00Z'},
#   'occupation': {'type': 'uri', 'value': 'http://www.wikidata.org/entity/Q1930187'}}]
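To get from these bindings to the "property" / "value" table asked for in the question, the rows can be flattened into a list of dictionaries. This is a minimal sketch, not the only way to do it: `bindings_to_rows` is a hypothetical helper, and it prefers a `...Label` variable when the query selected one, falling back to the raw value otherwise. The sample below is the result printed above, where `occupation` is an entity URI because the query selects `?occupation` rather than `?occupationLabel`. A person with several occupations produces several bindings, so duplicates are skipped.

```python
def bindings_to_rows(bindings):
    """Flatten SPARQL JSON bindings into property/value rows."""
    # For each property: the variable names to look for, preferred one first.
    wanted = {
        "occupation": ("occupationLabel", "occupation"),
        "sex or gender": ("genderLabel", "gender"),
    }
    rows = []
    for prop, keys in wanted.items():
        seen = set()
        for binding in bindings:
            for key in keys:
                if key in binding:
                    value = binding[key]["value"]
                    if value not in seen:
                        seen.add(value)
                        rows.append({"property": prop, "value": value})
                    break  # use only the first matching variable per binding
    return rows


sample = [{
    "item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q5387230"},
    "genderLabel": {"xml:lang": "en", "type": "literal", "value": "male"},
    "bdayLabel": {"type": "literal", "value": "1959-11-02T00:00:00Z"},
    "occupation": {"type": "uri", "value": "http://www.wikidata.org/entity/Q1930187"},
}]

rows = bindings_to_rows(sample)
for row in rows:
    print(row["property"], "=", row["value"])
```

If pandas is installed, `pandas.DataFrame(rows)` turns the same list into a dataframe with the two columns.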

anavarro