I am trying to get a list of all of Kurt Cobain's quotes from the MediaWiki API. So far I have:

https://en.wikiquote.org/w/api.php?format=json&action=query&srsearch=Kurt+Cobain&list=search

But it doesn't seem to return any of his quotes as shown here, nor does it return them in a format that is easy to parse.

How do I get a list of all of his quotes using the API? If possible, I would also like to include the source of each quote, e.g. "From an interview on MTV with Zeca Camargo, 1993-01-21, Rio de Janeiro, Brazil".

I would prefer to use the API directly, but an answer using pywikibot is also fine.

user_78361084
  • @Pascalco This kinda works but seems like there is still a lot of processing to do to weed out what is and isn't a quote: https://en.wikiquote.org/w/api.php?format=json&action=query&titles=Kurt+Cobain&prop=extracts – user_78361084 Aug 15 '20 at 04:06
  • @Tgr not sure how to format into a list – user_78361084 Aug 15 '20 at 04:22

1 Answer

Wikiquote has no structured data (such as templates) that marks up the quotes. All you can do is extract them from the plain wikitext by pattern matching (a regex or a simple prefix check), something like:

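>>> import pywikibot
>>> site = pywikibot.Site('en', 'wikiquote')
>>> page = pywikibot.Page(site, 'Kurt Cobain')
>>> text = page.text  # raw wikitext of the page
>>> for line in text.splitlines():
...     # keep only top-level bullets ('* '); nested bullets ('** ') are skipped
...     if line.startswith('* '):
...         print(line)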
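
Since the question asks for the API directly and for the source of each quote, here is a rough sketch of the same idea without pywikibot. It assumes the usual Wikiquote layout, where a quote is a top-level bullet (`*`) and its source is the nested bullet (`**`) right below it; pages that deviate from that layout will need extra handling. The function names `fetch_wikitext` and `parse_quotes` and the User-Agent string are made up for this example.

```python
import json
import re
import urllib.parse
import urllib.request

API_URL = "https://en.wikiquote.org/w/api.php"


def fetch_wikitext(title):
    """Fetch the raw wikitext of a page via action=parse."""
    params = urllib.parse.urlencode({
        "action": "parse",
        "page": title,
        "prop": "wikitext",
        "format": "json",
        "formatversion": "2",  # returns wikitext as a plain string
    })
    # Placeholder User-Agent; replace it with one that identifies your script.
    request = urllib.request.Request(
        f"{API_URL}?{params}",
        headers={"User-Agent": "quote-sketch/0.1 (example script)"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["parse"]["wikitext"]


def parse_quotes(wikitext):
    """Pair each top-level bullet with the first nested bullet below it.

    Assumption: '*' lines are quotes and '**' lines are their sources.
    Deeper bullets and non-bullet lines are ignored.
    """
    quotes = []
    for line in wikitext.splitlines():
        match = re.match(r"^(\*+)\s*(.*)$", line)
        if not match:
            continue
        depth = len(match.group(1))
        body = match.group(2).strip()
        if depth == 1:
            quotes.append({"quote": body, "source": None})
        elif depth == 2 and quotes and quotes[-1]["source"] is None:
            quotes[-1]["source"] = body
    return quotes
```

Calling `parse_quotes(fetch_wikitext("Kurt Cobain"))` should give a list of dicts with `quote` and `source` keys. The text is still wikitext, so links and bold markup are not stripped, and bullets from sections such as "External links" will show up too unless you filter by section first.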
xqt