
I'm using the wikipedia Python library (https://pypi.org/project/wikipedia/), and in most cases it seems to autocorrect my search terms, so the page it returns is often the wrong one.

For instance, "frog" gets changed to "food" and "crown" gets changed to "cross":

input: wikipedia.page("frog")
output: <WikipediaPage 'Food'>

input: wikipedia.summary("Frog")
output: 'Food is any substance consumed to provide nutritional support for an organism...'

input: wikipedia.page("crown")
output: <WikipediaPage 'Cross'>

When using wikipedia.search, it seems to provide an appropriate list, but I don't know how to utilize this to get the correct page when using .summary, etc:

input: print(wikipedia.search("frog"))
output: ['Frog', 'FROG', 'The Princess and the Frog', 'Boiling frog', 'Frog legs', 'Frogger', 'The Scorpion and the Frog', 'Pepe the Frog', 'The Frog Prince', 'Common frog']
Will

1 Answer


This is due to the default for auto_suggest on summary() being True.

According to the docs, you can change this to False and it will correctly return the summary for frog.

wikipedia.summary("Frog", auto_suggest=False)
#'A frog is any member of a diverse and largely carnivorous group of short-bodied, tailless amphibians composing the order Anura (literally without tail in Ancient Greek)

For whatever reason, the API's suggest() feature is... weird. It is likely best to set auto_suggest to False.

wikipedia.suggest("Frog")
#'food'
wikipedia.suggest("Steak")
#'steam'
wikipedia.suggest("Dog")
#'do'
wikipedia.suggest("cat")
#'cats'
wikipedia.suggest("david attenborough")
#None 
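
To use the list from wikipedia.search() directly, as the question asks, one option is to take the first title it returns and pass that to summary() with auto_suggest=False. The following is only a minimal sketch: the helper name summary_of_top_hit and its wiki parameter are hypothetical (not part of the library), and it assumes the first search result is the page you want, which may not hold for ambiguous terms.

```python
def summary_of_top_hit(query, wiki=None):
    """Return the summary of the first search result for `query`, or None.

    `wiki` is a hypothetical hook for passing in the module (or a stand-in
    object); by default the third-party `wikipedia` package is imported.
    """
    if wiki is None:
        import wikipedia as wiki  # third-party: pip install wikipedia

    # search() returns a list of page titles, best match first.
    titles = wiki.search(query)
    if not titles:
        return None

    # Disable auto_suggest so the title is looked up exactly as given,
    # instead of being replaced by whatever suggest() proposes.
    return wiki.summary(titles[0], auto_suggest=False)
```

Calling summary_of_top_hit("frog") requires network access. The same pattern should apply to wikipedia.page(), which also accepts auto_suggest=False. For terms that resolve to a disambiguation page, the library may still raise an exception that you would need to handle yourself.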
PacketLoss
  • Bizarre API implementation. There's an exact match, but it won't retrieve it. I wonder what their justification is. – jarmod Mar 14 '21 at 23:54
  • @jarmod seems the `suggest()` feature is.. not very good at suggesting? It almost never matches what you're looking for. – PacketLoss Mar 15 '21 at 00:00
  • The suggestion is something you can use as a replacement search query if the original search query didn't yield good results. It is not a title, and it doesn't make sense to prefer it over the top search result (which is the exact match in all of these examples). The script should be fixed. – Tgr Mar 16 '21 at 17:31
  • It seems largely unmaintained, with a zillion bug reports about this issue ([e.g.](https://github.com/goldsmith/Wikipedia/issues/227)) and even a [pull request](https://github.com/goldsmith/Wikipedia/pull/131). You might be better off looking for an alternative tool or a better-maintained fork. – Tgr Mar 16 '21 at 17:40