Questions tagged [wikipedia]

Consider the tags wikipedia-api (or the more general mediawiki-api) and mediawiki. Questions should be related to programming.

Source: Wikipedia

Wikipedia is a free, web-based encyclopedia project written and maintained by people all over the world, and it is one of the most visited websites in the world (1, 2). Consider also these tags:

  • wikipedia-api, if your question has to do with the use of the API on Wikipedia
  • mediawiki-api, if your question has to do with the use of the API on general MediaWiki sites
  • mediawiki, if your question has to do with the MediaWiki software.
1766 questions
21
votes
6 answers

How to group wikipedia categories in python?

For each concept of my dataset I have stored the corresponding wikipedia categories. For example, consider the following 5 concepts and their corresponding wikipedia categories. hypertriglyceridemia: ['Category:Lipid metabolism disorders',…
EmJ
  • 4,398
  • 9
  • 44
  • 105
21
votes
7 answers

How can I get a Wikipedia article's text using Python 3 with Beautiful Soup?

I have this script made in Python 3: response = simple_get("https://en.wikipedia.org/wiki/Mathematics") result = {} result["url"] = url if response is not None: html = BeautifulSoup(response, 'html.parser') title =…
user10798111
19
votes
2 answers

Am I allowed to use Wikipedia content?

I'm always confused when reading licenses... I want to implement (to be honest, the implementation is already done) a commercial iPad app which makes use of content from Wikipedia. Am I allowed to embed hardcoded links that point to Wikipedia articles…
Kai Huppmann
  • 10,705
  • 6
  • 47
  • 78
19
votes
2 answers

Wikipedia Category Hierarchy from dumps

Using Wikipedia's dumps I want to build a hierarchy for its categories. I have downloaded the main dump (enwiki-latest-pages-articles) and the category SQL dump (enwiki-latest-category). But I can't find the hierarchy information. For example, the…
fersarr
  • 3,399
  • 3
  • 28
  • 35
18
votes
2 answers

Summarizing a Wikipedia Article

I find myself having to learn new things all the time. I've been trying to think of ways I could expedite the process of learning new subjects. I thought it might be neat if I could write a program to parse a wikipedia article and remove…
Jesse Aldridge
  • 7,991
  • 9
  • 48
  • 75
18
votes
3 answers

How do I get all articles about people from Wikipedia?

What would be the easiest way to get all articles about people from Wikipedia? I know I can download a dump of all the pages, but then how do I filter those and get only the ones about people? I need as many as I can get (preferably more than a…
Johnny
  • 7,073
  • 9
  • 46
  • 72
18
votes
2 answers

Wikipedia API - get random page(s)

I'm trying to get a JSON result with a set of random pages from Wikipedia, including their titles, content and images. I've played around with their API sandbox, and so far the best I've got is…
Petter
  • 773
  • 2
  • 9
  • 19
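
For the random-pages question above, one possible starting point is the MediaWiki Action API's `generator=random` combined with `prop=extracts|pageimages`. The sketch below only builds the request URL and does not contact the network. The helper name `random_pages_url` is hypothetical, and the sketch assumes the TextExtracts and PageImages extensions are enabled on the target wiki.

```python
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"


def random_pages_url(limit=5):
    """Build a query URL for `limit` random article-namespace pages.

    Hypothetical helper: it only assembles the URL; fetching and
    decoding the JSON response is left to the caller.
    """
    params = {
        "action": "query",
        "format": "json",
        "generator": "random",   # pick pages at random
        "grnnamespace": 0,       # restrict to the main (article) namespace
        "grnlimit": limit,       # number of random pages to return
        "prop": "extracts|pageimages",
        "exintro": 1,            # intro section only (assumes TextExtracts)
        "explaintext": 1,        # plain text instead of HTML
        "exlimit": "max",
        "piprop": "thumbnail",   # thumbnail info (assumes PageImages)
        "pithumbsize": 200,
    }
    return API_URL + "?" + urlencode(params)


print(random_pages_url(3))
```

The returned URL can be passed to any HTTP client; titles, extracts and thumbnails would then be found under `query.pages` in the decoded response.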
17
votes
5 answers

Where to find "bug free" html to wiki converter

While googling for it, I stumbled upon html2wiki, which seems to do the job (I will try it after posting this question). Other than that, many other choices popped up during the search. A word on which app to choose would be…
Daniel
  • 631
  • 10
  • 18
17
votes
1 answer

Freebase / DBpedia / wikidata.org -- differences

I'm looking to enhance several "objects" in my application with human-readable data. To that end, I've seen Freebase, DBpedia and wikidata.org, and am currently working with Freebase. I can't help but wonder, though, what I am missing. So: what's…
Nitzan Shaked
  • 13,460
  • 5
  • 45
  • 54
16
votes
2 answers

Using a Word2Vec model pre-trained on wikipedia

I need to use gensim to get vector representations of words, and I figure the best thing to use would be a word2vec module that's pre-trained on the English Wikipedia corpus. Does anyone know where to download it, how to install it, and how to use…
Boris
  • 716
  • 1
  • 4
  • 25
16
votes
4 answers

Blacklist IP database

Is there an open database of blacklisted IP addresses for the Web, one that covers the many public web proxies, such as the blacklist used by Wikipedia's global blocking?
T5i
  • 1,470
  • 1
  • 18
  • 34
15
votes
1 answer

Sparql Query to get all the possible movies available from dbpedia

To get all the possible film names, I used this SPARQL query: PREFIX rdfs: PREFIX rdfs: SELECT DISTINCT ?film_title ?film_abstract WHERE { ?film_title rdf:type…
Shruts_me
  • 843
  • 2
  • 12
  • 24
15
votes
2 answers

How to add a link in MediaWiki VisualEditor Toolbar?

I'm trying to insert a custom link to a special page in the VisualEditor toolbar. See the image below. I googled a lot but without success. Could someone please point me in the right direction...
ricardogobbo
  • 1,690
  • 3
  • 19
  • 41
15
votes
1 answer

Obtaining static HTML files from Wikipedia XML dump

I would like to be able to obtain relatively up-to-date static HTML files from the enormous (even when compressed) English Wikipedia XML dump file enwiki-latest-pages-articles.xml.bz2 I downloaded from the WikiMedia dump page. There seem to be quite…
Brian Schmitz
  • 1,023
  • 1
  • 10
  • 19
14
votes
1 answer

How to get a Wikipedia page in multiple languages?

How can I get the same Wikipedia page in another language? For example, I want to get this page in Japanese: for http://en.wikipedia.org/wiki/Cloud the result is http://ja.wikipedia.org/wiki/雲, or only the title 雲. Is it possible to use the Wikipedia API or any…
bbnn
  • 3,505
  • 10
  • 50
  • 68
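
For the interlanguage question above, the MediaWiki Action API exposes `prop=langlinks`, and `lllang` restricts the result to one target language. The following is a minimal sketch, not a definitive implementation: the helper names `langlink_url` and `extract_langlink` are hypothetical, the sample response is hand-written in the shape of the API's default JSON output (its page id is a placeholder), and no network request is made.

```python
from urllib.parse import urlencode


def langlink_url(title, target_lang, source_lang="en"):
    """Build a langlinks query URL (hypothetical helper, no request is sent)."""
    params = {
        "action": "query",
        "format": "json",
        "prop": "langlinks",
        "titles": title,
        "lllang": target_lang,  # only return the link for this language
    }
    return f"https://{source_lang}.wikipedia.org/w/api.php?" + urlencode(params)


def extract_langlink(response):
    """Return the first linked title in a decoded JSON response, or None."""
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        for link in page.get("langlinks", []):
            # In the default JSON format the title sits under the "*" key.
            return link["*"]
    return None


# Hand-written sample; the page id "1" is a placeholder, not real data.
sample = {
    "query": {
        "pages": {
            "1": {"title": "Cloud", "langlinks": [{"lang": "ja", "*": "雲"}]}
        }
    }
}

print(langlink_url("Cloud", "ja"))
print(extract_langlink(sample))  # prints: 雲
```

With `formatversion=2` the response layout differs (`pages` becomes a list and the title moves to a `title` key), so the parsing step would need adjusting in that case.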