Would someone please help me understand how I can query this web page from my program?
There are two parameters that need to be set:
"Site:", where you enter the language and site code, and
"Page:", where you must put in the exact title of the page as it appears on the connected site.
The URLs always look like this:
https://www.wikidata.org/wiki/Special:ItemByTitle?site=en&page=Mikhail+Bakunin&submit=Search
https://www.wikidata.org/wiki/Special:ItemByTitle?site=en&page=Thomas+Edward+Lawrence&submit=Search
The language is always English, so it's just:
https://www.wikidata.org/wiki/Special:ItemByTitle?site=en&page=Blah+Blah&submit=Search
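Since only the page title changes, the URL could be assembled from a variable. Here is a minimal sketch of that idea, assuming Java 10 or later; the class name ItemByTitleUrl and the helper buildUrl are made up for illustration and are not part of any library:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class ItemByTitleUrl {
    // Fixed part of the lookup URL, taken from the examples above (site is always "en").
    static final String BASE =
            "https://www.wikidata.org/wiki/Special:ItemByTitle?site=en&page=";

    // Hypothetical helper: builds the lookup URL for an arbitrary page title.
    static String buildUrl(String pageTitle) {
        // URLEncoder turns spaces into '+' and percent-encodes other reserved characters.
        String encoded = URLEncoder.encode(pageTitle, StandardCharsets.UTF_8);
        return BASE + encoded + "&submit=Search";
    }

    public static void main(String[] args) {
        System.out.println(buildUrl("Mikhail Bakunin"));
        System.out.println(buildUrl("Thomas Edward Lawrence"));
    }
}
```

Encoding the title rather than replacing spaces by hand should also cover titles with accents or punctuation, though I have only checked it against the two examples above.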
The objective of querying that page is to retrieve the ID value associated with the page: for Mikhail Bakunin it's Q27645, and for T. E. Lawrence it's Q170596.
It becomes part of the URL once the page is reached:
https://www.wikidata.org/w/index.php?title=Q170596&site=en&page=Thomas+Edward+Lawrence&submit=Search
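If the final URL can be obtained as a string, the ID might be pulled out of its title parameter with a regular expression. This is only a sketch under that assumption; the class name IdFromUrl and the helper extractId are invented for illustration, and how the program gets hold of the final URL is left out:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IdFromUrl {
    // Matches a "title" query parameter whose whole value is 'Q' followed by digits.
    private static final Pattern TITLE_PARAM =
            Pattern.compile("[?&]title=(Q\\d+)(?:&|$)");

    // Hypothetical helper: returns the ID, or null if the URL carries no such parameter.
    static String extractId(String finalUrl) {
        Matcher m = TITLE_PARAM.matcher(finalUrl);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String url = "https://www.wikidata.org/w/index.php?title=Q170596"
                + "&site=en&page=Thomas+Edward+Lawrence&submit=Search";
        System.out.println(extractId(url));
    }
}
```

Returning null when nothing matches lets the caller tell "no item found" apart from a real ID.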
But maybe I could also strip it from the page, using BeautifulSoup or something? (That's a guess.)
The program needs to be generalizable: the name of the entity we're searching for is variable and will change within the program, so that needs to be taken into account.
I guess using Python or PHP or something would not be a crime against humanity if it's easier, though I prefer Java.
Update:
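import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

public class URLConnectionReader {
    public static void main(String[] args) throws Exception {
        // Lookup URL for one hard-coded page title
        URL site = new URL("https://www.wikidata.org/wiki/Special:ItemByTitle?site=en&page=Mikhail+Bakunin&submit=Search");
        URLConnection yc = site.openConnection();
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(yc.getInputStream(), StandardCharsets.UTF_8))) {
            String inputLine;
            // Print the raw HTML of the response, one line at a time
            while ((inputLine = in.readLine()) != null) {
                System.out.println(inputLine);
            }
        }
    }
}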
This sort of works, but the result is quite messy.
I guess I could grab it out of this thing:
<!-- wikibase-toolbar --><span class="wikibase-toolbar-container"><span class="wikibase-toolbar-item wikibase-toolbar ">[<span class="wikibase-toolbar-item wikibase-toolbar-button wikibase-toolbar-button-edit"><a href="/wiki/Special:SetSiteLink/Q27645">edit</a></span>]</span></span>
but how?
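One possibility I can think of is a regular expression over the HTML, looking for the Special:SetSiteLink link shown above. This is a rough sketch, assuming the page always contains a link of that shape; the class name IdFromHtml and the helper extractId are invented for illustration, and an HTML parser would probably be more robust than a regex:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IdFromHtml {
    // Captures the 'Q' + digits that follows "Special:SetSiteLink/" in an href.
    private static final Pattern SET_SITE_LINK =
            Pattern.compile("Special:SetSiteLink/(Q\\d+)");

    // Hypothetical helper: returns the first ID found in the HTML, or null if none.
    static String extractId(String html) {
        Matcher m = SET_SITE_LINK.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String html = "<span class=\"wikibase-toolbar-button-edit\">"
                + "<a href=\"/wiki/Special:SetSiteLink/Q27645\">edit</a></span>";
        System.out.println(extractId(html));
    }
}
```

In the reader loop from the update, each inputLine could be passed to extractId until it returns something other than null.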