Hi all, I am doing some research on DBLP, using Hugh Glaser's RKB Explorer DBLP repository (RDF/XML). Consider this page for an article in DBLP:

http://dblp.rkbexplorer.com/id/journals/jvcir/YuanWSZ13

As you can see, the author ID for this article looks like this:

http://dblp.rkbexplorer.com/id/people-b3f641eef09c498bdd94087b74854be9-36a6b8e7b69947e5659953aaf7fb802c.

I tried the same author name across different articles, and found that the ID above breaks down like this:

b3f641eef09c498bdd94087b74854be9: a 32-character encoding of the author's name (the details don't matter here).

36a6b8e7b69947e5659953aaf7fb802c: a 32-character encoding of the article's name.
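
To make the breakdown above concrete, here is a minimal sketch that splits such an ID into its two 32-character parts. The function name `split_person_iri` is hypothetical, and the pattern only encodes what I observed (two 32-character hexadecimal parts after `people-`); it says nothing about how the dataset actually computes them.

```python
import re

# Pattern based only on the observed shape of the IDs:
# .../id/people-<32 hex characters>-<32 hex characters>
PEOPLE_IRI = re.compile(
    r"^http://dblp\.rkbexplorer\.com/id/people-([0-9a-f]{32})-([0-9a-f]{32})$"
)


def split_person_iri(iri):
    """Return the two 32-character parts of a people IRI as a tuple."""
    match = PEOPLE_IRI.match(iri)
    if match is None:
        raise ValueError("not a people IRI of the expected shape: " + iri)
    return match.group(1), match.group(2)


first_part, second_part = split_person_iri(
    "http://dblp.rkbexplorer.com/id/people-"
    "b3f641eef09c498bdd94087b74854be9-36a6b8e7b69947e5659953aaf7fb802c"
)
print(first_part)   # the part I believe encodes the author name
print(second_part)  # the part I believe encodes the article name
```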

So it actually gives the same ID to people with the same name, but many people have exactly the same name, which makes the IDs ambiguous. For DBLP author disambiguation, I tried the two approaches below:

  1. Get the affiliations for each article; if the same name appears in two articles with the same affiliation, I think we can be sure it is the same person. The difficulty is that the dblp.rkbexplorer.com dataset doesn't provide enough information about affiliations, and searching Google for the article title doesn't give enough information either.
  2. Get the authors' photos for each article, and do some kind of face matching to check whether the same name is the same person. This isn't really feasible either, as too few articles come with author photos.
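
The first approach could be sketched roughly as follows. This is only an illustration under stated assumptions: the record layout `(article_id, author_name, affiliation)`, the function names, and the sample names and affiliations are all made up for the example, and the affiliation data would have to come from somewhere other than the RDF dataset, which mostly lacks it.

```python
from collections import defaultdict


def normalize(text):
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(text.lower().split())


def cluster_by_affiliation(records):
    """Group articles by (author name, affiliation).

    records: iterable of (article_id, author_name, affiliation) tuples,
    where affiliation may be None or empty when it is unknown.
    Returns (clusters, unresolved): clusters maps a normalized
    (name, affiliation) pair to a list of article ids; unresolved lists
    the (article_id, author_name) pairs that had no affiliation.
    """
    clusters = defaultdict(list)
    unresolved = []
    for article_id, name, affiliation in records:
        if not affiliation:
            # Without an affiliation this heuristic cannot decide anything.
            unresolved.append((article_id, name))
            continue
        clusters[(normalize(name), normalize(affiliation))].append(article_id)
    return dict(clusters), unresolved


# Made-up sample records, purely illustrative.
sample = [
    ("article-1", "A. Author", "Example University"),
    ("article-2", "A.  Author", "example university"),
    ("article-3", "A. Author", "Another Institute"),
    ("article-4", "A. Author", None),
]
clusters, unresolved = cluster_by_affiliation(sample)
for key, articles in clusters.items():
    print(key, articles)
print("unresolved:", unresolved)
```

Matching on exact (normalized) affiliation strings is deliberately conservative: it never merges two people across institutions, but it also splits one person who changed affiliation into several clusters.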

So, any suggestions? Thanks very much.

santi
  • I'm having a hard time following your question. Can you show an example where two different authors are identified by the same IRI? – Joshua Taylor Jan 10 '14 at 15:34
  • @JoshuaTaylor Okay, I see. The DBLP RDF adds a numeric sequence suffix to authors who share a name: 1. http://dblp.rkbexplorer.com/id/people-b3f641eef09c498bdd94087b74854be9-ff01d396b59535dca5bb477995491222 (Fei Wu) 2. http://dblp.rkbexplorer.com/id/people-59b11af10058e1252dea1fec37f74ff6-5bdda20f52a2acbfb1410d217e6298ce (Fei Wu 0002). That's fine. Thank you, buddy. – santi Jan 11 '14 at 05:14
  • @JoshuaTaylor Hi, another question: why do so many paper authors in dblp.rkbexplorer have no organization or university info? Compare the "has-affiliation" property in these two links: http://dblp.rkbexplorer.com/id/conf/icadl/ZhangZWW05 (no affiliation) http://dblp.rkbexplorer.com/id/people-060c1998bf7eccf54ad2d1fef66ee49b-d1745b550cd13eb21f2bfc6899a1f1f1 (has affiliation MIT) – santi Jan 11 '14 at 06:24
  • @JoshuaTaylor Also, the organizations in the RDF dataset don't even include "Princeton" or "Harvard", or many other famous universities. – santi Jan 11 '14 at 06:28
  • I don't know much about that dataset, aside from what I've browsed since reading this question, so I don't know why some authors have affiliations and some don't. I'd _expect_ that the dataset is just a translation of DBLP data (but I don't know this for sure); does the DBLP contain affiliations that the RDF dataset doesn't? Also, if you've got another question, it's better to ask as a separate question, since not everyone reads the comments, comments can be deleted, etc. – Joshua Taylor Jan 11 '14 at 13:30
  • Typically _authors_ would have affiliations, not articles, no? After all, it's very common for authors from different organizations to collaborate on articles. – Joshua Taylor Jan 11 '14 at 13:31
  • Yeah, I think so. The authors may be from different organizations, and as far as I know, the RDF dataset isn't from the latest version of DBLP, so some info from the original DBLP has been lost. Okay, thanks again, buddy, much appreciated. – santi Jan 14 '14 at 07:05
  • Please mind that the information at dblp.rkbexplorer.com/ is just a snapshot of dblp, with the last update being from 2013-02-27 as it seems. However, RDF/XML data directly from dblp is still weeks away. Information on dblp author identities can be found here: dblp.uni-trier.de/faq/1474783 . – MRA Apr 01 '14 at 12:00
  • Also, coverage on affiliations is inhomogeneous in dblp since that data is only known for some ~5000 records. The reason is that this data is simply not known to dblp. Also note that provided affiliation data might be ambiguous, and that individual authors may have multiple affiliations. – MRA Apr 01 '14 at 12:16

0 Answers