
I have asked this question before here. At that time, I was concerned with getting the output using the Google API, which works just fine.

The problem with that approach is that I run into timeouts and, more importantly, that it depends on querying a web-based API. I would like to do it offline using the Freebase data dumps. Is there an easy way to go about it?

Thanks

Community
Knight

1 Answer

zegrep $'\tns:type\.object\.name\t.*Bush.*' freebase-rdf-<date>.gz | cut -f 1

will give you a list of all MIDs for topics which contain the string "Bush" (from your previous example) in their name.

Extend the regex as needed to include things like aliases, fancier name matching, etc.
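As a rough sketch of what extending the regex to aliases might look like, the function below matches either the name or the alias predicate. It assumes the dump uses the same tab-separated, `ns:`-prefixed layout as the command above and that aliases are stored under `ns:common.topic.alias`; the function name `find_mids` is made up for illustration. It builds the tab character with `printf` instead of `$'...'` so that it also runs in a plain POSIX shell.

```shell
# Sketch only: list MIDs whose name OR alias contains a given string.
# Assumptions: tab-separated triples, "ns:"-prefixed predicates, and
# ns:common.topic.alias as the alias predicate. find_mids is a
# hypothetical helper name, not part of any Freebase tooling.
find_mids() {
  needle=$1   # text to look for; interpreted as an extended regex
  dump=$2     # path to the gzipped dump file
  tab=$(printf '\t')

  # gzip -cd streams the dump to stdout without unpacking it to disk.
  # LC_ALL=C makes grep match byte-wise (often faster on large files)
  # and gives sort a fixed, locale-independent order.
  gzip -cd -- "$dump" \
    | LC_ALL=C grep -E "${tab}ns:(type\.object\.name|common\.topic\.alias)${tab}.*${needle}" \
    | cut -f 1 \
    | LC_ALL=C sort -u
}
```

Call it as `find_mids Bush "$dump"`, where `$dump` holds the path to the downloaded dump. The final `sort -u` removes duplicates, since a topic whose name and alias both match would otherwise be listed twice.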

Tom Morris
  • Thanks @Tom. This works but is really slow. I can imagine that with 40M entities, searching through them this way is expensive. – Knight May 31 '13 at 19:30
  • I don't know whether I should ask this in a comment, but here goes: I use the Google Topic API to extract information about a topic/entity like this: `https://www.googleapis.com/freebase/v1/topic//m/09937`. Is it possible to get the same output from the dumps? – Knight May 31 '13 at 19:36