Freebase reports that it has 23,407,174 topics. What is the easiest way to get the UI-friendly names (essentially the 'text' attribute of the topic JSON) of ALL of these topics? I don't need any other metadata.
2 Answers
wget -O - http://download.freebase.com/datadumps/latest/freebase-simple-topic-dump.tsv.bz2 | bunzip2 | cut -f 2 > freebase-topic-names.txt
You will probably want the Freebase IDs as well, so that you know what the names refer to:
wget -O - http://download.freebase.com/datadumps/latest/freebase-simple-topic-dump.tsv.bz2 | bunzip2 | cut -f 1,2
Two additional bits of postprocessing are needed:
- Tabs are escaped as \t
- The string \N represents a null (non-existent) name
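The two postprocessing rules above can be sketched in Python. This is only an illustration, assuming the layout implied by the `cut -f 1,2` command (ID in the first tab-separated column, name in the second); the function names `parse_line` and `iter_topic_names` are hypothetical, and any escaping rules beyond the two listed here are not handled.

```python
# Minimal sketch: apply the two postprocessing rules to lines of the
# (already decompressed) simple topic dump.
# Assumption: column 1 is the Freebase ID and column 2 is the name.

NULL_MARKER = r"\N"   # literal backslash + N means "no name"
ESCAPED_TAB = r"\t"   # literal backslash + t stands for a tab character


def parse_line(line):
    """Return (freebase_id, name); name is None when the dump has \\N."""
    fields = line.rstrip("\n").split("\t")
    freebase_id = fields[0]
    raw_name = fields[1] if len(fields) > 1 else NULL_MARKER
    if raw_name == NULL_MARKER:
        return freebase_id, None
    # Turn the escaped tab sequence back into a real tab.
    return freebase_id, raw_name.replace(ESCAPED_TAB, "\t")


def iter_topic_names(lines):
    """Yield (freebase_id, name) pairs, skipping topics without a name."""
    for line in lines:
        freebase_id, name = parse_line(line)
        if name is not None:
            yield freebase_id, name
```

To use it on the real data, the output of `bunzip2` could be piped into a script that calls `iter_topic_names(sys.stdin)` and writes the pairs wherever they are needed.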

Tom Morris
Take a look at the Simple Topic Dump that we provide. It's over a GB of compressed data, but it's still faster to download than trying to get all the names through the API.

Shawn Simister