There are loads of great word and tag cloud tools available, the most prominent being wordle.net. But I am looking to display something akin to what some folks did for a Twitter replay of the 2010 World Cup, just without using Flash. I'm not too familiar with R, but it seems to be the best tool for generating some statistical decay of font size over time. Is there a Java API (or combination of APIs) that might make this capability easier from the start?
3 Answers
I'm not aware of a good R package for that. There are some functions, like `cloud` in the snippets package, and maybe others, but nothing comparable to http://wordle.net, http://tagcrowd.com/, or Many Eyes. Drew Conway has done some nice stuff with `tm` + `ggplot2`; I also played with it a while ago, but that was more about experimenting with a 3D tag cloud (with `rgl`) than reproducing Wordle.
In Python or Processing, there are some ongoing projects detailed on this related question. To my knowledge, Tagxedo looks great but it has no API and it relies on Silverlight.
Pierre Lindenbaum also has some Java code, see his blog post Playing with the Wordle algorithm: a tag cloud of Mesh Terms.
-
Excellent sites. However, these sites lack temporal capability, i.e. a sense that the weight of a word changes over time (as discrete frames). I mentioned R because of this previous question: [link](http://stackoverflow.com/questions/2961325/plotting-a-word-cloud-by-date-for-a-twitter-search-result-using-r). I tried Gephi, but its dynamic (temporal) capability is not yet mature enough (or my knowledge isn't, one or the other). – Matt Jun 10 '11 at 15:14
-
I just want to make a fancier, more dynamic word cloud. Please see the example at https://www.wordyup.com/. I don't know jQuery, though; I work in R. Please share any ideas on how to make something like it. – jay_phate Feb 05 '15 at 09:10
It's not great, but there is an open-source project (alas, in PHP) that does word clouds over time. The example uses presidential speeches. http://chir.ag/projects/preztags/
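
For the temporal weighting itself (font size decaying over discrete frames, as the question describes), here is a minimal Java sketch that is independent of any tool mentioned in this thread. It assumes exponential decay with a half-life measured in frames and a linear mapping from weight to point size; the class `DecayingCloud`, its methods, and the sample words are all hypothetical names made up for illustration, and it only computes sizes, leaving layout and rendering to whatever library you pick. It needs Java 9+ for `Map.of`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: per-frame word weights with exponential decay. */
public class DecayingCloud {
    // Multiplier applied to every existing weight at each new frame, in (0, 1).
    private final double decay;
    private final Map<String, Double> weights = new LinkedHashMap<>();

    /** @param halfLifeFrames frames after which an unseen word keeps half its weight */
    public DecayingCloud(int halfLifeFrames) {
        if (halfLifeFrames <= 0) {
            throw new IllegalArgumentException("halfLifeFrames must be positive");
        }
        this.decay = Math.pow(0.5, 1.0 / halfLifeFrames);
    }

    /** Advance one frame: decay all stored weights, then add this frame's counts. */
    public Map<String, Double> nextFrame(Map<String, Integer> counts) {
        weights.replaceAll((word, w) -> w * decay);
        counts.forEach((word, c) -> weights.merge(word, c.doubleValue(), Double::sum));
        return new LinkedHashMap<>(weights); // snapshot for this frame
    }

    /** Linear map from a weight to a font size in points, clamped to [minPt, maxPt]. */
    public static double fontSize(double weight, double maxWeight, double minPt, double maxPt) {
        if (maxWeight <= 0) {
            return minPt;
        }
        double ratio = Math.min(1.0, Math.max(0.0, weight / maxWeight));
        return minPt + ratio * (maxPt - minPt);
    }

    private static void print(int frame, Map<String, Double> snapshot) {
        double max = snapshot.values().stream()
                .mapToDouble(Double::doubleValue).max().orElse(0.0);
        snapshot.forEach((word, w) ->
                System.out.printf("frame %d: %s at %.1fpt%n", frame, word, fontSize(w, max, 10, 48)));
    }

    public static void main(String[] args) {
        DecayingCloud cloud = new DecayingCloud(2);
        // Made-up per-frame word counts, e.g. one frame per minute of tweets.
        print(1, cloud.nextFrame(Map.of("goal", 8)));
        print(2, cloud.nextFrame(Map.of("penalty", 4)));
        print(3, cloud.nextFrame(Map.<String, Integer>of()));
    }
}
```

Each snapshot could then be rendered as one frame of an animation. A different decay curve or a logarithmic size mapping may suit real data better; the half-life form is used here only because it is easy to reason about.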

Here is one that I created in Java as part of a larger project for deriving information from unstructured data: https://github.com/regunathb/Sift. The "tagcloud" project has all the required classes for generating a tag cloud and writing it to multiple output image formats.