3

I have network traffic data: the data volume (number of bytes) and the number of flows over a one-week period for each origin-destination IP pair. I want to plot the distribution, i.e. frequency against rank. I believe R already provides a function for this. What is it, and how do I use it for my scenario?

user744121

5 Answers

3

Check out the zipfR package, and its dedicated website including the following tutorial: The zipfR package for lexical statistics: A tutorial introduction.

chl
1

This should properly be a comment on hadley's answer, but the original question is asking for a log-log plot:

plot(log10(seq_along(tbl)), log10(unclass(tbl)))
Russ
1

It hardly seems like you need a special function:

x <- rpois(1000, 10)
tbl <- table(x)
plot(seq_along(tbl), unclass(tbl))

Or are you looking for hist?

hist(x)
hadley
  • Could you please elaborate? I have a file with columns for the IP pair, the number of bytes, and the number of flows. I need to plot the origin-destination IP pairs both by data volume and by number of flows (a Zipf-type plot). – user744121 May 09 '11 at 16:22
0

There is a Zipf plotting function in the tm (text mining) package. It expects a document-term matrix as x, so it is aimed at text data rather than traffic counts.

Zipf_plot(x, type = "l", ...)

IRTFM
-1

I found out that a Zipf plot is just the log-log plot of the frequency of an entity (say, 'flows') against its rank, with the frequencies sorted in descending order.
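For the IP-pair data in the question, that description can be sketched in base R roughly as follows. This is only an illustration under assumptions: the data frame `traffic`, its column names (`pair`, `bytes`, `flows`), the simulated values, and the helper `zipf_data` are all made up here, and the per-pair byte and flow totals are treated as the "frequency" to be ranked.

```r
# Hypothetical data: one row per origin-destination IP pair.
# The column names (pair, bytes, flows) and values are assumptions for illustration.
set.seed(1)
traffic <- data.frame(
  pair  = paste0("10.0.0.", 1:200, "->10.0.1.", 200:1),
  bytes = round(rlnorm(200, meanlog = 10, sdlog = 2)) + 1,
  flows = rpois(200, lambda = 20) + 1
)

# Hypothetical helper: sort values in descending order and attach ranks,
# so that rank 1 corresponds to the largest value.
zipf_data <- function(values) {
  sorted <- sort(values, decreasing = TRUE)
  data.frame(rank = seq_along(sorted), value = sorted)
}

by_bytes <- zipf_data(traffic$bytes)
by_flows <- zipf_data(traffic$flows)

# Draw both rank plots side by side; log = "xy" puts both axes on a log scale
# (this requires strictly positive values).
op <- par(mfrow = c(1, 2))
plot(by_bytes$rank, by_bytes$value, log = "xy", type = "l",
     xlab = "Rank", ylab = "Bytes", main = "By data volume")
plot(by_flows$rank, by_flows$value, log = "xy", type = "l",
     xlab = "Rank", ylab = "Flows", main = "By number of flows")
par(op)
```

With real data, `traffic` would presumably come from something like `read.csv()` on the file, and pairs with zero bytes or flows would need to be dropped before plotting on log axes.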

user744121