1

I have a txt file that looks like this:

0.065998       81   
0.319601      81   
0.539613      81  
0.768445      81  
1.671893      81  
1.785064      81  
1.881242      954  
1.921503      193  
1.921605      188  
1.943166      81  
2.122283      63  
2.127669      83  
2.444705      81  

The first column is the packet arrival time in seconds, and the second is the packet size in bytes.

I need to get the average number of bytes in each second. For example, in the first second I only have packets of size 81, so the average bitrate is 81*8 = 648 bit/s. Then I want to plot a graph with time in seconds on the x axis and the average bitrate for each second on the y axis.

So far I have only managed to load my data as arrays:

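import numpy as np

# Load the two whitespace-separated columns into a 2-D array
d = np.genfromtxt('data.txt')

x = d[:, 0]  # arrival times in seconds
y = d[:, 1]  # packet sizes in bytes

print(x)
print(y * 8)  # packet sizes in bits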

I'm new to Python, so any help where to start would be much appreciated!

Here is the result script:

import matplotlib.pyplot as plt
import numpy as np

x, y = np.loadtxt('data.txt', unpack=True)
bins = np.arange(60 + 1)  # one-second bins covering 0 to 60 s
totals, edges = np.histogram(x, weights=y, bins=bins)  # bytes per bin
counts, edges = np.histogram(x, bins=bins)  # packets per bin

print(counts)
# bytes * 0.008 = kbit; bins with no packets divide by zero and give nan
print(totals * 0.008 / counts)

plt.plot(totals * 0.008 / counts, 'r')
plt.xlabel('time, s')
plt.ylabel('kbit/s')
plt.grid(True)
plt.xlim(0.0, 60.0)
plt.show()

The script reads the .txt file, which contains packet sizes (in bytes) and arrival times, and plots the average bitrate for each second over a time period. I use it to monitor a server's incoming/outgoing traffic.

  • In the first second, you only have received 81*4 bits, correct? –  Feb 01 '13 at 18:06
  • Could you specify your definition of "average bitrate"? Do you want the average since the beginning of the data, or a more recent estimate of the average? –  Feb 01 '13 at 18:13
  • Yes, in the first second I have received 4 packets of 81 bytes each. I want to get the average packet size in bits in each second. Let's say we received 10 packets, each of a different size, in the first second; I then need the average value in bits for those 10 packets in the first second, and so on for every following second. – user2033409 Feb 01 '13 at 18:23

3 Answers

5

Your data is already sorted by time so I might just use itertools.groupby for this one:

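from itertools import groupby

with open('data.txt') as d:
    # Parse each line into [arrival_time, packet_size]
    data = ([float(x) for x in line.split()] for line in d)
    # Group consecutive rows by the whole second in which they arrived
    for i_time, packet_info in groupby(data, key=lambda x: int(x[0])):
        print(i_time, sum(x[1] for x in packet_info))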

output is:

0 324.0
1 1578.0
2 227.0
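
If the goal is the average packet size per second rather than the total, the same grouping can be extended along these lines. This is only a sketch under one reading of "average" (mean packet size in bits among the packets that arrived in each second). The sample rows are copied inline from the question so it runs without data.txt, and the function name average_bits_per_second is made up for illustration.

```python
from itertools import groupby

# Sample rows from the question: (arrival time in s, packet size in bytes).
rows = [
    (0.065998, 81), (0.319601, 81), (0.539613, 81), (0.768445, 81),
    (1.671893, 81), (1.785064, 81), (1.881242, 954), (1.921503, 193),
    (1.921605, 188), (1.943166, 81),
    (2.122283, 63), (2.127669, 83), (2.444705, 81),
]


def average_bits_per_second(rows):
    """Return {second: mean packet size in bits} for time-sorted rows."""
    result = {}
    for second, group in groupby(rows, key=lambda r: int(r[0])):
        sizes = [size for _, size in group]
        # 8 bits per byte; len(sizes) is never 0 because groupby
        # only yields groups that contain at least one row.
        result[second] = 8 * sum(sizes) / len(sizes)
    return result


averages = average_bits_per_second(rows)
for second, bits in sorted(averages.items()):
    print(second, bits)
```

Like the original loop, this relies on the rows being sorted by time; seconds with no packets simply do not appear in the result.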
mgilson
  • Good answer although OP wanted average. – sotapme Feb 01 '13 at 19:36
  • @sotapme -- I'm not exactly sure what OP meant by "average bytes in each second". – mgilson Feb 02 '13 at 01:57
  • You're right, it does read confusingly. I'd presumed he had 4 samples of 81 in 1 second, which is an average of 81, then multiplied that by 8 for the bit rate. Whatever, your answer has got them 90% of the way there and the rest is mere detail. – sotapme Feb 02 '13 at 10:46
4

If you want to use numpy, you can use numpy.histogram:

>>> import numpy as np
>>> x, y = np.loadtxt('data.txt', unpack=True)
>>> bins = np.arange(10+1)
>>> totals, edges = np.histogram(x, weights=y, bins=bins)
>>> totals
array([  324.,  1578.,   227.,     0.,     0.,     0.,     0.,     0.,
           0.,     0.])

This gives the total in each bin, and you could divide by the width of the bin to get an approximate instantaneous rate:

>>> totals/np.diff(bins)
array([  324.,  1578.,   227.,     0.,     0.,     0.,     0.,     0.,
           0.,     0.])

(Okay, since the bin widths were all one, that isn't very interesting.)

[update]

I'm not sure I understand your follow-up comment that you need the average packet size in each second -- I don't see that mentioned anywhere in your question, but I'm notorious for missing the obvious.. :-/ In any case, if you want the number of packets in a time bin, then you simply don't need to set the weights (the default weight is 1):

>>> counts, edges = np.histogram(x, bins=bins)
>>> counts
array([4, 6, 3, 0, 0, 0, 0, 0, 0, 0])

where counts is the number of packets which arrived in each bin.
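
Putting totals and counts together, the average packet size per bin might be computed roughly as follows. This is a sketch, not part of the original answer: the sample arrays are copied inline from the question instead of being read from data.txt, and the names mean_bytes and mean_bits are illustrative. Bins without packets are left as NaN so that no division by zero takes place.

```python
import numpy as np

# Sample data from the question: arrival times (s) and packet sizes (bytes).
x = np.array([0.065998, 0.319601, 0.539613, 0.768445, 1.671893,
              1.785064, 1.881242, 1.921503, 1.921605, 1.943166,
              2.122283, 2.127669, 2.444705])
y = np.array([81, 81, 81, 81, 81, 81, 954, 193, 188, 81, 63, 83, 81],
             dtype=float)

bins = np.arange(10 + 1)  # one-second bins from 0 to 10 s
totals, _ = np.histogram(x, weights=y, bins=bins)  # bytes per bin
counts, _ = np.histogram(x, bins=bins)             # packets per bin

# Divide only where at least one packet arrived; other bins stay NaN.
mean_bytes = np.full(totals.shape, np.nan)
np.divide(totals, counts, out=mean_bytes, where=counts > 0)

mean_bits = mean_bytes * 8  # 8 bits per byte
print(mean_bits)
```

Keeping empty bins as NaN (rather than 0) also means matplotlib would leave gaps in the line instead of drawing a misleading zero rate.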

DSM
  • Thank you for your answer! This almost does the job, from this I get the total bits transferred per second in all packages that arrived in that second. But I need to get the average packet size in each second. I need to divide the result that I get from totals with len(y) in that second, since I get irregular number of packets each second. How can I calculate len(y) in each second? – user2033409 Feb 02 '13 at 19:09
  • Many thanks! I have edited the code in the first post if you are interested to see what I meant by average. :) – user2033409 Feb 02 '13 at 21:14
0

Since the arrival times are irregular, I recommend quantizing them into integer numbers of seconds, and then aggregating total bytes for all arrivals for a given second. With this done, plotting and other analysis gets a lot easier.
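
One possible way to do this quantization, shown here only as a sketch, uses np.floor to map each arrival time to its whole second and np.bincount to sum the bytes per second. The sample arrays are copied inline from the question, and the variable names are illustrative.

```python
import numpy as np

# Sample data from the question: arrival times (s) and packet sizes (bytes).
times = np.array([0.065998, 0.319601, 0.539613, 0.768445, 1.671893,
                  1.785064, 1.881242, 1.921503, 1.921605, 1.943166,
                  2.122283, 2.127669, 2.444705])
sizes = np.array([81, 81, 81, 81, 81, 81, 954, 193, 188, 81, 63, 83, 81],
                 dtype=float)

# Quantize each arrival time down to its integer second.
seconds = np.floor(times).astype(int)

# Sum packet sizes per second; index i holds the total for second i.
bytes_per_second = np.bincount(seconds, weights=sizes)
bits_per_second = bytes_per_second * 8

print(bytes_per_second)
print(bits_per_second)
```

The resulting arrays are indexed by second, so they can be passed directly to a plotting call with the index as the time axis.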

Randall Cook