
I am working on a project for a university housing client where I need to model the usage patterns of students living on campus. There are obviously many variables at play here, and I am keen to understand how they would impact such a model.

There are many parallels that can be drawn between this and a normal office/DC scenario - however, I believe university students in a residential scenario will not fit any corporate model (due to online gaming, file sharing, Skype, etc.). In my project the design will be hub-and-spoke. The data center will have a large Internet trunk feeding into various firewalls, proxies and servers managing user access. There are WAN links out to each of the student sites. I need to be quite accurate in modelling link size and usage patterns on each of the links.

For example, as a baseline I have assumed that the Internet pipe will need to be at least 200Mbps at the data center. For the WAN links I have a mix of 50M, 100M and 200M. Are there any models I can use to test my baseline and see what sort of performance the students can expect? E.g. if Skype is allowed on the network, will my model stand up if the load is at 60% across the network?

I know this is a very open-ended question. There is not going to be a correct answer (unless someone has a model they built for this very scenario). I am more interested in the discussion that might come from it, as there are so many things that need to be factored in. Would love to hear some opinions.

voretaq7
cstrat
  • As interesting as this question may be, I don't think it belongs on ServerFault. That said, having been a student at university, many, many years ago, I will suggest that the students' usage will grow to consume the available bandwidth, regardless of the physical/logical design. – jscott Feb 12 '13 at 03:23
  • Yeah I wasn't sure which stackexchange site this would belong to... serverfault was the only one I could find that was network related?? If there is a more appropriate place to list this question - please point me there :) – cstrat Feb 12 '13 at 03:26
  • Oh and I agree with your comment about students using what they can. I just need to be able to model what the expected experience will be, i.e. will it feel like they are on a 56k connection, will it feel like ADSL1, will it feel like ADSL2, etc. – cstrat Feb 12 '13 at 03:29
  • Is this for a university class project? My suggestion is that if you work for the university, you might contact the department that handles operations research; a prof might be able to uncover a mathematical model or point you to a source. – mdpc Feb 12 '13 at 03:30
  • I worked at a medium sized university and we shaped traffic heavily, so that torrent traffic was the lowest priority and all lab/classroom traffic had precedence over dorm traffic. For ~3000 students, we had a 500Mb Internet connection that was constantly saturated. If this is meant to model a large university, you'll want multiple 1Gb pipes or even a 10Gb connection. – MDMarra Feb 12 '13 at 03:31
  • This isn't for the actual university - this is for university housing (separate business outside of the uni). This is for a customer of mine who has asked we investigate a solution for them. Traditionally customers come to me with a fixed requirement for bandwidth etc... this being open ended is a little more challenging. – cstrat Feb 12 '13 at 03:36
  • Unfortunately in Australia we don't have the luxury of 1G or 10G pipes. They are available - but very, very expensive. – cstrat Feb 12 '13 at 03:48
  • @cstrat Being in Australia you have *different* problems to consider (bandwidth delay product becomes VERY important, VERY fast) - With the exception of resources located in Australia you can probably get away with a bit more oversubscription than we can in the US, as long as you have good traffic shaping and prioritize ACKs. It will be harder for you to reach true link saturation because long round-trip times will artificially limit the throughput each connection can achieve to some extent. – voretaq7 Feb 12 '13 at 03:52
  • @voretaq7 Thanks to CDNs this is not as much of an issue as you think, we are just unfortunate enough not to have the economies of scale to have Gbps links all over the place. Thanks for fixing up the question. – cstrat Feb 12 '13 at 19:42
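
To put rough numbers on the bandwidth-delay product point raised in the comments above, here is a minimal sketch. The 64 KiB window and 200 ms round-trip time are illustrative assumptions, not measurements from any real link, and the function names are made up for this example.

```python
def max_flow_throughput_mbps(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound on one TCP flow's throughput: window size / round-trip time."""
    bits_per_second = (window_bytes * 8) / (rtt_ms / 1000.0)
    return bits_per_second / 1_000_000


def bdp_bytes(link_mbps: float, rtt_ms: float) -> float:
    """Bytes that must be in flight to keep a link of this size full."""
    bytes_per_second = link_mbps * 1_000_000 / 8
    return bytes_per_second * (rtt_ms / 1000.0)


# Assumed values: a 64 KiB window (no window scaling) and a 200 ms RTT
# to an overseas server, against the 200 Mbps baseline from the question.
per_flow = max_flow_throughput_mbps(65535, 200)
in_flight = bdp_bytes(200, 200)

print(f"Single-flow ceiling: {per_flow:.2f} Mbps")
print(f"Data in flight needed to fill the link: {in_flight / 1_000_000:.1f} MB")
```

Under those assumptions a single long-haul flow tops out at a few Mbps, which is why saturation on high-latency paths tends to need many concurrent flows.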

2 Answers


I don't have a model for this utilisation, but I did manage a University halls network back in 2005.
We had a central hub-and-spoke topology, with 1Gbit/s inbound from Cable & Wireless. We broke that down into 100Mbit allocations, and piped those out to the halls over single mode fibre.
At the access level, we had a metric boatload of Cisco 4006 chassis switches, each with as many 48 port 10/100 line cards as we could fit in.

All ports had a max speed of 10Mbit, half-duplex (no idea why half-duplex, but it "had always been that way"). There was also MAC address port security, and a complex student signup procedure that meant we had to configure the port security on their port from their MAC when they registered. This was supposed to be some protection against students putting a switch in their room. It didn't work.

Lessons I learnt:

  • If you can imagine students might do it, they are doing it. (This pretty much covers all types of VoIP, Gaming, Pornography)

  • If you think you've got good firewalls to block P2P traffic, you haven't. (DC++ was the bane of our existence at the time. The problem wasn't so much people sharing and seeding out to the internet as sharing inside the LAN.)

Other thoughts:

Testing

Consider contacting Spirent as they make a bunch of traffic generators / network tester hardware which could be invaluable in simulating / emulating 16,000 horny students.

Caching

Consider putting a transparent proxy in between the main feed out of the halls and the external connection to the internet. I guess you'll want 10-15TB of cache space, and using something like a cluster of Squid proxies, you should be able to massively limit the amount of internet traffic. It's something I do at events sometimes, especially when the bandwidth is limited. So much of what people browse for is cacheable, and you don't need to re-request it every time.
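
As a rough illustration of how a cache changes the size of the Internet pipe you need, here is a minimal sketch. The byte hit ratio and the demand figure are assumptions for the example only; the real hit ratio has to be measured on the proxy once it is running, and the function name is made up.

```python
def upstream_demand_mbps(client_demand_mbps: float, byte_hit_ratio: float) -> float:
    """Traffic that still has to cross the Internet link after the cache.

    byte_hit_ratio is the fraction of requested bytes served from cache
    (0.0 = nothing cached, 1.0 = everything cached).
    """
    if not 0.0 <= byte_hit_ratio <= 1.0:
        raise ValueError("byte_hit_ratio must be between 0 and 1")
    return client_demand_mbps * (1.0 - byte_hit_ratio)


# Assumed figures: 300 Mbps of peak client demand and a 25% byte hit ratio.
demand = 300.0
remaining = upstream_demand_mbps(demand, 0.25)
print(f"{demand:.0f} Mbps of demand needs about {remaining:.0f} Mbps upstream")
```

Running it for a range of hit ratios gives a feel for how sensitive the upstream sizing is to cache effectiveness.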

Clever buggers

No matter what restrictions you place on the speed, the amount of QoS, the level of VLANs, you'll always get a few bright spark students who try to circumvent the network. Hire them. (That's how I got a job working for hallsnet!)

Tom O'Connor
  • ...Is there actually a "Horny college student" button on the simulator? PLEASE tell me that's an actual traffic profile it can run :) – voretaq7 Feb 12 '13 at 16:20
  • Thanks Tom - this is a great answer. It sounds like there really isn't going to be anything I can go off to set my baseline. Whatever we build, it will be stretched :) – cstrat Feb 12 '13 at 19:39
  • It genuinely helps to seek out some CS students and get them in as "interns", and then you have some "eyes on the inside". – Tom O'Connor Feb 12 '13 at 20:11

In order to build your model you need observations from your environment. The best place to get them would be from your network's current traffic. If I were in your place I would be trying to get Netflow data from your routers for the past year (if possible), or at least a full semester.

You can determine types of traffic using flow-tools (and optionally JKFlow if you want pretty pictures).

Armed with that information you now know (a) What kind of traffic you're producing / consuming, and (b) How much of each type of traffic you're generating. You can combine this information with campus population data (number of students, faculty, staff) to figure out, roughly, how much traffic a person produces, and work out an equation for the average student/professor/staffer.
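
A minimal sketch of that per-person calculation, under stated assumptions: the observed peak (500 Mbps across roughly 3000 students) is borrowed from MDMarra's comment on the question purely as an illustration, and because that link was saturated it is a floor on demand rather than a true measurement. The site names and resident counts are hypothetical, and the 60% threshold is the load figure from the question.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str          # hypothetical site label
    residents: int     # number of students housed at the site
    link_mbps: float   # WAN link size to the site


def per_resident_peak_mbps(observed_peak_mbps: float, observed_residents: int) -> float:
    """Average peak-hour traffic per resident, derived from observed totals."""
    return observed_peak_mbps / observed_residents


def projected_utilisation(site: Site, per_resident_mbps: float) -> float:
    """Projected peak load on a site's link as a fraction of its capacity."""
    return site.residents * per_resident_mbps / site.link_mbps


# Illustrative input only; replace with figures from your own flow data.
per_resident = per_resident_peak_mbps(500.0, 3000)

sites = [
    Site("Hall A", residents=400, link_mbps=50.0),
    Site("Hall B", residents=250, link_mbps=100.0),
]

threshold = 0.60
over_threshold = []
for site in sites:
    load = projected_utilisation(site, per_resident)
    if load > threshold:
        over_threshold.append(site.name)
    print(f"{site.name}: projected peak load {load:.0%} of {site.link_mbps:.0f} Mbps")
```

Splitting `per_resident` by traffic type (the breakdown you get from the flow data) turns this into one term per application class, which is where the shaping and prioritisation decisions start to show up in the numbers.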


How detailed you make the model is up to you, and semi-dependent on your network architecture. For example, if your dorms are contained to a specific subnet you can model dorm traffic separately.

Going further, you can model specific dorms, and with the help of university administration telling you how many students in each dorm are in a given major, even correlate that data to a limited extent.


The Netflow traffic data is also a very useful monitoring tool - If you aren't already collecting it, you should be. It will be interesting (at a minimum), and helpful (when stuff goes wonky on the network and you need to figure out why).

voretaq7
  • Development of the actual model is, unfortunately, left as an exercise for the reader -- the variables of interest (and the constants derived from the traffic data) will be intensely site-specific. You should be charging your customer a small fortune for this kind of predictive analysis. Some references of interest: http://www.cs.cornell.edu/home/kleinber/networks-book/networks-book-ch08.pdf ; http://www.mathworks.com/communications/traffic-modeling-network-performance.html (if you're a MATLAB guy) – voretaq7 Feb 12 '13 at 03:44
  • Thank you for the post, I did think algorithm was not the correct term - but if you replace 'algorithm' with 'model' in the title it reads funny... Unfortunately there is no current solution in place that we can learn from. There is an incumbent service provider - but they don't provide this same service, and what they provide is not going to be comparable in any way that will help. – cstrat Feb 12 '13 at 03:51
  • @cstrat They have to have *some* kind of traffic log (the granularity may be poor, but at the very least they have to know total bandwidth utilization versus total link capacity - otherwise they can't bill properly). You can start with a dirty model that's just raw bandwidth with monthly granularity, and assuming you get the contract to replace them you can implement more granular reporting and a better model. I did this in a past job for capacity planning, and it's very much a black art. Even with the model there was a lot of manual adjusting based on experience/intuition. – voretaq7 Feb 12 '13 at 03:54
  • They provide individual services to students who elect to pay for the service, and they are only providing a wireless solution, which doesn't deliver the level of service we are looking to bring. Since we are looking to aggregate services, I need to know how the model will stand up with many users. I know where you are heading with the suggestion - and I have already put some questions out to people regarding that particular demographic and internet usage. – cstrat Feb 12 '13 at 04:03