I'm currently writing code for an assignment that deals with cities and bridges. I have to print the cities and bridges grouped by their respective districts, for example:

//unorganized inputs from user given the # of "paths" we need
4       // the # of paths
1 2 5  // 1 = city , 2 = city, 5 = bridge length
6 7 5  // 6 = city , 7 = city, 5 = bridge length
2 3 7  // 2 = city , 3 = city, 7 = bridge length
6 9 7  // 6 = city , 9 = city, 7 = bridge length

After running through the program, the output will be sorted as:

first district
1 2 5
2 3 7

2nd district
6 7 5
6 9 7

Now, I'll be reading these inputs through cin. I want to store all the possible paths such as 1 2 5 into an array and then sort and organize them through the program. The problem is that I may have over 500,000 paths from the user. I want to create 500k dynamic arrays. Will this cause serious problems in terms of memory?

I have looked at other possible ways of solving this, such as Kruskal's algorithm and disjoint sets (which I think is the most useful). I'm having a very hard time understanding how to code disjoint sets, so I figured I'd try a way I'm more familiar with.

Any help with where to store the values and how to compare and organize them would be great. Links to places where I can read more about this would help. I've read a lot over the past few days, but it hasn't helped much.

To sum it all up, my questions are:

  • Will 500k dynamic arrays cause serious problems in terms of memory?
  • Where to store the values and compare and organize them given the paths?
Christian Rau
Chris
  • "The problem is that I may have over 500,000 paths from the user.", do you imply that you want the user to input 500k paths through the console? – SingerOfTheFall Nov 12 '12 at 06:17
  • This will probably be through a file. – Chris Nov 12 '12 at 06:22
  • @SingerOfTheFall: It is most likely that the tutor will use something like `cat problem_instance1 | user_written_program`. – Zeta Nov 12 '12 at 06:24
  • @Zeta, I hoped for it, but thought I'd better ask to be sure ;) – SingerOfTheFall Nov 12 '12 at 06:30
  • Why use dynamic arrays when you have `std::vector`? – Some programmer dude Nov 12 '12 at 07:03
  • I'm not allowed to use any STL. I forgot to mention it up there. – Chris Nov 12 '12 at 07:04
  • Am I missing something here or do your 500k records all have the same 3 element structure? This would of course eliminate the need for any dynamic arrays for those in contrast to a simple 3-member struct. – Christian Rau Nov 12 '12 at 08:27
  • @ChristianRau You're correct. The inputs will only be 3 numbers in length. I've used the struct, but I think my lack of knowledge of OOP and inheritance is limiting my ability to implement it effectively. Any resources you recommend? – Chris Nov 12 '12 at 08:45

3 Answers

Will 500k dynamic arrays cause serious problems in terms of memory?

No problem there, assuming each is merely an array of 3 ints. Typically, you would avoid doing this as separate allocations because it is excessive -- it will be a bit slow and the bookkeeping required will consume a fair amount of memory too. There's a better approach:

Where to store the values and compare and organize them given the paths?

I'd start with a struct/class which holds those 3 fields, then use a std::vector of those. This will store all your values as one contiguous allocation. Very fast to create, search and allocate in comparison.

justin
  • Oops, forgot to mention, I'm not allowed to use anything in STL. Everything has to be coded by myself. – Chris Nov 12 '12 at 06:29
  • @Chris then you can use an array of those 500k records (or however many you will need) to accomplish the equivalent of `vector`. just avoid allocating it on the stack. – justin Nov 12 '12 at 06:35
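
The "one array of records" idea from the comment above could look roughly like the sketch below, which uses no STL containers. The names `Bridge`, `make_bridges`, and `read_bridges` are made up for illustration and are not from the assignment; the sketch assumes the number of paths is known before the records are read, as in the sample input.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>

// One bridge between two cities; the names are illustrative.
struct Bridge {
    int city_a;
    int city_b;
    int length;
};

// Allocates all records in a single heap block (zero-initialized),
// instead of one small allocation per record.
// The caller releases the block with delete[].
Bridge* make_bridges(std::size_t count) {
    return new Bridge[count]();
}

// Reads up to `count` "city city length" triples from `in`.
// Returns how many complete records were read.
std::size_t read_bridges(std::istream& in, Bridge* bridges, std::size_t count) {
    assert(bridges != nullptr || count == 0);
    std::size_t read = 0;
    while (read < count &&
           in >> bridges[read].city_a
              >> bridges[read].city_b
              >> bridges[read].length) {
        ++read;
    }
    return read;
}
```

A caller would read the path count from `std::cin`, pass it to `make_bridges`, fill the array with `read_bridges(std::cin, bridges, count)`, and call `delete[] bridges` when done. Because the array lives on the heap, it does not run into the stack-size limit mentioned above.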

In general, assuming that you have 2 GB of memory for your app, 500K records of 12 bytes each (three 32-bit values) add up to about 6 MB, which will not be a problem.
If you wish to reduce your data set size, you can, for example, use a data format like:

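struct Bridge {
   unsigned short city_a;  // valid only if city ids fit in an unsigned short
   unsigned short city_b;
   unsigned char length;   // unsigned, since plain char may be signed
};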


Look at the size of the city set (the number of cities) and the maximum length between two cities first, since they determine how narrow each field can safely be.
Also, things like indexing city pairs (A-B becomes Pair_ID) can reduce the data set as well.
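
As a rough illustration of the trade-off, the sketch below compares a record of three `int`s with a narrower one and checks whether a value would survive the narrowing. `WideBridge`, `CompactBridge`, `fits_compact`, and `to_compact` are hypothetical names; the actual sizes depend on the compiler and its padding rules, so the code compares them rather than assuming specific byte counts.

```cpp
#include <cassert>
#include <climits>

// Straightforward record: three ints.
struct WideBridge {
    int city_a;
    int city_b;
    int length;
};

// Narrower record; only valid when the values fit the smaller types.
struct CompactBridge {
    unsigned short city_a;
    unsigned short city_b;
    unsigned char length;
};

// Returns true when all three values fit the compact fields unchanged.
bool fits_compact(int city_a, int city_b, int length) {
    return city_a >= 0 && city_a <= USHRT_MAX &&
           city_b >= 0 && city_b <= USHRT_MAX &&
           length >= 0 && length <= UCHAR_MAX;
}

// Converts one record; the caller is expected to check fits_compact first.
CompactBridge to_compact(int city_a, int city_b, int length) {
    assert(fits_compact(city_a, city_b, length));
    CompactBridge b;
    b.city_a = static_cast<unsigned short>(city_a);
    b.city_b = static_cast<unsigned short>(city_b);
    b.length = static_cast<unsigned char>(length);
    return b;
}
```

If any input fails `fits_compact`, the wider record is the safer choice, since a silently truncated city id would put a bridge in the wrong district.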

Michael Shmalko

This may not be directly related to your question, but I think what you are trying to find are the connected components of a graph: http://en.wikipedia.org/wiki/Connected_component_(graph_theory). If you model your graph as an adjacency matrix, you need not allocate 500k dynamic arrays. Consider the following format for storing your data:

int city_map [MAX_NO_OF_CITIES][MAX_NO_OF_CITIES];

city_map[i][j] = length_of_bridge_connecting_city_i_to_j;

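With this layout the memory cost depends on the number of cities, not the number of bridges: the matrix takes MAX_NO_OF_CITIES * MAX_NO_OF_CITIES * sizeof(int) bytes. That is about 1 MB for 500 cities with 4-byte ints, but it grows quadratically, so it only suits inputs with a fairly small number of cities.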
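
Another way to group cities into connected components is the disjoint-set (union-find) structure that the question mentions. The sketch below is a minimal version, assuming city ids are integers in the range 0 to n-1; the name `DisjointSet` and its members are illustrative. It uses path halving in `find` and omits union by rank for brevity.

```cpp
#include <cassert>

// Minimal disjoint-set over city ids 0..size-1 (illustrative sketch).
struct DisjointSet {
    int* parent;
    int size;

    explicit DisjointSet(int n) : parent(new int[n]), size(n) {
        // Every city starts in its own district.
        for (int i = 0; i < n; ++i) parent[i] = i;
    }
    ~DisjointSet() { delete[] parent; }

    // Copying would double-free the array, so it is disabled.
    DisjointSet(const DisjointSet&) = delete;
    DisjointSet& operator=(const DisjointSet&) = delete;

    // Returns the representative city of x's district.
    int find(int x) {
        assert(x >= 0 && x < size);
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];  // path halving
            x = parent[x];
        }
        return x;
    }

    // Merges the districts of a and b (no effect if already merged).
    void unite(int a, int b) {
        int ra = find(a);
        int rb = find(b);
        if (ra != rb) parent[rb] = ra;
    }
};
```

Calling `unite(city_a, city_b)` once per bridge and then comparing `find` results tells you whether two cities belong to the same district. For the sample input, cities 1, 2, 3 end up with one representative and cities 6, 7, 9 with another.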

adi