Data Structures in purely Functional paradigm

Question

I have some background in algorithms and data structures. I've also spent some time writing programs in object-oriented and procedural styles (using C, C++, Java, etc.), but the functional way of thinking is quite new to me.

Almost every program uses classic data structures such as Array (a contiguous block of memory), List (chunks of memory connected by pointers), Set (based on a hash table or a tree-like structure), and Map (based on a hash table or a tree-like structure).

As far as I can tell, a purely functional environment offers only 3 of these 6 classic data structures: List, Set (tree-based), and Map (tree-based). I know that some functional languages do have a mutable Array, and maybe even hash-table-based Set and Map, but I'm asking about the purely functional approach.

Maybe the lack of a hash-based Set/Map is not very noticeable, but without my good old mutable array I feel quite uncomfortable. Consider an example:

Suppose I have a list of pairs ((1,3) (2,2) (1,4) (2,1) (0,9) ...) where each pair is (knapsack_number, weight_of_item). Each pair describes one item: the item has to go into a particular knapsack (all knapsacks are numbered) and it has some weight. From that list I want to get a list of pairs ((0,27) (1,33) (2,18) ...), where each pair is (knapsack_number, total_weight_of_items_in_that_knapsack).

If I could use a mutable array, I could iterate over the incoming list once and produce the array of totals quite efficiently. But I don't have a mutable array in a purely functional environment. So what is the best solution I can come up with? (The best idea I have so far is to emulate an array with an immutable map that has 0, 1, 2, ... as keys. Is this the way to go when I need an array in a purely functional environment?)
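To make the single pass concrete, here is a minimal sketch in Python (chosen only for illustration). The function name `knapsack_totals` and the parameter `n_sacks` are hypothetical, and the sketch assumes the number of knapsacks is known up front. It uses a scratch buffer that never escapes the function, so callers only ever see immutable tuples; whether such hidden local mutation counts as "purely functional" depends on the language and on how strict one wants to be.

```python
from typing import Iterable, Tuple

Pair = Tuple[int, int]  # (knapsack_number, weight)


def knapsack_totals(n_sacks: int, items: Iterable[Pair]) -> Tuple[Pair, ...]:
    """Sum item weights per knapsack in one pass over `items`.

    The list `totals` is a local scratch buffer: it is mutated here but
    never returned or shared, so the function is pure from the outside.
    """
    totals = [0] * n_sacks
    for sack, weight in items:
        totals[sack] += weight
    # Freeze the result as an immutable tuple of (knapsack_number, total).
    return tuple(enumerate(totals))


# Only the five pairs that are spelled out in the question; the elided
# "..." items are not guessed, so the totals differ from ((0,27) ...).
items = ((1, 3), (2, 2), (1, 4), (2, 1), (0, 9))
result = knapsack_totals(3, items)
print(result)  # ((0, 9), (1, 7), (2, 3))
```

The pass is O(n + N) for n items and N knapsacks, which is the cost the mutable-array version would have.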

See [Efficiency of purely functional programming](http://stackoverflow.com/q/1990464/791604) for some very related discussion (perhaps even related enough to mark this as a duplicate). That said: in this particular case, even an immutable array is sufficient to match the asymptotics of the destructive algorithm. Though I believe in most languages even immutable arrays -- pure though they are -- are typically provided as a part of the language implementation rather than as a mere library. — Daniel Wagner, Oct 20 '15 at 20:08
If you were to use a mutable array, how would you know its size? And what if the knapsack numbers are sparse? Of course there are solutions, but you can avoid the whole question by simply choosing a map. — Bergi, Oct 20 '15 at 22:45
@Bergi, I'm considering the case where I know the total number of sacks ahead of time and they are numbered from 0 to N. It isn't actually a contrived example, because it came up when I tried to write a brute-force solution for the [Multiple Knapsack Problem](https://en.wikipedia.org/wiki/Knapsack_problem#Multiple_knapsack_problem) in Scheme. I'd rather not fire a cannon at sparrows if I can avoid it, and would like to use an Array where it fits better than a Map. — Yury Sidorenko, Oct 21 '15 at 06:30
For a fixed-size array, you can use a full m-ary tree, where at each level you have m-tuples of lower-level nodes, and the lowest level has the items in your array. — Matt Timmermans, Oct 28 '15 at 01:06
For an array of size N, you will need log_m(N) levels. Replacing an item means allocating a new node at each level, each of which will require km bytes for some k, so the total allocation required for a replacement is km*log_m(N) = log(N)*km/log(m). Obviously, it's going to take O(log(N)) time to update your array. Also, overhead is minimized when m/log(m) is minimized. That happens at m=e, so either 2 or 3 will be best. They're close, but 3 is a bit better, and 4 probably wins in a lot of cases when you take into account a bit of extra per-object overhead. — Matt Timmermans, Oct 28 '15 at 01:06
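
The m-ary tree described in the last two comments can be sketched roughly as follows, again in Python for illustration only. The names `make_tree`, `get`, `set_item`, and `add_weight` are hypothetical, the branching factor `M = 4` is just one of the values the comment suggests, and nested tuples stand in for the m-tuples at each level. An update copies only the nodes on the path from the root to the changed leaf, so it allocates one new M-tuple per level and leaves the old version of the array intact.

```python
from functools import reduce

M = 4  # branching factor (assumption; the comment suggests 2, 3 or 4)


def make_tree(size, fill=0):
    """Return (depth, root): a full M-ary tree with at least `size` leaves."""
    depth = 1
    while M ** depth < size:
        depth += 1
    node = fill
    for _ in range(depth):
        # Identical subtrees can be shared safely because nothing is mutated.
        node = (node,) * M
    return depth, node


def get(tree, index):
    """Read the leaf at `index` by walking one node per level."""
    depth, node = tree
    for level in range(depth - 1, -1, -1):
        node = node[(index // M ** level) % M]
    return node


def _set(node, depth, index, value):
    if depth == 0:
        return value
    slot = (index // M ** (depth - 1)) % M
    child = _set(node[slot], depth - 1, index, value)
    # Path copying: rebuild only this node, reuse the other M - 1 children.
    return node[:slot] + (child,) + node[slot + 1:]


def set_item(tree, index, value):
    """Return a new tree with `index` replaced; the old tree is unchanged."""
    depth, root = tree
    return depth, _set(root, depth, index, value)


def add_weight(tree, pair):
    sack, weight = pair
    return set_item(tree, sack, get(tree, sack) + weight)


# The five pairs spelled out in the question, folded into a 3-slot "array".
items = ((1, 3), (2, 2), (1, 4), (2, 1), (0, 9))
empty = make_tree(3)
final = reduce(add_weight, items, empty)
totals = tuple((i, get(final, i)) for i in range(3))
print(totals)  # ((0, 9), (1, 7), (2, 3))
```

Each `set_item` costs O(M * log_M(N)) time and allocation, so the whole fold is O(n log N) for n items, compared with O(n) for a mutable array. Because `empty` is never modified, every intermediate version stays valid, which is handy for a brute-force search that backtracks.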
