uniform distribution of a set of characters

Question

There is a set of characters that needs to be uniformly distributed in an array. For example,

a - 1

b - 2

c - 3

d - 4

In this case there are one a, two b's, three c's, and four d's, for a total of 10 characters.

Now I need to distribute them in an array of size 10 so that all of them are evenly distributed. They don't have to be exactly uniformly distributed; anything close will do.

For example, this is a valid sequence:

d c d b c d a d c b

Cool. Do you have a question? Or at the very least tell what programming language you're working with? — JJJ, Dec 16 '15 at 21:45
I wrote a blog entry about something similar a while back. See [Evenly distributing items in a list](http://blog.mischel.com/2015/03/26/evenly-distributing-items-in-a-list/). The code is in C#, but the concepts should translate well. See also http://cs.stackexchange.com/questions/29709/algorithm-to-distribute-items-evenly/40773#40773 — Jim Mischel, Dec 16 '15 at 22:02
I need an algorithm which when given a set of characters outputs the evenly distributed sequence. — mohit, Dec 16 '15 at 22:04

score 1 · Accepted Answer · answered Dec 16 '15 at 22:20

You could use something similar to Bresenham's algorithm to track, for each character, the error between its ideal spacing and its most recent spacing:

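vals = ['a', 'b', 'c', 'd']
cts = [1, 2, 3, 4]

sz = sum(cts)
# Ideal gap for each character: ct copies split the array into ct+1 stretches.
spacing = [float(sz) / (ct + 1) for ct in cts]
err = list(spacing)  # slots remaining until each character is next due
a = []
for _ in range(sz):
    err = [e - 1 for e in err]  # one slot has passed for every character
    j = err.index(min(err))     # pick the most overdue character (first on ties)
    a.append(vals[j])
    err[j] += spacing[j]        # schedule its next occurrence
print(a)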

yields: ['d', 'c', 'b', 'd', 'a', 'c', 'd', 'b', 'c', 'd']
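One rough way to judge how even such a result is (a sketch, not part of the original answer) is to list the distances between consecutive occurrences of each character. The helper name `gaps` is made up for this illustration.

```python
from collections import defaultdict


def gaps(seq):
    """Map each item to the distances between its consecutive occurrences."""
    positions = defaultdict(list)
    for idx, item in enumerate(seq):
        positions[item].append(idx)
    return {
        item: [b - a for a, b in zip(pos, pos[1:])]
        for item, pos in positions.items()
    }


answer_output = ['d', 'c', 'b', 'd', 'a', 'c', 'd', 'b', 'c', 'd']
question_example = "d c d b c d a d c b".split()

print(gaps(answer_output))     # gaps: d [3, 3, 3], c [4, 3], b [5], a []
print(gaps(question_example))  # gaps: d [2, 3, 2], c [3, 4], b [6], a []
```

Gap lists whose entries are close to one another indicate an even spread for that character.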

AShelly


Interesting that you mentioned this. I've been toying with the idea of using the Bresenham approach for this, as it might be simpler and faster than the heap-based approach I used. – Jim Mischel Dec 17 '15 at 16:25
Although the Python implementation here is not the most efficient, the algorithm is super simple and should run in O(M×N), where M is the number of unique characters and N is the array size. – AShelly Dec 17 '15 at 19:44
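The per-step cost in the loop above comes from rescanning every error value. A possible way to cut that down, offered only as a sketch and not as the heap-based code from the blog post mentioned earlier: since every error drops by the same amount at each step, the subtraction can be skipped and the pending values kept in a priority queue, which gives roughly O(N log M). The function name `spread` is an assumption for illustration.

```python
import heapq


def spread(vals, cts):
    """Emit sum(cts) items, always taking the one with the smallest due value."""
    sz = sum(cts)
    spacing = [sz / (ct + 1) for ct in cts]
    # Heap entries are (due value, index); exact ties go to the lower index,
    # like err.index(min(err)) in the list-based version.
    heap = [(s, j) for j, s in enumerate(spacing)]
    heapq.heapify(heap)
    out = []
    for _ in range(sz):
        due, j = heapq.heappop(heap)
        out.append(vals[j])
        heapq.heappush(heap, (due + spacing[j], j))
    return out


print(spread(['a', 'b', 'c', 'd'], [1, 2, 3, 4]))
# ['d', 'c', 'b', 'd', 'a', 'c', 'd', 'b', 'c', 'd']
```

For the example input this gives the same sequence as the list-based loop; with other inputs, floating-point rounding could order near-ties differently.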

Mooing Duck · Answer 2 · 2015-12-17T18:09:54.287

First, estimate where each instance of a letter would go, considering only one letter at a time. If there are 10 slots in total and 3 a's, try to place the a's at indices 0, 3, and 7. Calculate these estimated indices for each letter, and put them in an ordered multiset (a `std::multimap` keyed by index in the code below).

std::multimap<unsigned,char> set;
const unsigned totalcount = ... //for your example this would be 10
for (const auto& letterpair : letters) {
    unsigned lettercount = letterpair.second; //for c this would be 3
    for(unsigned i=0; i<lettercount; ++i) {
        unsigned targetIdx = (2*i*totalcount+1)/lettercount;
        set.insert(std::make_pair(targetIdx, letterpair.first));
    }
}

Then we can simply iterate over the set in order and append each letter to the result, so every letter ends up at its own index.

std::vector<char> result;
for(const auto& p : set)
    result.push_back(p.second); //insert the letters in order

It's not perfect, but it works pretty darn well considering the speed and simplicity.

For your inputs, it results in bcdadcbdcd: http://coliru.stacked-crooked.com/a/1f83ae4518b7c6ca
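For comparison with the Python answer, the same idea can be sketched in Python. This is a rough port, not the author's code: it assumes `letters` is visited in alphabetical order (as a `std::map<char, unsigned>` would be; the container is not shown above), and it relies on Python's stable sort to mimic how `std::multimap` keeps equal keys in insertion order. The function name `distribute` is hypothetical.

```python
def distribute(letters):
    """letters: list of (char, count) pairs, in the order they should be visited."""
    total = sum(count for _, count in letters)
    targets = []
    for ch, count in letters:
        for i in range(count):
            # Same integer formula as targetIdx in the C++ snippet.
            targets.append(((2 * i * total + 1) // count, ch))
    # list.sort() is stable, so pairs with equal targets keep insertion order.
    targets.sort(key=lambda pair: pair[0])
    return "".join(ch for _, ch in targets)


print(distribute([('a', 1), ('b', 2), ('c', 3), ('d', 4)]))  # bcdadcbdcd
```

Sorting only on the target index, rather than on the whole pair, is what keeps the tie-breaking tied to visiting order instead of to the characters themselves.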

edited Dec 17 '15 at 18:09

answered Dec 16 '15 at 21:57

Mooing Duck


The guessing game becomes seriously difficult as the number of items increases. See http://blog.mischel.com/2015/03/26/evenly-distributing-items-in-a-list/ for my solution to the problem. – Jim Mischel Dec 16 '15 at 22:04
@JimMischel: at least for C++ `std::multiset`, I'm pretty sure the algorithm I describe above comes up with the same results as your C# code. They work on the same basic concepts; it's just that the approaches are 100% different. Mine takes more memory and is probably a touch slower, but requires less code, I think. – Mooing Duck Dec 16 '15 at 23:24
So this in effect builds lists of letters at particular indices, and then outputs those lists, in order, to the result? – Jim Mischel Dec 17 '15 at 16:31
@JimMischel: Hm, not quite as effective as I'd imagined: http://coliru.stacked-crooked.com/a/46292ffb67efd327. With a slight tweak to use the center of each subrange, it works much better: http://coliru.stacked-crooked.com/a/1f83ae4518b7c6ca – Mooing Duck Dec 17 '15 at 18:08
