3

I have a string like "0189" and need to generate all of its subsequences while keeping the original order of the characters, i.e., 9 must not come before 0, 1 or 8. Examples: 0, 018, 01, 09, 0189, 18, 19, 019, etc.

Another example is "10292", for which the subsequences would be: 1, 10, 02, 02, 09, 29, 92, etc. As you might have noticed, '02' appears twice because '2' occurs twice in the given string. Again, results like 21, 01 or 91 are invalid because the order must be maintained.

Any algorithm or pseudocode that could be implemented in C/C++ would be appreciated!

fancyPants
kunal18
  • Which should it be: C or C++? – Konrad Rudolph Aug 02 '12 at 08:57
  • 9
    A kitten dies every time someone says "C/C++". – Philip Aug 02 '12 at 08:57
  • Since he's asking for algorithms or pseudocode 'C/C++' meaning C or C++ is reasonable enough. – john Aug 02 '12 at 08:58
  • sort order is unclear in OP, 018 before 01? – bph Aug 02 '12 at 08:58
  • 1
    Sort order refers to the individual characters I think, i.e. '01' is ok but '10' is not. – john Aug 02 '12 at 08:59
  • It's just enumerating through the power set isn't it. For a string of size four you have 2^4 (i.e. 16) possibilities. Simple recursive function should do it. – john Aug 02 '12 at 09:02
  • John is right! 01 is OK but not 10. "Sort order refers to individual characters" – kunal18 Aug 02 '12 at 09:02
  • Actually it's not sorting. If the given number is 0189, then in any subsequence '9' cannot come before '0', '1' and '8'. Similarly, 8 cannot come before 0 and 1, but 89 is fine. – kunal18 Aug 02 '12 at 09:04
  • http://stackoverflow.com/questions/9252680/generating-the-power-set-of-a-list possibly related. – ecatmur Aug 02 '12 at 09:07
  • @John the given number could also be '011889', are you sure powerset algo will do? And by the way, what you have to do with my name :)? – kunal18 Aug 02 '12 at 09:07
  • @StalinSubramaniam: I tried to clarify your question, hope I got it right, if not, please undo my edit. – Doc Brown Aug 02 '12 at 09:08
  • @StalinSubramaniam: Given that input, a power set algorithm would result in duplicates. So maybe remove them afterwards? – john Aug 02 '12 at 09:10
  • @John, well I need to count those duplicates too. Thanx :)! – kunal18 Aug 02 '12 at 09:18
  • @StalinSubramaniam, is the input guaranteed to already be sorted? You do allow repeated characters in your input. Both these points should be clarified in the question. And also, do you want to allow duplicates in the output? – Aaron McDaid Aug 02 '12 at 10:28
  • @AaronMcDaid: I have edited my question to include one more example. This should clarify your doubts! – kunal18 Aug 02 '12 at 11:04

4 Answers

8

Try a recursive approach:

  • the set of subsequences can be split into those that contain the first character and those that do not
  • the ones that contain the first character are built by prepending that character to each subsequence that does not contain it (plus the subsequence consisting of the first character alone)
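
The two steps above can be sketched in C++ roughly as follows. This is only an illustration, not the answerer's code: the function name `subsequences` and its `pos` parameter are hypothetical, and the sketch assumes the empty subsequence is kept in the result (which makes the single-character case fall out of the recursion without special handling).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns every subsequence of s[pos..], including the
// empty one. Characters are chosen by position, so duplicates such as the
// two "02" results for "10292" are kept.
std::vector<std::string> subsequences(const std::string& s, std::size_t pos = 0) {
    if (pos == s.size()) {
        return {""};  // nothing left: only the empty subsequence
    }
    // Subsequences that do not contain s[pos].
    std::vector<std::string> without = subsequences(s, pos + 1);
    std::vector<std::string> all = without;
    // Subsequences that do contain s[pos]: put it in front of each of the others.
    for (const std::string& tail : without) {
        all.push_back(s[pos] + tail);
    }
    return all;
}
```

Calling `subsequences("0189")` yields 16 strings, the first of which is the empty one; skip it if only non-empty subsequences are wanted. Because the result list doubles with every character, this is only practical for short inputs.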
Doc Brown
6

I'd recommend using the natural correspondence between the power set of a sequence and the set of binary numbers from 0 to 2^n - 1, where n is the length of the sequence.

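In your case, n is 4, so consider the numbers 0 = 0000 through 15 = 1111. Wherever there is a 1 in the binary representation, include the corresponding item from the sequence (0 itself gives the empty subsequence). To implement this you'll need bit shifts and bitwise operations:

// Assumes: std::string sequence; int n = sequence.size(); std::vector<std::string> result;
// i = 0 yields the empty subsequence; start at 1 to skip it.
for (int i = 0; i < (1 << n); ++i) {
    std::string item;
    for (int j = 0; j < n; ++j) {
        if (i & (1 << j)) {   // bit j set: keep sequence[j]
            item += sequence[j];
        }
    }
    result.push_back(item);
}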

Also consider how you'd handle sequences longer than can be covered by an int (hint: consider overflow and arithmetic carry).
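
One way to read the hint, sketched here under my own assumptions rather than as the answerer's intended solution: replace the integer counter with a `std::vector<bool>` and perform the increment by hand, propagating the carry from bit to bit. The names `next_mask` and `subsequences_by_mask` are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: advances the mask like a binary counter, with mask[0]
// as the least significant bit. Returns false once the counter wraps around
// to all zeros, i.e. after the all-ones mask has been visited.
bool next_mask(std::vector<bool>& mask) {
    for (std::size_t j = 0; j < mask.size(); ++j) {
        if (!mask[j]) {
            mask[j] = true;   // 0 -> 1: the carry is absorbed here
            return true;
        }
        mask[j] = false;      // 1 -> 0: the carry moves to the next bit
    }
    return false;
}

// Enumerates subsequences in the same order as the integer loop, without
// depending on the width of int.
std::vector<std::string> subsequences_by_mask(const std::string& sequence) {
    std::vector<std::string> result;
    std::vector<bool> mask(sequence.size(), false);
    do {
        std::string item;
        for (std::size_t j = 0; j < sequence.size(); ++j) {
            if (mask[j]) item += sequence[j];
        }
        result.push_back(item);
    } while (next_mask(mask));
    return result;
}
```

The counter no longer overflows, but the number of subsequences still doubles with each character, so for long inputs it is more realistic to process each item as it is produced than to store them all in `result`.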

ecatmur
  • this will generate all subsequences. What if one needs to generate subsequences UPTO specific length(say, 4=> subseq of length 4,3,2,1). – kunal18 Aug 03 '12 at 20:50
  • In that case a recursive (or recursion-based) solution would be appropriate. – ecatmur Aug 05 '12 at 20:03
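
A recursive solution with a length limit, as suggested in the comment above, might look like the following sketch. The names `collect` and `subsequences_up_to` and the `max_len` parameter are hypothetical, and the sketch assumes only non-empty subsequences are wanted.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: extends `current` with characters taken from
// s[start..] and records every non-empty subsequence of at most max_len
// characters. Recursion stops as soon as the limit is reached, so longer
// subsequences are never built.
void collect(const std::string& s, std::size_t start, std::size_t max_len,
             std::string& current, std::vector<std::string>& out) {
    if (current.size() == max_len) return;
    for (std::size_t i = start; i < s.size(); ++i) {
        current.push_back(s[i]);                   // choose s[i]
        out.push_back(current);
        collect(s, i + 1, max_len, current, out);  // only later positions may follow
        current.pop_back();                        // undo the choice
    }
}

std::vector<std::string> subsequences_up_to(const std::string& s, std::size_t max_len) {
    std::vector<std::string> out;
    std::string current;
    collect(s, 0, max_len, current, out);
    return out;
}
```

For example, `subsequences_up_to("0189", 2)` produces the 4 single characters and the 6 ordered pairs, 10 strings in total. Characters are picked by position, so repeated characters in the input still give repeated results.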
1

In Python:

In [28]: from itertools import combinations as combs

In [29]: def subseq(s): return ' '.join((' '.join(''.join(x) for x in combs(s,n)) for n in range(1, len(s)+1)))

In [30]: subseq("0189")
Out[30]: '0 1 8 9 01 08 09 18 19 89 018 019 089 189 0189'

In [31]: subseq("10292")
Out[31]: '1 0 2 9 2 10 12 19 12 02 09 02 29 22 92 102 109 102 129 122 192 029 022 092 292 1029 1022 1092 1292 0292 10292'

Paddy3118
0
__author__ = 'Robert'
from itertools import combinations

g = combinations(range(4), r=2)
print(list(g)) #[(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]

def solve(string_):
    n = len(string_)
    for repeat in range(1, n + 1):
        # index tuples come out in increasing order, so character order is preserved
        combos = combinations(range(n), r=repeat)
        for combo in combos:
            sub_string = "".join(string_[i] for i in combo)
            yield sub_string

print(list(solve('0189'))) #['0', '1', '8', '9', '01', '08', '09', '18', '19', '89', '018', '019', '089', '189', '0189']


#using recursion

def solve2(string_, i):
    if i >= len(string_):
        return [""] #no sub_strings beyond length of string_
    character_i = string_[i]
    all_sub_strings = solve2(string_, i + 1)
    all_sub_strings += [character_i + sub_string for sub_string in all_sub_strings]
    return all_sub_strings


print(solve2('0189', 0)) #['', '9', '8', '89', '1', '19', '18', '189', '0', '09', '08', '089', '01', '019', '018', '0189']
Rusty Rob