
What should happen with the final exclusive scan value in a stream compaction algorithm?

This is an example to pick out all the 'A' characters.

Sequence A:

Input:       A B B A A B B A
Selection:   1 0 0 1 1 0 0 1
Scan:        0 1 1 1 2 3 3 3

0 - A
1 - A
2 - A
3 - A

Sequence B (same except the last value):

Input:       A B B A A B B B
Selection:   1 0 0 1 1 0 0 0
Scan:        0 1 1 1 2 3 3 3

0 - A
1 - A
2 - A
3 - B

Clearly the second example gives the wrong final result if I naively loop through the scan values and write each input element to the address given by the scan.

What am I missing here?

Update:

As I understand the scan algorithm, I would do the equivalent of the following:

for (int i = 0; i < scan.length(); i++)
{
    result[scan[i]] = input[i];
}

In parallel this would involve a scatter instruction.

Dan

1 Answer


After an A, you are assuming that there will be at least one more A. In effect, you assume that the sequence ends with an A. If it doesn't, you pick the wrong final letter.

You just need to count the As. Don't start with 1. Start with 0. Only increase this count when you find an A.
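A minimal sequential sketch of that counting idea, assuming the input is a string and the output buffer is sized for the worst case. The function name `compact_by_counting` and the `keep` parameter are made up for illustration and do not come from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: keeps only the characters equal to `keep`.
std::string compact_by_counting(const std::string& input, char keep) {
    std::string result(input.size(), '\0');  // worst case: every element is kept
    std::size_t count = 0;                   // start at 0, not 1
    for (char c : input) {
        if (c == keep) {
            result[count] = c;  // write to the next free output slot
            ++count;            // advance only when an element is kept
        }
    }
    assert(count <= input.size());
    result.resize(count);       // count is the number of kept elements
    return result;
}
```

For the two sequences in the question this should return "AAAA" and "AAA" respectively; nothing is written for the trailing B because the write only happens when the element matches.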

Or... Update:

Input:       A B B A A B B A
Selection:   1 0 0 1 1 0 0 1
Scan:        0 1 1 1 2 3 3 3 4
                             ^
0 - A                        |
1 - A                        Four elements
2 - A
3 - A


Input:       A B B A A B B B
Selection:   1 0 0 1 1 0 0 0
Scan:        0 1 1 1 2 3 3 3 3
                             ^
0 - A                        |
1 - A                        Three elements
2 - A
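
The diagrams above can be sketched in code as follows. This is only an illustration under some assumptions: the scan is computed sequentially here (a parallel scan would replace that loop), and the names `scan_with_total` and `compact_with_scan` are hypothetical. The two points it tries to show are that the scan carries one extra trailing entry holding the total, and that the scatter is guarded by the selection flag so that unselected elements are never written.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Exclusive prefix sum with one extra trailing entry:
// scan[i] = number of selected elements before index i,
// scan[n] = total number of selected elements
//         = scan[n-1] + selection[n-1] for a non-empty input.
std::vector<std::size_t> scan_with_total(const std::vector<int>& selection) {
    std::vector<std::size_t> scan(selection.size() + 1, 0);
    for (std::size_t i = 0; i < selection.size(); ++i) {
        scan[i + 1] = scan[i] + (selection[i] != 0 ? 1 : 0);
    }
    return scan;
}

std::string compact_with_scan(const std::string& input, char keep) {
    // Build the 0/1 selection flags.
    std::vector<int> selection(input.size());
    for (std::size_t i = 0; i < input.size(); ++i) {
        selection[i] = (input[i] == keep) ? 1 : 0;
    }

    const std::vector<std::size_t> scan = scan_with_total(selection);
    assert(scan.size() == input.size() + 1);

    // The trailing scan entry gives the output size.
    std::string result(scan.back(), '\0');

    // Guarded scatter: iterations are independent, and selected elements
    // get distinct addresses, so this loop maps onto a parallel scatter.
    for (std::size_t i = 0; i < input.size(); ++i) {
        if (selection[i] != 0) {
            result[scan[i]] = input[i];
        }
    }
    return result;
}
```

With the guard in place, the trailing B of the second sequence is skipped even though its scan value is 3, and the output has exactly `scan.back()` elements.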
comocomocomocomo
  • I'm not really sure I get this answer. Are you suggesting I also need to store the total count as I build the selection from the list? That's tricky to do efficiently in parallel without introducing a reduction phase for the selection, which seems like a big overhead considering it's just one final piece of data I'm after. I expected to be able to use only the scan once I've built it, and to discard the input and selection. – Dan Feb 06 '13 at 12:08
  • How do you get to write the first 3? – comocomocomocomo Feb 06 '13 at 12:30
  • Does the first 3 mean that there are 3 As or does it mean that there are 4? – comocomocomocomo Feb 06 '13 at 12:32
  • Sure, you're just suggesting I store a total as final_scan_value + (final_selection_value == 1), which does tell you what you need to know. I'm not sure you think the algorithm does the same thing as I think it does, though. I'll update the question. – Dan Feb 06 '13 at 13:24
  • As in my update, I would be required to read this value and use it to limit the results of the scan, rather than just processing the scan directly. – Dan Feb 06 '13 at 13:28
  • "I'm not sure you think the algorithm does the same thing as I think it does though?" Maybe I didn't really understand your question. What language are you using? Can you detail your algorithm? Do you have the problem in the code you added, or in another implementation? Maybe parallel? – comocomocomocomo Feb 06 '13 at 14:24