
Say I have two streams:

Stream 1: [1,3],[2,4]
Stream 2: [2,5],[3,2]

A regular merge would produce a Stream 3, like this:

[1,3],[2,4],[2,5],[3,2]

I would like to merge the streams while preserving the order in which the tuples were emitted. So if [2,5] was emitted at time 1, [1,3] at time 2, [3,2] at time 3 and [2,4] at time 4, the resulting stream would be:

[2,5],[1,3],[3,2],[2,4]

Is there any way to do this and, if so, how? Some sample code would be appreciated, as I'm a complete Trident rookie who has recently been thrust into a Trident-based project.

Thanks in advance for your help,

Eli

skjcyber
E Shindler

1 Answer


You will need an external data store, accessed through Trident's persistent state support. A Redis sorted set should serve your purpose, I think.

MORE INFO

If you go through the Trident tutorial (https://github.com/nathanmarz/storm/wiki/Trident-tutorial), you can see how Memcached is used as the store for word counts.

Similarly, you can write the stream out to Redis (if you are not familiar with Redis sorted sets, see http://redis.io/commands#sorted_set). I think a Redis sorted set, with the emit time as the score, will serve your purpose.
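To show the idea without needing a Redis server, here is a minimal plain-Java sketch. It is not Trident or Redis code: the class `SortedSetSketch` and its methods `add` and `inOrder` are hypothetical names, and a `ConcurrentSkipListMap` stands in for the sorted set, with the emit time as the score. It also assumes each tuple carries its emit time, which your streams would have to provide.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical in-memory stand-in for a Redis sorted set.
// Key = score (the emit time), value = tuples emitted at that time.
class SortedSetSketch {
    private final ConcurrentSkipListMap<Long, List<String>> byScore =
            new ConcurrentSkipListMap<>();

    // Rough analogue of "ZADD key score member".
    // Ties keep insertion order here; Redis orders equal-score members
    // lexicographically and stores each member only once.
    void add(long emitTime, String tuple) {
        byScore.computeIfAbsent(emitTime, t -> new CopyOnWriteArrayList<>())
               .add(tuple);
    }

    // Rough analogue of "ZRANGE key 0 -1": all members, lowest score first.
    List<String> inOrder() {
        List<String> out = new ArrayList<>();
        for (List<String> members : byScore.values()) {
            out.addAll(members);
        }
        return out;
    }

    public static void main(String[] args) {
        SortedSetSketch merged = new SortedSetSketch();
        // Stream 1, with the emit times from the question
        merged.add(2, "[1,3]");
        merged.add(4, "[2,4]");
        // Stream 2
        merged.add(1, "[2,5]");
        merged.add(3, "[3,2]");
        System.out.println(merged.inOrder());
    }
}
```

Both streams write into the same structure in whatever order they arrive, and reading it back yields [2,5], [1,3], [3,2], [2,4], the order asked for in the question. In a real topology the writes would go through a Trident state backed by Redis rather than an in-memory map.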

If you want persistent storage for your data, you could use another NoSQL solution such as MongoDB and index the final data on your time field. That will readily give you the sort functionality you want. What's more, someone has already written a MongoDB integration for Trident: https://github.com/sjoerdmulder/trident-mongodb.

Let me know if you are still confused, and about what.

Global Warrior
  • Can you elaborate a bit, with a bit of sample code. I am a real rookie. Thanks very much – E Shindler Nov 06 '13 at 20:51
  • I've currently got it working using MySQL as my external data source. Eventually I'll have to use something a bit more sophisticated, but it will do for now. Thanks very much for your help. – E Shindler Nov 07 '13 at 11:52