I wrote a workflow distribution platform that is primarily used to extract text from different file types. It processes a file and then recurses over every embedded item in that file from which text can be extracted. Each worker item is uniquely identified by a GUID and also has a parent GUID. For a file with no embedded items, the worker item GUID and the parent GUID are equal. If a file has embedded items, a worker item is created for each one, with its own unique GUID and its parent GUID set to the file's GUID. For example, an Outlook message file can contain attachments, and those attachments may in turn contain embedded items (e.g. a spreadsheet inserted in a Word doc).

I want to provide an interface that will send notifications to clients when the recursion on any item in the processing of the original file has completed. I have already written a tree structure to do what I want but it seems kind of crappy and naive. Is there a known pattern or library that provides what I have outlined above?
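
To make the requirement concrete, here is a minimal sketch of the behaviour I am after. It is not the platform's actual code: `CompletionTracker`, `add`, `finish`, and the `on_complete` callback are hypothetical names, and it assumes every embedded item is registered before its parent's own processing is marked finished. Each item keeps a count of its own outstanding work plus its unfinished children, and a completion propagates upward when that count reaches zero.

```python
import uuid
from collections import defaultdict


class CompletionTracker:
    """Tracks work items by GUID and reports when an item's whole subtree is done."""

    def __init__(self, on_complete):
        self.on_complete = on_complete    # hypothetical client hook: callback(guid)
        self.parent = {}                  # guid -> parent guid (a root maps to itself)
        self.pending = defaultdict(int)   # guid -> own work (1) + unfinished children

    def add(self, guid, parent_guid):
        """Register a work item; a top-level file passes its own GUID as parent."""
        self.parent[guid] = parent_guid
        self.pending[guid] += 1               # the item's own processing
        if parent_guid != guid:
            self.pending[parent_guid] += 1    # the parent now waits on this child

    def finish(self, guid):
        """Mark the item's own processing as done and propagate upward."""
        while True:
            self.pending[guid] -= 1
            if self.pending[guid] > 0:
                return                        # children are still outstanding
            self.on_complete(guid)            # the whole subtree under guid is done
            parent = self.parent[guid]
            if parent == guid:
                return                        # reached the original file
            guid = parent


# Example: an Outlook message with a Word attachment that embeds a spreadsheet.
done = []
tracker = CompletionTracker(done.append)
msg, doc, sheet = (uuid.uuid4() for _ in range(3))
tracker.add(msg, msg)
tracker.add(doc, msg)
tracker.add(sheet, doc)
tracker.finish(msg)     # msg itself is extracted, but its attachment is not
tracker.finish(doc)     # doc itself is extracted, but its spreadsheet is not
tracker.finish(sheet)   # completes sheet, then doc, then msg
```

In this example the callback fires for the spreadsheet, then the Word doc, then the message, which is the order in which each item's recursion actually completes.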

user481779

1 Answer

You can use a quadtree and a quadkey. A quadkey is usually used in map applications, but with a quadkey you can also sort the tree in a different order. It can also help to distribute parallel processes when you want to assign them to specific cores.
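
For readers unfamiliar with the term, here is a small sketch of a quadkey in the map-tile sense: the bits of a tile's X and Y coordinates are interleaved into a string of base-4 digits, one digit per quadtree level. The function name `tile_to_quadkey` is chosen for illustration. How such keys would be assigned to GUID-identified work items is an open design choice that this sketch does not address.

```python
def tile_to_quadkey(x, y, level):
    """Interleave the bits of tile coordinates (x, y) into a base-4 key string."""
    digits = []
    for i in range(level, 0, -1):
        mask = 1 << (i - 1)       # select the bit for this quadtree level
        digit = 0
        if x & mask:
            digit += 1            # X bit contributes the low bit of the digit
        if y & mask:
            digit += 2            # Y bit contributes the high bit of the digit
        digits.append(str(digit))
    return "".join(digits)


key = tile_to_quadkey(3, 5, 3)              # "213"
parent_key = tile_to_quadkey(1, 2, 2)       # "21", the enclosing tile one level up
```

The useful property is that a node's key starts with its parent's key, so sorting the keys as strings keeps each subtree contiguous.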

Micromega