
I'm looking for an appropriate design pattern to accomplish the following:

I want to extract some information from some "ComplexDataObject" (e.g. an Image) and save the relevant information in a more suitable format, let's call it a NiceDataObject.

Partial information is extracted from the ComplexDataObject in several steps. The output of one step may be required as input to a later step. Since outputs and inputs may be of different types, I'm not sure whether existing patterns such as "pipes and filters" or "chain of responsibility" apply here.

The following piece of "code" hopefully makes it clear what I want to achieve:

NiceDataObject ProcessingMethod1(ComplexDataObject cdo) {
    InfoType1 infoPiece1 = extractInfoMethod1(cdo)
    InfoType2 infoPiece2 = extractInfoMethod2(cdo, infoPiece1)
    InfoType3 infoPiece3 = extractInfoMethod3(cdo, infoPiece2)
    InfoType4 infoPiece4 = extractInfoMethod4(cdo, infoPiece2, infoPiece3)
    NiceDataObject structuredInfo = PutItTogether(cdo, infoPiece1, infoPiece2,
                                                  infoPiece3, infoPiece4)
    return structuredInfo
}

To make matters more complicated, I'd ideally also like to be able to handle another complex data type (say AnotherComplexDataObject) in the same manner to produce the desired NiceDataObject:

NiceDataObject ProcessingMethod2(AnotherComplexDataObject cdo) {
    InfoType1 infoPiece1 = extractInfoMethod1(cdo)
    InfoType5 infoPiece5 = extractInfoMethod5(cdo, infoPiece1)
    InfoType6 infoPiece6 = extractInfoMethod6(cdo, infoPiece5)
    NiceDataObject structuredInfo = PutItTogether2(cdo, infoPiece1, infoPiece6)
    return structuredInfo
}

This would allow me to write a general API function, something like:

NiceDataObject Process(SomeComplexDataObject cdo)

where SomeComplexDataObject is a base class of both ComplexDataObject and AnotherComplexDataObject.
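
In Python I imagine the dispatch itself could be done with functools.singledispatch. A rough sketch, with placeholder class bodies standing in for my actual types:

from functools import singledispatch

class SomeComplexDataObject: ...
class ComplexDataObject(SomeComplexDataObject): ...
class AnotherComplexDataObject(SomeComplexDataObject): ...
class NiceDataObject: ...

@singledispatch
def process(cdo):
    raise NotImplementedError(f"no processor for {type(cdo).__name__}")

@process.register
def _(cdo: ComplexDataObject) -> NiceDataObject:
    ...  # would correspond to ProcessingMethod1 above

@process.register
def _(cdo: AnotherComplexDataObject) -> NiceDataObject:
    ...  # would correspond to ProcessingMethod2 above

Each registered overload would then run its own sequence of extraction steps.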

If possible, I'd like to "register" the processing step methods (i.e. extractInfoMethod1, ..., extractInfoMethod6 above) for flexibility, and because I want to be able to peek at the intermediate data.

If it matters, I'm using Python; the C-like "code" above is only meant to illustrate that inputs and outputs are typically of different types.
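
To make the "register" idea concrete, here is a rough sketch of the kind of thing I have in mind: a hypothetical Pipeline class where each step reads earlier results from a shared dict (essentially the Map/Dict idea raised in the comments below).

class Pipeline:
    """Runs registered steps in order; each step reads earlier results
    from a shared dict and writes its own result back into it."""

    def __init__(self):
        self._steps = []  # (name, function) pairs, in registration order

    def register(self, name, func):
        # func takes the context dict and returns the value stored under `name`
        self._steps.append((name, func))
        return self

    def run(self, cdo, peek=None):
        context = {"cdo": cdo}
        for name, func in self._steps:
            context[name] = func(context)
            if peek is not None:
                peek(name, context[name])  # inspect intermediate data
        return context

# given some ComplexDataObject instance `cdo`, mirroring ProcessingMethod1 above:
pipeline = Pipeline()
pipeline.register("infoPiece1", lambda ctx: extractInfoMethod1(ctx["cdo"]))
pipeline.register("infoPiece2", lambda ctx: extractInfoMethod2(ctx["cdo"], ctx["infoPiece1"]))
pipeline.register("infoPiece3", lambda ctx: extractInfoMethod3(ctx["cdo"], ctx["infoPiece2"]))
pipeline.register("infoPiece4", lambda ctx: extractInfoMethod4(ctx["cdo"], ctx["infoPiece2"],
                                                               ctx["infoPiece3"]))
pipeline.register("structuredInfo", lambda ctx: PutItTogether(ctx["cdo"], ctx["infoPiece1"],
                                                              ctx["infoPiece2"], ctx["infoPiece3"],
                                                              ctx["infoPiece4"]))

results = pipeline.run(cdo, peek=lambda name, value: print(name, value))
niceDataObject = results["structuredInfo"]

The shared dict does risk becoming the context object discussed in the comments, so that trade-off would apply here too.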

lucasg
Mikael Call
  • Pipes and filters might be ok, but what I don't like about it in this type of scenario is that you *might* need the data at a later step, which can cause temporal dependencies (e.g. filter A must run before filter B) and there's no way to enforce that apart from doing checks at runtime. – Augusto Jan 20 '14 at 09:53
  • Cannot edit the comment above... the problem I mentioned above is that the `ComplexDataObject` can become a Context Object (anti-Pattern): http://stackoverflow.com/questions/771983/what-is-context-object-design-pattern – Augusto Jan 20 '14 at 10:10
  • @Augusto I think your approach is interesting, i.e. Pipes and Filter + Map/Dict to send the additional information. I'll just have to read up on the potential downsides of the Context Object. Is it agreed upon that it actually is an anti-pattern or not? I'll look into this to decide if I can live with it. – Mikael Call Jan 20 '14 at 10:20
  • For me it's an anti-pattern... but I'm going to read the PDF that is linked in the other answer to see if there are some scenarios where it makes sense to use it. – Augusto Jan 20 '14 at 10:52

1 Answer


More than a pattern, you need solid CPU/RAM optimization to do this; otherwise, bulky image processing will exhaust your resources.

You can use the following design pattern as your base: https://www.codeproject.com/Articles/5272366/A-data-processing-design-pattern-for-intermittent

Create data processors to transform, convert, or extract, and then implement a chain of responsibility to string the various stages together. Each data processor is multithreaded, so you can take advantage of time gaps; i.e., the current processor can finish its task before the predecessor processor delivers the next batch.
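
As a minimal illustration (my own sketch, not code from the linked article), each stage can be a thread with its own input queue, passing results down the chain:

import queue
import threading

_SENTINEL = object()  # signals end of input

class Stage(threading.Thread):
    """One link in the chain: consumes batches from its own queue,
    transforms them, and hands results to the successor stage."""

    def __init__(self, transform, successor=None):
        super().__init__(daemon=True)
        self.transform = transform
        self.successor = successor
        self.inbox = queue.Queue()

    def submit(self, batch):
        self.inbox.put(batch)

    def close(self):
        self.inbox.put(_SENTINEL)

    def run(self):
        while True:
            batch = self.inbox.get()
            if batch is _SENTINEL:
                if self.successor:
                    self.successor.close()
                break
            result = self.transform(batch)  # runs while the predecessor prepares the next batch
            if self.successor:
                self.successor.submit(result)

# hypothetical wiring: extract -> convert -> assemble
assemble = Stage(lambda info: info)
convert = Stage(lambda raw: raw, assemble)
extract = Stage(lambda cdo: cdo, convert)
for stage in (assemble, convert, extract):
    stage.start()

The queue hand-off is what lets the current processor keep working while its predecessor prepares the next batch.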

amarnath chatterjee