I have a solid that needs to run after two other solids. One of them returns a value; the other returns nothing, but it has upstream dependencies of its own and takes a while to run.

I execute the pipeline in multiprocess mode, where solids run concurrently unless a dependency is defined between them.

Below is a sample of the situation. Say I have the following solids.

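import time

from dagster import InputDefinition, Nothing, composite_solid, solid

# Takes a Nothing input, so it only starts after its upstream solids finish.
@solid(input_defs=[InputDefinition("start", Nothing)])
def solid_a(context):
    time.sleep(2)
    context.log.info('yey')

@solid
def solid_b(context):
    return 1

# Should start only after both solid_a and solid_b have finished.
@composite_solid
def my_composite_solid(wait_solid_a: Nothing, solid_b_output: int):
    # wait_solid_a is not passed to any inner solid here.
    some_other_solid(solid_b_output)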

When executed, the solids should run on the following timeline.

Time passed   Event
0 sec         pipeline starts
1 sec         solid_b starts
3 sec         solid_a's dependency solids are running; solid_a has not started yet
5 sec         solid_b finishes
10 sec        solid_a starts
15 sec        solid_a finishes
20 sec        my_composite_solid should start now
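
To make the intended ordering concrete, here is a minimal sketch in plain Python rather than Dagster. It only illustrates the "wait for both upstream steps" behavior I am after and is not a fix for the pipeline. The names run_a, run_b, and run_composite are made up for this sketch and stand in for solid_a, solid_b, and my_composite_solid.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait

events = []  # records the order in which steps finish or start

def run_a():
    # Stand-in for the slow solid_a, which returns nothing.
    time.sleep(0.2)
    events.append("a done")

def run_b():
    # Stand-in for the fast solid_b, which returns a value.
    events.append("b done")
    return 1

def run_composite(value):
    # Stand-in for my_composite_solid; uses only the value from run_b.
    events.append("composite started")
    return value + 1

with ThreadPoolExecutor(max_workers=2) as pool:
    fut_a = pool.submit(run_a)
    fut_b = pool.submit(run_b)
    # Block until BOTH upstream steps are done, even though only
    # run_b produces a value that the downstream step consumes.
    wait([fut_a, fut_b])
    result = run_composite(fut_b.result())
```

The downstream step consumes only the value from run_b, yet it does not start until run_a has finished as well. That is the role I want the Nothing input to play.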

So, according to this timeline, my_composite_solid can start only after both solid_a and solid_b have finished executing. However, when I define it this way, Dagster raises the following error:

dagster.core.errors.DagsterInvalidDefinitionError: @composite_solid 'my_composite_solid' has unmapped input 'wait_solid_a'. Remove it or pass it to the appropriate solid invocation.

If I don't pass solid_a's output to my_composite_solid as a dependency, the composite starts as soon as solid_b returns its result. What should I do?

metinsenturk