Questions tagged [dagster]

Dagster is an open source system for building modern data applications.

Dagster, by Elementl, is a set of abstractions for building self-describing, testable, and reliable data applications. It uses functional data programming, gradual/optional typing, and testability to facilitate composition of data applications from DAGs of solids, its basic computational unit.

142 questions
0
votes
1 answer

Does dagster support dependencies between solids with no outputs?

I am building a prototype pipeline that does two things: (1) a solid clears files out of an existing directory, and (2) a solid runs a batch process to dump data into that directory. Step #1 is all side effect and has no output to pass to #2. Is it possible…
drewh
0
votes
3 answers

How can you ensure that the same pipeline is not executed twice at the same time?

I have a question about locking or mutex behavior. Let's assume the following scenario: the pipeline works with some local files that were placed by CI/CD jobs. After processing, I'd like to remove the files.…
Thobial
0
votes
3 answers

Dagster: Multiple and Conditional Outputs (Type check failed for step output xxx PySparkDataFrame)

I'm executing the Dagster tutorial, and I got stuck at the Multiple and Conditional Outputs step. In the solid definitions, it asks to declare (among other things): output_defs=[ OutputDefinition( name="hot_cereals",…
Bruno Ambrozio
0
votes
1 answer

Noneable type for Field in solid config not allowing null values

I would like Dagster to accept empty parameters in the config.yaml and treat them as having a value of None. When I start dagit I can see that the parameter is null. This makes sense because I've left the value of the parameter empty in the…
K.Naga
0
votes
1 answer

Producing files in dagster without caring about the filename

In the Dagster tutorial, in the Materializations section, we choose a filename (sorted_cereals_csv_path) for our intermediate output, and then yield it as a materialization: @solid def sort_by_calories(context, cereals): # Sort the data…
Migwell
0
votes
2 answers

dagster pipeline executes successfully when run with `execute_pipeline` but not when run with dagit

I'm running into a LoweringError related to numba compilation when running a Dagster pipeline through dagit, but not when running it directly with execute_pipeline. I'm not sure how to go about debugging it. Minimal working example (file…
-1
votes
1 answer

How can I partition dagster assets by year?

I'm not seeing a built-in definition for partitioning assets by year (only hourly through monthly). Is there a way to adapt the built-in time definitions to accomplish this? Any help is appreciated!