Questions tagged [dagster]

Dagster is an open source system for building modern data applications.

Dagster, by Elementl, is a set of abstractions for building self-describing, testable, and reliable data applications. It uses functional data programming, gradual/optional typing, and testability to facilitate composition of data applications from DAGs of solids, its basic computational unit.

142 questions
2
votes
1 answer

how to specify where to materialize dagster assets?

I'm new to dagster. Currently when I materialize assets they end up in /my-dagster-project/tmpav872908/storage/{assetkey}. How do I specify where the assets should be stored?
MYK
  • 1,988
  • 7
  • 30
2
votes
0 answers

Log hyperlinks in Dagit?

Hi, I'd like to be able to show hyperlinks in Dagit logs, so that if a process fails then the link can be clicked in the log to open the offending file directly. I've tried just entering path strings and html, but neither of these approaches…
Phil T
  • 67
  • 7
2
votes
1 answer

How to specify/use idempotent "date of execution" within dagster assets/jobs?

Coming from airflow, I used jinja templates such as {{ds_nodash}} to translate the date of execution of a dag within my scripts. For example, I am able to detect and ingest a file at the first of August 2022 if it is in the format :…
Imad
  • 2,358
  • 5
  • 26
  • 55
2
votes
1 answer

Dagster cannot connect to mongodb locally

I was going through Dagster tutorials and thought it would be a good exercise to connect to my local mongodb. from dagster import get_dagster_logger, job, op from pymongo import MongoClient @op def connection(): client =…
user3738936
  • 936
  • 8
  • 22
2
votes
1 answer

Handling user input in dagster

I am new to dagster and I am trying to understand how user inputs are handled by it. I am testing this out with the following piece of code: from dagster import job, op @op def input_string(): ret = input('Enter string') …
cpowr
  • 33
  • 5
2
votes
1 answer

Dynamically scheduling Dagster jobs

I'm wondering if it is possible to overwrite the cron schedule for a job. In my case, I want to run a Dagster job on every 6th business day for every month. So, I wrote a Python function that returns the next 6th business day of the upcoming month…
peter
  • 21
  • 1
2
votes
1 answer

Define resource_defs in dagster job sensor

I am trying to build a sensor for the execution of a pipeline/graph. The sensor would check at different intervals and execute the job containing different ops. Now the job requires some resource_defs and config. In the official documentation I don't…
zafar
  • 129
  • 1
  • 4
2
votes
1 answer

DAGSTER: async ops and jobs and dynamic docker-ops

Here I have 2 questions. I need to run an aiohttp session which shall simultaneously make several requests to different urls and download several files and return a list of absolute paths to these files on disk. This list shall be passed to another…
2
votes
2 answers

DagsterUnmetExecutorRequirementsError with dagster CLI during tutorial

I just started following the dagster tutorial. I managed to get the hello_cereal job running with dagit and the Python API, but for some reason when trying with the dagster CLI (`dagster job execute -f hello_cereal.py`) I am getting a…
alxthm
  • 57
  • 1
  • 6
2
votes
1 answer

Dagster chaining resources

I've recently picked up Dagster to evaluate it as an alternative to Airflow. I haven't been able to wrap my head around the concept of resources and am looking to understand whether what I'm trying to do is possible or can be achieved better in a different…
Gayathri
  • 274
  • 3
  • 17
2
votes
1 answer

Adding additional parameters to a solid function

I want to add additional parameters when calling a solid that inherits from another solid, like this: from dagster import pipeline, repository, schedule, solid, InputDefinition @solid def hello(): return 1 @solid( input_defs=[ …
Alejandro A
  • 1,150
  • 1
  • 9
  • 28
2
votes
0 answers

How to define a composite solid with multiple arguments, including `Nothing`?

I have a solid that needs to run after 2 solids. One will return a value, another doesn't return anything but has dependency solids and will take time to run. I execute the pipeline in multiprocessing mode, where solids run at the same time if they…
metinsenturk
  • 421
  • 7
  • 9
2
votes
1 answer

How do I tell Dagit (the Dagster GUI) to run on an existing Dask cluster?

I'm using dagster 0.11.3 (the latest as of this writing). I've created a Dagster pipeline (saved as pipeline.py) that looks like this: @solid def return_a(context): return 12.34 @pipeline( mode_defs=[ ModeDefinition( …
user5406764
  • 1,627
  • 2
  • 16
  • 23
2
votes
2 answers

Dagster start pipeline from another pipeline using its outputs

How am I supposed to start a pipeline B after pipeline A completes, and use pipeline A's outputs in pipeline B? A piece of code as a starting point: from dagster import InputDefinition, Nothing, OutputDefinition, pipeline, solid @solid def…
cyau
  • 449
  • 4
  • 14
2
votes
1 answer

Testing a dagster pipeline

Summary: Dagster run configurations for Dagit vs. PyTest appear to be incompatible for my project. I've been getting errors trying to run pytest on a pipeline and I'd really appreciate any pointers. I've consistently gotten errors of the…
jumbolaya
  • 23
  • 5