Let's say I have two solids. The first does some computations and writes a file to disk. The second solid takes that file and does other things with it, but it needs the file's filesystem path in order to open it. I can do this with two yields (one for the AssetMaterialization and the other for the str Output) and by explicitly passing the output to the second solid call:

from dagster import (AssetKey, AssetMaterialization, EventMetadataEntry,
                     Output, execute_pipeline, pipeline, solid)

@solid
def yield_asset(context):
    yield AssetMaterialization(
        asset_key=AssetKey('my_dataset'),
        description='Persisted result to storage',
        metadata_entries=[
            EventMetadataEntry.text('Text-based metadata for this event',
                                    label='text_metadata'),
            EventMetadataEntry.fspath('/path/to/data/on/filesystem'),
            EventMetadataEntry.url('http://mycoolsite.com/url_for_my_data',
                                   label='dashboard_url'),
        ],
    )
    yield Output('/path/to/data/on/filesystem')


@solid
def print_asset_path(context, asset_path: str):
    # do stuff with `asset_path`
    context.log.info(asset_path)


@pipeline
def some_pipeline():
    asset_path = yield_asset()
    print_asset_path(asset_path)


if __name__ == "__main__":
    result = execute_pipeline(some_pipeline)

This works fine, and you should get the info message in the logs (2021-03-16 13:23:29 - dagster - INFO - system - 366248ec-6a83-462f-b62f-9fb2514f6f80 - print_asset_path - /path/to/data/on/filesystem) and the AssetMaterialization in dagit.

However, this is kind of inconvenient, since I need to explicitly yield an Output with the filesystem path that I need. Is it possible to reference the AssetMaterialization in the second solid and use its properties directly? If so, how?

Something like (won't work):

@solid
def print_asset_path(context):
    # hypothetical API, shown only to illustrate what I'm after
    asset_path = context.assets.get_asset_by_key('my_dataset').fspath
    # do stuff with `asset_path`
    context.log.info(asset_path)
cyau

1 Answer

The code you've provided is currently the best way to accomplish this in Dagster.
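
One way to make that pattern less repetitive is to define the path once and emit both events from a small helper generator. The following is only a minimal sketch of that idea: `FakeMaterialization` and `FakeOutput` are hypothetical stand-ins for Dagster's `AssetMaterialization` and `Output`, used so the example runs without Dagster installed, and `persisted_path_events` is an illustrative name, not a Dagster API.

```python
from dataclasses import dataclass
from typing import Iterator, Union


# Hypothetical stand-ins for Dagster's AssetMaterialization and Output.
# They exist only so this sketch runs without Dagster installed.
@dataclass
class FakeMaterialization:
    asset_key: str
    fspath: str


@dataclass
class FakeOutput:
    value: str


def persisted_path_events(
    asset_key: str, path: str
) -> Iterator[Union[FakeMaterialization, FakeOutput]]:
    """Yield the materialization event and the path output from one path value."""
    yield FakeMaterialization(asset_key=asset_key, fspath=path)
    yield FakeOutput(value=path)


def yield_asset():
    # The path is written once; the helper emits both events from it.
    yield from persisted_path_events("my_dataset", "/path/to/data/on/filesystem")


events = list(yield_asset())
for event in events:
    print(event)
```

In a real solid, the same shape should carry over by having the helper yield `AssetMaterialization(...)` and `Output(path)` and calling it with `yield from` inside the `@solid` body. The downstream solid still receives the path as a regular input, as in the question.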

If the fspath is known before the solid itself executes, then the directions outlined in these two issues (not yet implemented) might offer a more elegant solution:

Sandy Ryza