I'm trying to create a custom DataSet class within the Kedro framework. I need some help understanding how to combine values from the credentials.yml file with a catalog entry.

  1. What is the Kedro way of handling the 'mongo_url' property in the catalog entry? How do I map the values from credentials to the catalog entry?
  2. What should the class __init__ method look like?

catalog.yml

rss_feed_load:
  type: kedro_workbench.extras.datasets.RSSDataSet.RSSFeedLoad
  mongo_url: "mongodb+srv://<username>:<password>@bighatcluster.wamzrdr.mongodb.net/"
  mongo_db: "TBD"
  mongo_collection: "TBD"
  mongo_table: "TBD"
  credentials: mongo_atlas

credentials.yml

mongo_atlas:
  username: <username>
  password: <password>
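
For reference, Kedro resolves the `credentials: mongo_atlas` key by looking it up in credentials.yml and passing the matching dictionary to the dataset's `__init__` as the `credentials` argument. It does not fill in the `<username>`/`<password>` placeholders in `mongo_url` by itself. A minimal sketch of how that substitution might be done inside the dataset follows; `build_mongo_url` is a hypothetical helper (not part of Kedro), and the example values are made up.

```python
from typing import Dict
from urllib.parse import quote_plus


def build_mongo_url(template: str, credentials: Dict[str, str]) -> str:
    """Fill the <username>/<password> placeholders of a connection string.

    Hypothetical helper for illustration only; it is not a Kedro API.
    """
    # Percent-encode so characters such as '@' or ':' cannot break the URL.
    username = quote_plus(credentials["username"])
    password = quote_plus(credentials["password"])
    return template.replace("<username>", username).replace("<password>", password)


url = build_mongo_url(
    "mongodb+srv://<username>:<password>@bighatcluster.wamzrdr.mongodb.net/",
    {"username": "demo_user", "password": "p@ss:word"},  # made-up values
)
```

Calling this from `__init__` would keep the real username and password out of catalog.yml, which could then hold only the placeholder form of the URL.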

My first attempt, which I'm not sure about:

from typing import Any, Dict

from kedro.io import AbstractDataSet


class RSSFeedLoad(AbstractDataSet):
    def __init__(self, mongo_url: str, mongo_db: str, mongo_collection: str, mongo_table: str, credentials: Dict[str, Any], data: Any = None):
        # data is a list of dictionaries coming from the previous node. Not sure whether I pass it in when the instance is created or in the _load() method.
        self._data = data
        # Where do I build the string that gets passed here?
        self._mongo_url = mongo_url
        self._mongo_db = mongo_db
        self._mongo_collection = mongo_collection
        self._mongo_table = mongo_table
        # Do I need to store the username/password as instance attributes, or is that a bad idea?
        self._username = credentials['username']
        self._password = credentials['password']

Where does this custom class get called? Do I reference it in the outputs='' of a node?

node(
    func=load_rss_feed,
    inputs='processed_rss_items',
    outputs='rss_feed_load',
    name="load_rss_feed",
)
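
For context on the node wiring: when a catalog dataset name appears in a node's `outputs`, Kedro calls `save(data)` on that dataset with the node's return value, which in turn calls `_save(data)`. The data therefore would not go through `__init__`. Below is a rough sketch of what such a dataset might look like, under several assumptions: `mongo_table` is dropped (MongoDB has collections rather than tables), the `mongo_url` in the catalog contains no `<username>:<password>@` part because the credentials are handed to pymongo's `MongoClient` as keyword arguments, and a stand-in base class is used if the `AbstractDataSet` name cannot be imported (it was renamed in later Kedro releases).

```python
from typing import Any, Dict, List

try:
    from kedro.io import AbstractDataSet  # name used in the question
except ImportError:
    class AbstractDataSet:  # stand-in so the sketch runs without that Kedro version
        def save(self, data: Any) -> None:
            self._save(data)


class RSSFeedLoad(AbstractDataSet):
    def __init__(self, mongo_url: str, mongo_db: str, mongo_collection: str,
                 credentials: Dict[str, Any]):
        # No `data` argument: the dataset is built from the catalog entry alone.
        self._mongo_url = mongo_url
        self._mongo_db = mongo_db
        self._mongo_collection = mongo_collection
        self._credentials = credentials

    def _save(self, data: List[Dict[str, Any]]) -> None:
        # Receives whatever the node listed in `outputs` returned.
        from pymongo import MongoClient  # lazy import; requires pymongo

        client = MongoClient(
            self._mongo_url,
            username=self._credentials["username"],
            password=self._credentials["password"],
        )
        try:
            client[self._mongo_db][self._mongo_collection].insert_many(data)
        finally:
            client.close()

    def _load(self) -> List[Dict[str, Any]]:
        raise NotImplementedError("this sketch only writes")

    def _describe(self) -> Dict[str, Any]:
        # Used when the dataset is printed or logged, so credentials are left out.
        return {"mongo_db": self._mongo_db, "mongo_collection": self._mongo_collection}
```

With this shape, `load_rss_feed` would simply return the list of dictionaries, and the node definition above would not need to change.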

This is my first attempt at building a custom DataSet, so I'm not sure I'm doing things correctly. Thanks very much. :)

Emilio