Questions tagged [kedro]

Kedro is an open source Python library that helps you build production-ready data and analytics pipelines

202 questions
1 vote, 0 answers

VersionNotFoundError when writing S3 bucket from a Custom Dataset

I have a custom dataset that I use to write a popmon report to an S3 bucket: class ReportDataset(AbstractVersionedDataSet): def __init__(self, filepath: str, version: Version = None, credentials: Dict[str, Any] = None): _credentials =…
Apoorva
1 vote, 1 answer

jsonschema 4.4.0 does not provide the extra 'isoduration'

I'm trying to run a piece of code and keep getting the following error: File "/opt/conda/lib/python3.8/site-packages/pkg_resources/__init__.py", line 770, in resolve raise DistributionNotFound(req,…
1 vote, 2 answers

Protobuf compatibility error when running Kedro pipeline

I have a Kedro pipeline that I want to run through a Python script. I think I have the minimum necessary code to do this, but every time I try to run the pipeline through the script, I get a compatibility error regarding the protobuf version, but…
1 vote, 2 answers

How to save keras model in kedro

I am able to save a DNN model in h5 format on S3, but when I load it in the inference pipeline in Kedro, I get blank/no predictions. I made the following changes in the catalog.yml file: model: filepath:…
RajeshM
1 vote, 1 answer

Facing kedro-airflow validation error from pydantic when running in docker

I am new to Kedro and Airflow. I am trying to deploy a Kedro pipeline in Airflow using Docker, but while executing my DAG I get this error: 2022-01-27 16:17:19,659 - airflow.task - ERROR - Task failed with exception Traceback (most recent call…
1 vote, 1 answer

How to access environment name in kedro pipeline

Is there any way to access the Kedro pipeline environment name? My problem is as follows. I am loading the config paths like this: conf_paths = ["conf/base", "conf/local"] conf_loader = ConfigLoader(conf_paths) parameters =…
1 vote, 0 answers

Use an Azure ML compute cluster to run Kedro + Mlflow pipeline

I want to use an Azure Machine Learning compute cluster as a compute target to run a Kedro pipeline integrated with MLflow. Here's the code snippet (hooks.py) that integrates experiment tracking using MLflow and Azure ML as backend/artifact…
Downforu
1 vote, 1 answer

kedro DataSetError while loading PartitionedDataSet

I am using PartitionedDataSet to load multiple CSV files from Azure Blob Storage. I defined my dataset in the data catalog as follows: my_partitioned_data_set: type: PartitionedDataSet path: my/azure/folder/path …
1 vote, 1 answer

kedro context and catalog missing from ipython session

I launched an IPython session and am trying to load a dataset. I am running df = catalog.load("test_dataset") and get the following error: NameError: name 'catalog' is not defined. I also tried %reload_kedro but got the following error: UsageError: Line magic…
1 vote, 0 answers

Kedro cannot find run

As part of upgrading Kedro from 0.16.2 to 0.17.3 in our organization, I've made changes to Kedro-related files in our codebase based on the Kedro starter pyspark-iris on 0.17.3. Now I get the error Error: No such command 'run' on kedro…
Sandeep Gunda
1 vote, 1 answer

Want to run Specific node or group of nodes and capture the output into a variable in kedro jupyter lab

I am new to Kedro and am trying to run the Spaceflights tutorial. I want to run the complete data_processing_pipeline 'dp' and capture the output in a dataframe. I am running it in Jupyter Lab. I used the following command: model_input_table =…
1 vote, 1 answer

How to let kedro execute nodes in sequence

I am trying to use Kedro to run a workflow. The following figure shows my workflow (nodes 1-3 are sequential, and nodes 31, 32 and 33 are three branches from node 3). You can see that Kedro runs nodes 1 to 3 sequentially, due to the clear dependency…
1 vote, 2 answers

How to read/write/sync data on cloud with Kedro

In short: how can I save a file both locally AND in the cloud, and similarly, how can I set it to read from local? Longer description: There are two scenarios: 1) building a model, 2) serving the model through an API. In building the model, a series of analyses is done to…
Areza
1 vote, 1 answer

Kedro: ValueError: Pipeline does not contain nodes named ['preprocess_companies_node']

Similar to the question described earlier, I followed the Spaceflights tutorial. At the create pipeline step, I got the following error when running kedro run --node=preproces_companies_node: ValueError: Pipeline does not contain nodes named…
got2nosth
1 vote, 1 answer

Kedro -- Create a dynamic node

I have a Kedro node that returns a list of pandas DataFrames. In another node I apply do_something() to the DataFrames. For example: def first_node(): """returns a list""" return list_item def do_something(data): """perform same action to all list…