Kedro is an open-source Python library that helps you build production-ready data and analytics pipelines.
Questions tagged [kedro]
202 questions
1
vote
0 answers
VersionNotFoundError when writing to an S3 bucket from a custom dataset
I have a custom dataset that I use to write a popmon report to an S3 bucket:
class ReportDataset(AbstractVersionedDataSet):
    def __init__(self, filepath: str, version: Version = None, credentials: Dict[str, Any] = None):
        _credentials =…

Apoorva
1
vote
1 answer
jsonschema 4.4.0 does not provide the extra 'isoduration'
So I'm trying to run some piece of code and keep getting the following error:
File "/opt/conda/lib/python3.8/site-packages/pkg_resources/__init__.py", line 770, in resolve
raise DistributionNotFound(req,…

Maciej Zwoliński
1
vote
2 answers
Protobuf compatibility error when running Kedro pipeline
I have a Kedro pipeline that I want to run through a Python script. I think I have the minimum necessary code to do this, but every time I try to run the pipeline through the script, I get a compatibility error regarding the protobuf version, but…

Gustavo Trivelatto
1
vote
2 answers
How to save a Keras model in Kedro
I am able to save a DNN model in h5 format on S3, but when I import it in the inference pipeline of Kedro, I get blank output (no predictions).
I made the following changes in the catalog.yml file:
model:
  filepath:…

RajeshM
1
vote
1 answer
Facing kedro-airflow validation error from pydantic when running in docker
I am new to Kedro and Airflow. I am trying to deploy a Kedro pipeline in Airflow using Docker.
But while executing my DAG I get this error:
2022-01-27 16:17:19,659 - airflow.task - ERROR - Task failed with exception
Traceback (most recent call…

Kedro_newbie
1
vote
1 answer
How to access the environment name in a Kedro pipeline
Is there any way to access the Kedro pipeline environment name? My problem is described below.
I am loading the config paths as follows:
conf_paths = ["conf/base", "conf/local"]
conf_loader = ConfigLoader(conf_paths)
parameters =…
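One way to keep the environment name available, sketched here under stated assumptions, is to build the list of config paths from the name instead of hard-coding "conf/local". The helper `conf_paths_for` and its fallback to "local" are hypothetical and not part of Kedro's API; reading the name from a `KEDRO_ENV` environment variable is an assumption about how the environment is selected, not something the question states.

```python
import os
from typing import List, Optional


def conf_paths_for(env: Optional[str] = None, default_env: str = "local") -> List[str]:
    """Return config paths for the base environment plus one named environment.

    Hypothetical helper for illustration; not part of Kedro's API.
    """
    # Assumption: the environment name comes from the argument or, failing that,
    # from the KEDRO_ENV environment variable, with "local" as a last resort.
    env_name = env or os.environ.get("KEDRO_ENV") or default_env
    return ["conf/base", f"conf/{env_name}"]


conf_paths = conf_paths_for("local")
print(conf_paths)
```

The resulting list has the same shape as the `conf_paths` in the question, and the name passed in stays available to the calling code.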

DataEnthusiast
1
vote
0 answers
Use an Azure ML compute cluster to run a Kedro + MLflow pipeline
I want to use an Azure Machine Learning compute cluster as a compute target to run a Kedro pipeline integrated with MLflow.
Here's the code snippet (hooks.py) that integrates experiment tracking using MLflow and Azure ML as backend/artifact…

Downforu
1
vote
1 answer
kedro DataSetError while loading PartitionedDataSet
I am using PartitionedDataSet to load multiple CSV files from Azure Blob Storage. I defined my dataset in the data catalog as below.
my_partitioned_data_set:
  type: PartitionedDataSet
  path: my/azure/folder/path
  …

DataEnthusiast
1
vote
1 answer
kedro context and catalog missing from ipython session
I launched an IPython session and am trying to load a dataset.
I am running:
df = catalog.load("test_dataset")
and get the error below:
NameError: name 'catalog' is not defined
I also tried %reload_kedro but got the error below:
UsageError: Line magic…

DataEnthusiast
1
vote
0 answers
Kedro cannot find run
As part of upgrading Kedro from 0.16.2 to 0.17.3 in our organization, I've made changes to Kedro-related files in our codebase based on the Kedro starter pyspark-iris on 0.17.3.
Now I get the error Error: No such command 'run' on kedro…

Sandeep Gunda
1
vote
1 answer
Want to run a specific node or group of nodes and capture the output in a variable in Kedro Jupyter Lab
I am new to Kedro and am trying to run the Spaceflights tutorial. I want to run the complete data_processing_pipeline 'dp' and capture the output in a dataframe. I am running it in Jupyter Lab. I used the following command:
model_input_table =…

Aditya Narayan
1
vote
1 answer
How to let Kedro execute nodes in sequence
I am trying to use Kedro to run a workflow. The following figure shows my workflow (nodes 1-3 are sequential, and nodes 31, 32 and 33 are three branches from node 3). You can see that Kedro runs nodes 1 to 3 sequentially, due to the clear dependency…
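The dependency-driven ordering described here can be illustrated without Kedro. The sketch below uses Python's standard-library `graphlib` (Python 3.9+) on a hypothetical graph with the same shape as the workflow in the question. The node names are invented, and this shows dependency-based ordering in general, not Kedro's own runner.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency graph mirroring the workflow described above:
# each key maps to the set of nodes it depends on.
dependencies = {
    "node_2": {"node_1"},
    "node_3": {"node_2"},
    "node_31": {"node_3"},
    "node_32": {"node_3"},
    "node_33": {"node_3"},
}

order = list(TopologicalSorter(dependencies).static_order())
position = {name: index for index, name in enumerate(order)}

# The chain 1 -> 2 -> 3 is always ordered; the three branches only have to
# come after node_3, so their relative order is not fixed by the graph.
print(order)
```

If the branches must run in a fixed order, one possible approach (an assumption, not confirmed by the question) is to add an explicit data dependency between them so that the graph itself encodes the sequence.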

William Huang
1
vote
2 answers
How to read/write/sync data on cloud with Kedro
In short: how can I save a file both locally AND in the cloud, and similarly, how can I configure it to read from local?
Longer description: There are two scenarios: 1) building the model and 2) serving the model through an API. In building the model, a series of analyses is done to…

Areza
1
vote
1 answer
Kedro: ValueError: Pipeline does not contain nodes named ['preprocess_companies_node']
Similar to the question described earlier, I followed the spaceflights tutorial. At the create pipeline step, I got the following error when running kedro run --node=preproces_companies_node
ValueError: Pipeline does not contain nodes named…
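The command in this excerpt spells the node name `preproces_companies_node`, while the error in the title names `preprocess_companies_node`. Whether that mismatch is the cause is not confirmed by the excerpt, but near-miss names like this can be spotted with the standard-library `difflib`. The sketch below is a minimal illustration; the helper `suggest_node_names` and the list of available names are hypothetical, not part of Kedro.

```python
from difflib import get_close_matches
from typing import List


def suggest_node_names(requested: str, available: List[str]) -> List[str]:
    """Return available node names that closely resemble the requested one."""
    return get_close_matches(requested, available, n=3, cutoff=0.8)


# Hypothetical list of registered node names, for illustration only.
available_nodes = ["preprocess_companies_node", "preprocess_shuttles_node"]
print(suggest_node_names("preproces_companies_node", available_nodes))
```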

got2nosth
1
vote
1 answer
Kedro -- Create a dynamic node
I have a Kedro node that returns a list of pandas dataframes. In another node I do_something() to the dataframes. For example:
def first_node():
    """returns a list"""
    return list_item

def do_something(data):
    """perform same action to all list…
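The pattern the question describes, applying one action to every item in a list, can be sketched in plain Python. This is a minimal illustration only: `apply_to_each` is a hypothetical helper, `do_something` here is a stand-in that sums numbers, and plain lists replace the pandas dataframes so that the example runs without extra dependencies. It does not show how to register such a step as a Kedro node.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def apply_to_each(items: Iterable[T], action: Callable[[T], R]) -> List[R]:
    """Apply the same action to every item and collect the results in order."""
    return [action(item) for item in items]


def do_something(data: List[int]) -> int:
    """Hypothetical stand-in for the per-dataframe action: sums a list."""
    return sum(data)


# Plain lists stand in for the pandas dataframes mentioned above.
results = apply_to_each([[1, 2], [3, 4, 5]], do_something)
print(results)
```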

GreenTemple