Kedro is an open-source Python library that helps you build production-ready data and analytics pipelines.
Questions tagged [kedro]
202 questions
0
votes
1 answer
Does kedro data catalog accept .arrow files?
While using Kedro I want to load some data and work with it. To do that, one has to register the data in a conf/base/catalog.yml file.
The Kedro Documentation of the Data Catalog explains how one can register data for Kedro to load. However, there…

Jamal Rnjbal
- 25
- 6
0
votes
1 answer
kedro jupyter notebook in command prompt returns "'kedro.framework.cli.jupyter.single kernelspec manager' could not be imported"
I have been trying to activate Jupyter notebooks in a Kedro context for over 24 hours now and I keep receiving the same error. I have searched around and no one seems to be able to solve this problem. I have created a…

MDLA
- 1
- 2
0
votes
1 answer
Kedro (Python) DeprecationWarning: `np.bool8`
When I try to create a new kedro project or run an existing one, I get the following deprecation warning (see also screenshot below). As far as I understand, the warning is negligible; however, as I am trying to set up a clean project, I would like to…

SysRIP
- 159
- 8
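If the goal is only to keep the `np.bool8` message out of the console, one option is a targeted warnings filter. The sketch below uses only the standard library. The helper name `silence_np_bool8_warning` is made up for illustration, and it assumes the warning text contains `np.bool8`, as in the question title. It hides the message and does not fix whichever dependency triggers it.

```python
import warnings


def silence_np_bool8_warning() -> None:
    """Ignore DeprecationWarnings whose message mentions np.bool8.

    Hypothetical helper: the pattern is an assumption based on the
    warning text quoted in the question title.
    """
    warnings.filterwarnings(
        "ignore",
        message=r".*np\.bool8",  # regex matched against the start of the message
        category=DeprecationWarning,
    )


# Register the filter before the code that emits the warning is imported.
silence_np_bool8_warning()
```

Other deprecation warnings are left untouched, so new problems still surface. Where to call the helper in a Kedro project is a judgment call; it has to run before the module that raises the warning is imported.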
0
votes
0 answers
Kedro register dataset from a board from pins package
For my project I want to use a combination of kedro for the pipeline orchestration and pins for data and model versioning. I have some data which I stored on a board from the pins package.
As I have multiple versions, I am not sure how to specify…

Mischa
- 137
- 8
0
votes
0 answers
How to call Kedro pipeline or nodes in Django framework
I have to call Kedro nodes or pipelines from a Django API and use their output as input in the Django API. I have not found a solution so far. Please suggest one.
How can I call Kedro pipelines or nodes from Django APIs?

Pallavi Kumari
- 27
- 1
- 4
0
votes
1 answer
Grouping raw datasets in a kedro visualization
I am looking for a way to group all of the raw datasets in a kedro pipeline visualization into one collapsible/expandable "node", similar to the way that namespaces are collapsible/expandable. In order to do this with a namespace, however, it seems…
0
votes
3 answers
Kedro - Getting path to item in the datacatalog
I'm training an NLP model using spaCy. I have the preprocessing steps all written as a pipeline, and now I need to do the training. According to spaCy's documentation I need to run the following command:
python -m spacy train config.cfg --output…

João Areias
- 1,192
- 11
- 41
0
votes
1 answer
Can't run KedroSession with `from_inputs` parameter: ValueError: Pipeline does not contain data_sets named [...]
In jupyter notebook, when I run session.run(pipeline_name='sim', from_inputs=['measurements', 'params:simulation']), passing datasets & params specified in catalog.yaml, everything works fine. However, when I want to run it with a dataset that I…

ilja
- 109
- 7
0
votes
1 answer
How to generate kedro pipelines automatically (like DataEngineerOne does)?
Having seen the DataEngineerOne video "How To Use a Parameter Range to Generate Pipelines Automatically", I want to automate a pipeline that simulates an electronic circuit. I want to do a grid search over multiple central frequencies of a bandpass…

ilja
- 109
- 7
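One way to generate nodes from a parameter range is to loop over the values and bind each one to the node function. The sketch below is not necessarily the approach from the video. `simulate_bandpass`, `build_node_specs`, `create_pipeline` and the dataset names are hypothetical, and the simulation body is a placeholder. Kedro is imported lazily inside `create_pipeline`, so the spec-building part runs without it.

```python
from functools import partial, update_wrapper


def simulate_bandpass(measurements, central_frequency):
    """Hypothetical placeholder for the real circuit simulation."""
    return {"central_frequency": central_frequency, "n_samples": len(measurements)}


def build_node_specs(frequencies):
    """Return one (func, inputs, outputs, name) tuple per central frequency."""
    specs = []
    for freq in frequencies:
        # Bind the frequency now; copy the metadata so the partial keeps a __name__.
        func = update_wrapper(
            partial(simulate_bandpass, central_frequency=freq), simulate_bandpass
        )
        specs.append((func, "measurements", f"simulation_{freq}", f"simulate_{freq}"))
    return specs


def create_pipeline(frequencies):
    """Turn the specs into a Kedro pipeline (requires Kedro to be installed)."""
    from kedro.pipeline import Pipeline, node

    return Pipeline(
        [
            node(func, inputs=inputs, outputs=outputs, name=name)
            for func, inputs, outputs, name in build_node_specs(frequencies)
        ]
    )
```

Calling `create_pipeline([10, 20, 30])` would give three nodes that all read `measurements` and each write their own output dataset. Whether those outputs should also be registered in the catalog depends on the project.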
0
votes
2 answers
How can I run `catalog.load` in a non-IPython context?
In IPython I can run data = catalog.load('my_dataset') in order to load a dataset specified as 'my_dataset' in the catalog.yml file. What's the equivalent command in a Python script? What do I need to import?

ilja
- 109
- 7
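For loading a catalog entry from a plain script, a minimal sketch is to create a `KedroSession` and read the catalog from its context. This assumes a Kedro 0.18-style project and that the script can locate the project root. The function name `load_from_catalog` is hypothetical. The Kedro imports sit inside the function so that the module itself can be imported without Kedro.

```python
from pathlib import Path
from typing import Any, Optional


def load_from_catalog(dataset_name: str, project_path: Optional[Path] = None) -> Any:
    """Load one dataset from the project's data catalog (hypothetical helper)."""
    # Lazy imports: Kedro is only needed when the function is actually called.
    from kedro.framework.session import KedroSession
    from kedro.framework.startup import bootstrap_project

    # Assumption: default to the current working directory as the project root.
    project_path = Path(project_path) if project_path else Path.cwd()

    # Read the project metadata and configure the project package.
    bootstrap_project(project_path)

    with KedroSession.create(project_path=project_path) as session:
        context = session.load_context()
        return context.catalog.load(dataset_name)
```

Run from the project root, `data = load_from_catalog("my_dataset")` would then correspond to `catalog.load('my_dataset')` in IPython.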
0
votes
1 answer
Include Quarto rendering in kedro pipeline and pass it inputs/outputs
I am using kedro to run some comparative analyses.
I am using the quarto Python package, which provides a wrapper around the quarto CLI through the render function. This function will take a qmd file as input and generate an HTML report from it while…

Oneira
- 1,365
- 1
- 14
- 28
0
votes
2 answers
kedro PartitionedDataSet lazy saving to spare memory?
I am working with PartitionedDataSet in kedro. One of the datasets is of type pillow.ImageDataSet:
raw_images:
  type: PartitionedDataSet
  <<: *data_path_on_disk
  dataset:
    type: pillow.ImageDataSet
  filename_suffix:…

Oneira
- 1,365
- 1
- 14
- 28
0
votes
0 answers
Kedro template configuration does not load globals.yml configuration into catalog.yml FOR Jupyter Lab
It works for the CLI (kedro run ...) but not for Jupyter Lab. I recently upgraded from 0.17.1 to 0.18.3.
I have made changes to settings.py, which uses the Templated Config Loader (just like Kedro template configuration does not load…

Vincent Tan
- 23
- 3
0
votes
1 answer
import fsspec throws error (AttributeError: 'EntryPoints' object has no attribute 'get')

Abhijit Das
- 111
- 2
- 8
0
votes
1 answer
How to change the kedro configuration environment in jupyter notebook?
I want to run a kedro pipeline in the base env using jupyter notebook. I do this the following way:
%reload_kedro --env=base
session.run(pipeline_name='dpfm1')
Doing this, the %reload_kedro command raises the following error:
RuntimeError: Could…

ilja
- 109
- 7