I am creating a pipeline on Vertex AI Workbench using Kubeflow lightweight components.

I start with a component that extracts data from BigQuery, processes it, and returns a dataframe. This dataframe is then fed to another component:

@kfp.dsl.component
def turn_window_generator(df: pd.DataFrame) -> WindowGenerator:
....
    return wide_window

The input is a pd.DataFrame and the output is a WindowGenerator instance, which is responsible for transforming the dataframe into the input of the neural network in the next step.

Whenever I run that component (turn_window_generator), I get the following error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/var/tmp/ipykernel_5481/471668721.py in <module>
      1 @kfp.dsl.component
----> 2 def turn_window_generator(df: pd.DataFrame) -> WindowGenerator:
      3 
      4     filter_x_days = 7
      5     filtered_days = [i for i in range(0, filter_x_days)] + [29]

~/.local/lib/python3.7/site-packages/kfp/components/component_decorator.py in component(func, base_image, target_image, packages_to_install, pip_index_urls, output_component_file, install_kfp_package, kfp_package_path)
    125         output_component_file=output_component_file,
    126         install_kfp_package=install_kfp_package,
--> 127         kfp_package_path=kfp_package_path)

~/.local/lib/python3.7/site-packages/kfp/components/component_factory.py in create_component_from_func(func, base_image, target_image, packages_to_install, pip_index_urls, output_component_file, install_kfp_package, kfp_package_path)
    467             func=func)
    468 
--> 469     component_spec = extract_component_interface(func)
    470     component_spec.implementation = structures.Implementation(
    471         container=structures.ContainerSpecImplementation(

~/.local/lib/python3.7/site-packages/kfp/components/component_factory.py in extract_component_interface(func, containerized)
    228                     ' values for outputs are not supported.')
    229 
--> 230         type_struct = type_utils._annotation_to_type_struct(parameter_type)
    231         if type_struct is None:
    232             raise TypeError(

~/.local/lib/python3.7/site-packages/kfp/components/types/type_utils.py in _annotation_to_type_struct(annotation)
    525         return None
    526     if hasattr(annotation, 'to_dict'):
--> 527         annotation = annotation.to_dict()
    528     if isinstance(annotation, dict):
    529         return annotation

TypeError: to_dict() missing 1 required positional argument: 'self'

I expected to be able to create this component. I tried installing kfp with --pre and made some other changes, but none of them worked. These are my installs and imports:

! pip3 install --upgrade {USER_FLAG} -q google-cloud-aiplatform \
                                        google-cloud-storage {USER_FLAG} \
                                        kfp --pre \
                                        google-cloud-pipeline-components \
                                        tensorflow

import google.cloud.aiplatform as aip
import kfp
from kfp.v2 import compiler
filipe

1 Answer

Per this Stack Overflow post, this looks like a common Python error that occurs when an instance method is called on the class itself rather than on an instance of it. The self parameter refers to the instance the method is called on; if no instance exists, there is nothing to pass as self.
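The failure can be reproduced in a few lines without kfp. The sketch below is only an illustration: `Record` and `describe` are hypothetical names, and `describe` is a simplified stand-in modelled on the `hasattr(annotation, 'to_dict')` check at lines 526-527 of type_utils.py in the traceback, not kfp's actual code. In the question, the object reaching that check appears to be the class used as a type annotation (pd.DataFrame also defines a to_dict instance method), which would explain why no self is available.

```python
class Record:
    """Hypothetical stand-in for any class that defines a to_dict instance method."""

    def __init__(self, values):
        self.values = values

    def to_dict(self):
        return dict(self.values)


def describe(annotation):
    """Simplified imitation of the check shown in the traceback (an assumption)."""
    if hasattr(annotation, "to_dict"):
        return annotation.to_dict()
    return None


# Passing an instance works: Python binds the instance to `self` automatically.
print(describe(Record({"a": 1})))

# Passing the class itself, as a type annotation does, leaves `self` unfilled.
error_message = None
try:
    describe(Record)
except TypeError as err:
    error_message = str(err)
    print(error_message)
```

The second call raises the same kind of TypeError as in the question, because `hasattr` is true for both the class and its instances while only an instance can supply `self`.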

To fix this error, you need to instantiate the class before calling its method. For example, if your class is called WindowGenerator, you can do something like this:

# create an instance of WindowGenerator
wg = WindowGenerator(dataframe)
# call the turn_window_generator method on the instance
wg.turn_window_generator()
Joevanie