
I am using Spark with Python, and I have a filter operation as follows:

my_rdd.filter(my_func)

where my_func is a function I wrote to filter the RDD's items based on my own logic. I have defined my_func as follows:

def my_func(my_item):
    ...

Now I want to pass a separate parameter to my_func in addition to the item that goes into it. How can I do that? I understand that my_item will refer to one item from my_rdd, but how can I pass my own parameter (say, my_param) as an additional argument to my_func?

London guy
    Possible duplicate of [Spark RDD - Mapping with extra arguments](http://stackoverflow.com/questions/33019420/spark-rdd-mapping-with-extra-arguments) – zero323 Dec 04 '15 at 13:11

1 Answer


Use the lambda syntax below, and modify my_func to take the extra parameter:

my_rdd.filter(lambda row: my_func(row, extra_parameter))
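
For context, here is a minimal, self-contained sketch of the whole pattern. The RDD contents, the value of my_param, and the comparison inside my_func are placeholder assumptions for illustration, not part of the original question:

from pyspark import SparkContext

sc = SparkContext("local", "filter-example")

# my_func now takes the RDD item plus the extra parameter.
def my_func(my_item, my_param):
    # Placeholder logic: keep items greater than my_param.
    return my_item > my_param

my_rdd = sc.parallelize([1, 5, 10, 20])
my_param = 7

# The lambda closes over my_param and forwards it on every call.
filtered = my_rdd.filter(lambda row: my_func(row, my_param))
print(filtered.collect())  # [10, 20]

sc.stop()

An equivalent alternative is functools.partial(my_func, my_param=my_param), which binds the extra argument up front so no lambda is needed.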
Shawn Guo