
With Spark 1.6, the following code works fine:

ddl = sqlContext.sql("""show create table {mytable}""".format(mytable="""mytest.my_dummytable"""))
map(''.join, ddl\
.map(lambda my_row: [str(data).replace("`", "'") for data in my_row])\
.collect())

However, after moving to Spark 2.2 I get the following exception:

---------------------------------------------------------------------------

AttributeError                            Traceback (most recent call last)
<ipython-> in <module>()
      1 ddl = sqlContext.sql("""show create table {mytable}""".format(mytable="""mytest.my_dummytable"""))
----> 2 map(''.join, ddl.map(lambda my_row: [str(data).replace("`", "'") for data in my_row]).collect())

spark2/python/pyspark/sql/dataframe.py in __getattr__(self, name)
            if name not in self.columns:
                raise AttributeError(
->                  "'%s' object has no attribute '%s'" % (self.__class__.__name__, name))
            jc = self._jdf.apply(name)
            return Column(jc)

AttributeError: 'DataFrame' object has no attribute 'map'

1 Answer
You have to call `.rdd` first. As of Spark 2.0, `DataFrame` no longer aliases `df.map()` to `df.rdd.map()`, so the old one-step call raises the `AttributeError` you're seeing.
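A minimal sketch of the corrected call. The Spark lines are commented out because they need a live `sqlContext`; the `clean_row` helper is a name I've introduced here to make the per-row logic testable on its own, it is not part of the asker's code.

```python
def clean_row(my_row):
    # Join every column of a Row into one string,
    # replacing backticks with single quotes along the way.
    return ''.join(str(data).replace("`", "'") for data in my_row)

# On Spark 2.x, insert .rdd before .map (DataFrame lost its .map alias):
# ddl = sqlContext.sql("show create table mytest.my_dummytable")
# statements = ddl.rdd.map(clean_row).collect()
```

The same list comprehension from the question would work too; the only required change is `ddl.rdd.map(...)` in place of `ddl.map(...)`.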
