I am trying to aggregate the data by order_date and order_status. I have written two combineByKey transformations; one of them works and the other fails.
Below is the data snapshot of the ordersjoined Dataframe: [Row(order_id=1, order_date=1374735600000, order_customer_id=11599, order_status=u'CLOSED', order_item_id=1, order_item_order_id=1, order_item_product_id=957, order_item_quantity=1, order_item_subtotal=299.98001098632812, order_item_product_price=299.98001098632812), Row(order_id=2, order_date=1374735600000, order_customer_id=256, order_status=u'PENDING_PAYMENT', order_item_id=2, order_item_order_id=2, order_item_product_id=1073, order_item_quantity=1, order_item_subtotal=199.99000549316406, order_item_product_price=199.99000549316406), Row(order_id=2, order_date=1374735600000, order_customer_id=256, order_status=u'PENDING_PAYMENT', order_item_id=3, order_item_order_id=2, order_item_product_id=502, order_item_quantity=5, order_item_subtotal=250.0, order_item_product_price=50.0), Row(order_id=2, order_date=1374735600000, order_customer_id=256, order_status=u'PENDING_PAYMENT', order_item_id=4, order_item_order_id=2, order_item_product_id=403, order_item_quantity=1, order_item_subtotal=129.99000549316406, order_item_product_price=129.99000549316406), Row(order_id=4, order_date=1374735600000, order_customer_id=8827, order_status=u'CLOSED', order_item_id=5, order_item_order_id=4, order_item_product_id=897, order_item_quantity=2, order_item_subtotal=49.979999542236328, order_item_product_price=24.989999771118164), Row(order_id=4, order_date=1374735600000, order_customer_id=8827, order_status=u'CLOSED', order_item_id=6, order_item_order_id=4, order_item_product_id=365, order_item_quantity=5, order_item_subtotal=299.95001220703125, order_item_product_price=59.990001678466797), Row(order_id=4, order_date=1374735600000, order_customer_id=8827, order_status=u'CLOSED', order_item_id=7, order_item_order_id=4, order_item_product_id=502, order_item_quantity=3, order_item_subtotal=150.0, order_item_product_price=50.0), Row(order_id=4, order_date=1374735600000, order_customer_id=8827, order_status=u'CLOSED', order_item_id=8, 
order_item_order_id=4, order_item_product_id=1014, order_item_quantity=4, order_item_subtotal=199.91999816894531, order_item_product_price=49.979999542236328), Row(order_id=5, order_date=1374735600000, order_customer_id=11318, order_status=u'COMPLETE', order_item_id=9, order_item_order_id=5, order_item_product_id=957, order_item_quantity=1, order_item_subtotal=299.98001098632812, order_item_product_price=299.98001098632812), Row(order_id=5, order_date=1374735600000, order_customer_id=11318, order_status=u'COMPLETE', order_item_id=10, order_item_order_id=5, order_item_product_id=365, order_item_quantity=5, order_item_subtotal=299.95001220703125, order_item_product_price=59.990001678466797), Row(order_id=5, order_date=1374735600000, order_customer_id=11318, order_status=u'COMPLETE', order_item_id=11, order_item_order_id=5, order_item_product_id=1014, order_item_quantity=2, order_item_subtotal=99.959999084472656, order_item_product_price=49.979999542236328), Row(order_id=5, order_date=1374735600000, order_customer_id=11318, order_status=u'COMPLETE', order_item_id=12, order_item_order_id=5, order_item_product_id=957, order_item_quantity=1, order_item_subtotal=299.98001098632812, order_item_product_price=299.98001098632812), Row(order_id=5, order_date=1374735600000, order_customer_id=11318, order_status=u'COMPLETE', order_item_id=13, order_item_order_id=5, order_item_product_id=403, order_item_quantity=1, order_item_subtotal=129.99000549316406, order_item_product_price=129.99000549316406), Row(order_id=7, order_date=1374735600000, order_customer_id=4530, order_status=u'COMPLETE', order_item_id=14, order_item_order_id=7, order_item_product_id=1073, order_item_quantity=1, order_item_subtotal=199.99000549316406, order_item_product_price=199.99000549316406), Row(order_id=7, order_date=1374735600000, order_customer_id=4530, order_status=u'COMPLETE', order_item_id=15, order_item_order_id=7, order_item_product_id=957, order_item_quantity=1, 
order_item_subtotal=299.98001098632812, order_item_product_price=299.98001098632812), Row(order_id=7, order_date=1374735600000, order_customer_id=4530, order_status=u'COMPLETE', order_item_id=16, order_item_order_id=7, order_item_product_id=926, order_item_quantity=5, order_item_subtotal=79.949996948242188, order_item_product_price=15.989999771118164), Row(order_id=8, order_date=1374735600000, order_customer_id=2911, order_status=u'PROCESSING', order_item_id=17, order_item_order_id=8, order_item_product_id=365, order_item_quantity=3, order_item_subtotal=179.97000122070312, order_item_product_price=59.990001678466797), Row(order_id=8, order_date=1374735600000, order_customer_id=2911, order_status=u'PROCESSING', order_item_id=18, order_item_order_id=8, order_item_product_id=365, order_item_quantity=5, order_item_subtotal=299.95001220703125, order_item_product_price=59.990001678466797), Row(order_id=8, order_date=1374735600000, order_customer_id=2911, order_status=u'PROCESSING', order_item_id=19, order_item_order_id=8, order_item_product_id=1014, order_item_quantity=4, order_item_subtotal=199.91999816894531, order_item_product_price=49.979999542236328), Row(order_id=8, order_date=1374735600000, order_customer_id=2911, order_status=u'PROCESSING', order_item_id=20, order_item_order_id=8, order_item_product_id=502, order_item_quantity=1, order_item_subtotal=50.0, order_item_product_price=50.0)]
The following combineByKey transformation works:
rddresult=ordersjoined.map(lambda x:((x[1],x[3]),(float(x[8]),str(x[0])))).combineByKey(lambda x:(x[0],set(x[1])),lambda x,y: (x[0]+y[0],x[1].union(set(y[1]))),lambda x,y:(x[0]+y[0],x[1].union(y[1])))
The following combineByKey transformation fails:
rddresult=ordersjoined.map(lambda x:((x[1],x[3]),(float(x[8]),str(x[0])))).combineByKey(lambda x:(x[0],set(x[1])),lambda x,y: (x[0]+y[0],x[1].add(y[1])),lambda x,y:(x[0]+y[0],x[1].union(y[1])))
I get the following error message:
AttributeError: 'NoneType' object has no attribute 'add'
I would like to understand what is wrong with the failed combineByKey transformation.
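To narrow this down, here is a minimal sketch in plain Python (no Spark) that seems to reproduce the same error. It assumes that, within a partition, combineByKey applies the merge-value function repeatedly to the same accumulator for a key. The names create_combiner, merge_value_add, merge_value_union and fold are hypothetical helpers written for this illustration, and the sample values are rounded from the snapshot above.

```python
def create_combiner(v):
    # v is (subtotal, order_id). The id is wrapped as {v[1]} here; note that
    # set('10') would instead split the string into {'1', '0'}.
    return (v[0], {v[1]})

def merge_value_add(acc, v):
    # set.add mutates the set in place and returns None, so the second
    # element of the returned tuple is None, not the updated set.
    return (acc[0] + v[0], acc[1].add(v[1]))

def merge_value_union(acc, v):
    # set.union returns a new set, so the accumulator stays a (float, set) pair.
    return (acc[0] + v[0], acc[1].union({v[1]}))

def fold(values, merge_value):
    # Hypothetical stand-in for what happens to one key inside one partition.
    acc = create_combiner(values[0])
    for v in values[1:]:
        acc = merge_value(acc, v)
    return acc

# Three items of order 2, all under the same (order_date, order_status) key.
values = [(199.99, '2'), (250.0, '2'), (129.99, '2')]

ok = fold(values, merge_value_union)

# After a single merge with add, the set slot has already been lost.
after_one = merge_value_add(create_combiner(values[0]), values[1])

try:
    fold(values, merge_value_add)
    error = None
except AttributeError as exc:
    error = str(exc)

print(ok[1], after_one[1], error)
```

If this sketch matches what Spark does, the first merge with add already stores None in place of the set, and the second merge for the same key then calls add on None. The union variant would avoid this because it returns the new set rather than None.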