I have a products table that looks like the one below:

+---------------------------+--------------------------------+--------------------------------+
|    name                   |  review                        | word_count                     |
+---------------------------+--------------------------------+--------------------------------+
|                           |                                | {'and': 5, 'wipes': 1,         |
| Planetwise                |  These flannel wipes are OK,   | 'stink': 1, 'because' : 2, ... |
| Flannel Wipes             |  but in my opinion ...         |                                |
|                           |                                |                                |
+---------------------------+--------------------------------+--------------------------------+
|                           |                                | {'and': 3, 'love': 1,          |
| Planetwise                |  it came early and was not     | 'it': 2, 'highly': 1, ...      |
| Wipes Pouch               |  disappointed. i love ...      |                                |
|                           |                                |                                |
+---------------------------+--------------------------------+--------------------------------+
|                           |                                | {'shop': 1, 'noble': 1,        |
|                           |                                | 'is': 1, 'it': 1, 'as': ...    |
| A Tale of Baby's Days     |  Lovely book, it's bound       |                                |
|  with Peter Rabbit ...    |  tightly so you may no ...     |                                |
|                           |                                |                                |
+---------------------------+--------------------------------+--------------------------------+

Basically, the word_count column contains a dictionary (key: value) of word occurrences in the text of the review column.

Now I want to build a new column named and, which should contain the value stored under the key and in the word_count dictionary: if and exists as a key in word_count, then its value; if it doesn't exist as a key, then 0.

For the first 3 rows, the new and column looks something like this:

+------------+
|    and     |
+------------+
|            |
| 5          |
|            |
|            |
+------------+
|            |
| 3          |
|            |
|            |
+------------+
|            |
| 0          |
|            |
|            |
+------------+

I wrote this code and it's working correctly:

def wordcount(x):
    if 'and' in x:
        return x['and']
    else:
        return 0

products['and'] = products['word_count'].apply(wordcount)

My question: is there any way I can do this using a lambda?

What I've done so far is:

products['and'] = products['word_count'].apply(lambda x: 'and' in x.keys())

This returns only 0 or 1 in the column. What can I add to the line above so that products['and'] contains the value stored under the key and when it exists as a key in products['word_count']?

I'm using IPython Notebook and GraphLab.

rimonmostafiz

2 Answers

You have the right idea. Just return the value of x['and'] if it exists, otherwise 0.

For example:

import pandas as pd

data = {"word_count": [{"foo": 1, "and": 5},
                       {"foo": 1}]}
df = pd.DataFrame(data)
# Return the count stored under 'and' when the key is present, otherwise 0.
df.word_count.apply(lambda x: x['and'] if 'and' in x else 0)

Output:

0    5
1    0
Name: word_count, dtype: int64
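
The conditional in the lambda can also be written with dict.get and a default value. Here is a minimal sketch of that equivalence, using a plain list of dictionaries as a stand-in for the word_count column (the sample data is hypothetical, loosely based on the table in the question, and no DataFrame or SFrame is involved):

```python
# Hypothetical stand-in for the word_count column: one dict per review.
word_counts = [
    {"and": 5, "wipes": 1, "stink": 1, "because": 2},
    {"and": 3, "love": 1, "it": 2, "highly": 1},
    {"shop": 1, "noble": 1, "is": 1, "it": 1},
]

# Conditional expression: look the key up only if it is present.
with_conditional = list(map(lambda x: x["and"] if "and" in x else 0, word_counts))

# dict.get returns the default (0) when the key is missing.
with_get = list(map(lambda x: x.get("and", 0), word_counts))

print(with_conditional)  # [5, 3, 0]
print(with_get)          # [5, 3, 0]
```

Either lambda should be usable as the argument to apply on a column of dictionaries, assuming apply calls the function once per row.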
andrew_reece

I'm not sure what products['word_count'].apply(wordcount) does in GraphLab, but assuming it calls the function on each row's dictionary, you could spell out the same check in a lambda:

products['and'] = products['word_count'].apply(
    lambda wc: wc['and'] if 'and' in wc else 0)

That's kind of ugly and awkward, so I'd recommend calling the built-in dictionary get() method on each row instead, because it's debugged, shorter, easier to maintain, and quicker:

products['and'] = products['word_count'].apply(lambda wc: wc.get('and', 0))

Your fixation on using a lambda reminds me of what some call the Law of the Instrument: "...it is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail".
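
If the goal is to avoid writing a lambda at all, one possible approach is operator.methodcaller from the standard library, which builds a callable that invokes get() on whatever dictionary it receives. The sketch below uses a plain list of dictionaries with hypothetical sample data; whether GraphLab's apply accepts such a callable is an assumption that has not been checked here.

```python
from operator import methodcaller

# Hypothetical stand-in for the word_count column: one dict per review.
word_counts = [
    {"and": 5, "wipes": 1, "stink": 1, "because": 2},
    {"and": 3, "love": 1, "it": 2, "highly": 1},
    {"shop": 1, "noble": 1, "is": 1, "it": 1},
]

# Equivalent to lambda d: d.get("and", 0), but without a lambda.
count_and = methodcaller("get", "and", 0)
and_column = [count_and(wc) for wc in word_counts]
print(and_column)  # [5, 3, 0]

# The same pattern extends to several words at once.
columns = {
    word: [methodcaller("get", word, 0)(wc) for wc in word_counts]
    for word in ("and", "it")
}
print(columns)  # {'and': [5, 3, 0], 'it': [0, 2, 1]}
```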

martineau