I have a pandas DataFrame that looks like this:

Sentence                                         Target
We regret to inform you about the result.        1
We are glad to inform you about the result.      2
We would like to inform you about the result.    3
We are surprised to see the result.              4

I want a per-target word count that looks something like this:

Word         Target 1    Target 2    Target 3    Target 4
Result          1           1           1           1
Inform          1           1           1           0
Surprised       0           0           0           1

... and so on. How do I do this?

cs95
Jam
  • You will get a much more friendly reception and much better help here if you show what code you have tried so far and describe what problems you were having with it. Without code, your question looks like a request for a free homework-writing service and many people don't like that. – John1024 Feb 13 '18 at 05:27
  • Sorry, that was a typo. I meant to ask if it was with pandas. – cs95 Feb 13 '18 at 05:34
  • Yes, it's a pandas DataFrame. – Jam Feb 13 '18 at 05:36

1 Answer

You'll need to:

  1. remove punctuation and lowercase the data
  2. split on whitespace
  3. stack to create a series
  4. groupby on Target
  5. find the value_counts of words for each target
  6. unstack the result for your desired output

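# Assumes df has the Sentence and Target columns shown in the question.
# regex=True is required on pandas 2.0+, where str.replace defaults to literal matching.
(
    df.Sentence.str.replace(r'[^\w\s]', '', regex=True)  # 1. remove punctuation
      .str.lower()                                       # 1. lowercase
      .str.split(expand=True)                            # 2. split on whitespace
      .set_index(df.Target)                              # label each row with its Target
      .stack()                                           # 3. one word per row
      .groupby(level=0)                                  # 4. group by Target
      .value_counts()                                    # 5. count words per target
      .unstack(0, fill_value=0)                          # 6. targets become columns
      .add_prefix('Target ')
)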


Target     Target 1  Target 2  Target 3  Target 4
about             1         1         1         0
are               0         1         0         1
glad              0         1         0         0
inform            1         1         1         0
like              0         0         1         0
regret            1         0         0         0
result            1         1         1         1
see               0         0         0         1
surprised         0         0         0         1
the               1         1         1         1
to                1         1         1         1
we                1         1         1         1
would             0         0         1         0
you               1         1         1         0
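
For comparison, here is a minimal alternative sketch (not part of the answer above) that reaches the same table with `explode` and `pd.crosstab`. The DataFrame is rebuilt from the sample in the question, and the names `words`, `long_df` and `counts` are only illustrative. It assumes a reasonably recent pandas where `DataFrame.explode` and the `regex` keyword of `str.replace` are available.

```python
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame({
    "Sentence": [
        "We regret to inform you about the result.",
        "We are glad to inform you about the result.",
        "We would like to inform you about the result.",
        "We are surprised to see the result.",
    ],
    "Target": [1, 2, 3, 4],
})

# Strip punctuation, lowercase, and split each sentence into a list of words.
words = (
    df["Sentence"]
    .str.replace(r"[^\w\s]", "", regex=True)
    .str.lower()
    .str.split()
)

# One row per (Target, word) pair; reset the index so it is unique again.
long_df = (
    df[["Target"]]
    .assign(word=words)
    .explode("word")
    .reset_index(drop=True)
)

# Cross-tabulate words against targets; absent combinations are counted as 0.
counts = pd.crosstab(long_df["word"], long_df["Target"]).add_prefix("Target ")
print(counts)
```

The long format (`long_df`) can be handy if you also want to filter stop words such as "the" or "to" before counting.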
cs95