How to count the number of words in a sentence and append to a list

Question

I have a list of sentences, say:

["Hello all, how are you doing?", "Hi all, wassup", "Namaste", "Bonjour, ca va", "Privet, kak dela?"...]

And I want to count the number of words per sentence and plot a histogram.

When I am counting individual items like:

    seq = []
    seq.append(len(X_train[0].split()))
    seq

It gives me the result, which is fine. But when I try it on the whole list of 28 sentences:

    seq = [len(sentence.split()) for sentence in X_train]

I get the following error:

    AttributeError                            Traceback (most recent call last)
    <ipython-input-100-d9dec14bd2dd> in <module>()
    ----> 1 num_words = [len(sentence.split()) for sentence in X_train]
          2 #pd.Series(seq_len).hist(bins = 30)

    <ipython-input-100-d9dec14bd2dd> in <listcomp>(.0)
    ----> 1 num_words = [len(sentence.split()) for sentence in X_train]
          2 #pd.Series(seq_len).hist(bins = 30)

    AttributeError: 'float' object has no attribute 'split'

I have no clue why. Can you please explain?

Thanks!

What is `X_train`? The error seems to indicate it is a sequence of `float`? — Cory Kramer, Jun 09 '21 at 18:21
It looks like your sequence `for sentence in X_train` has at least one float in it. Because of that, at some point `sentence` is a float, not a string, and a float has no `split()` method. — AwesomeCronk, Jun 09 '21 at 18:22
`sentence` is of type `float` and not `str`. You cannot use a string method on a float. — stefan_aus_hannover, Jun 09 '21 at 18:22
Try `str(sentence).split()`, or if you want to ignore the float entries in your list, then add the condition `if isinstance(sentence, str)` to the list comprehension. — Tanmay Shrivastava, Jun 09 '21 at 18:27
[Catch the error](https://docs.python.org/3/tutorial/errors.html#handling-exceptions) and inspect/print relevant data in the except suite. When you inspected/printed values and/or conditions at various points in your program was there an obvious place where it was misbehaving? If you are using an IDE **now** is a good time to learn its debugging features. [What is a debugger and how can it help me diagnose problems?](https://stackoverflow.com/questions/25385173/what-is-a-debugger-and-how-can-it-help-me-diagnose-problems) — wwii, Jun 09 '21 at 18:40
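
As a rough illustration of the inspection suggested in the comments above, the non-string entries can be listed together with their positions. The `X_train` below is a made-up stand-in for the asker's data, and the `float("nan")` entry is only an assumption about what the offending float might look like; the question does not show it.

```python
# Hypothetical stand-in for X_train; float("nan") mimics a possible missing value.
X_train = ["Hello all, how are you doing?", "Hi all, wassup", float("nan"), "Namaste"]

# Collect (index, value) pairs for every entry that is not a string.
non_strings = [(i, s) for i, s in enumerate(X_train) if not isinstance(s, str)]

for index, value in non_strings:
    print(f"entry {index} is a {type(value).__name__}: {value!r}")
```

Knowing which positions hold floats makes it easier to decide whether to convert, skip, or fix those entries at the source.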

score 0 · Accepted Answer · answered Jun 09 '21 at 18:26

The code works for the given example, but it seems there is a float in the 28-item `X_train` list, so I suggest converting each sentence to a string before splitting:

    seq = [len(str(sentence).split()) for sentence in X_train]
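
One caveat with `str(sentence)`: a float entry is turned into text and counted as a word (for example, a NaN becomes the string `"nan"`, which counts as one word). Below is a minimal sketch comparing that with the `isinstance` filter from the comments. The `X_train` here is a made-up stand-in, and treating the float as a NaN is an assumption, since the question does not show the actual value.

```python
# Hypothetical stand-in for X_train; float("nan") mimics a possible missing value.
X_train = ["Hello all, how are you doing?", "Hi all, wassup", float("nan"), "Namaste"]

# Option 1: convert everything to str first; the float is counted as one word.
counts_converted = [len(str(sentence).split()) for sentence in X_train]

# Option 2: skip anything that is not a string; the float is left out entirely.
counts_filtered = [len(sentence.split()) for sentence in X_train if isinstance(sentence, str)]

print(counts_converted)  # [6, 3, 1, 1]
print(counts_filtered)   # [6, 3, 1]
```

Which option fits depends on whether the float entries should show up in the histogram as one-word sentences or be dropped.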

Ahmed Sabry

Thanks. I had to use str twice - both in str(X_train) and like you mentioned. But it worked. Thanks again. – K C Jun 09 '21 at 18:47
