
I have a list of sentences, say:

["Hello all, how are you doing?", "Hi all, wassup", "Namaste", "Bonjour, ca va", "Privet, kak dela?"...]

And I want to count the number of words per sentence and plot a histogram.

When I count the words in a single item like this:

    seq = []
    seq.append(len(X_train[0].split()))
    seq

That gives me the result, which is fine. But when I try it on the whole list of 28 sentences:

    seq = [len(sentence.split()) for sentence in X_train]

I get the following error:

    AttributeError                            Traceback (most recent call last)
    <ipython-input-100-d9dec14bd2dd> in <module>()
    ----> 1 num_words = [len(sentence.split()) for sentence in X_train]
          2 #pd.Series(seq_len).hist(bins = 30)

    <ipython-input-100-d9dec14bd2dd> in <listcomp>(.0)
    ----> 1 num_words = [len(sentence.split()) for sentence in X_train]
          2 #pd.Series(seq_len).hist(bins = 30)

    AttributeError: 'float' object has no attribute 'split'

I have no clue why. Can you please explain?

Thanks!

ThePyGuy
K C
  • What is `X_train`? The error seems to indicate it is a sequence of `float`? – Cory Kramer Jun 09 '21 at 18:21
  • It looks like your sequence in `for sentence in X_train` has at least one float in it. Because of that, at some point `sentence` is a float, not a string, and a float has no `split()` method. – AwesomeCronk Jun 09 '21 at 18:22
  • `sequence` is of type `float` and not `string`. You cannot use a string method on a float. – stefan_aus_hannover Jun 09 '21 at 18:22
  • Try `str(sentence).split()`, or, if you want to ignore those float entries in your list, then add the condition `if isinstance(sentence, str)` to the list comprehension. – Tanmay Shrivastava Jun 09 '21 at 18:27
  • [Catch the error](https://docs.python.org/3/tutorial/errors.html#handling-exceptions) and inspect/print relevant data in the except suite. When you inspected/printed values and/or conditions at various points in your program, was there an obvious place where it was misbehaving? If you are using an IDE, **now** is a good time to learn its debugging features. [What is a debugger and how can it help me diagnose problems?](https://stackoverflow.com/questions/25385173/what-is-a-debugger-and-how-can-it-help-me-diagnose-problems) – wwii Jun 09 '21 at 18:40
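
The suggestions in the comments above (inspect the data, then filter with `isinstance`) could be sketched roughly as follows. The `X_train` here is a hypothetical stand-in, since the real data is not shown; it assumes the offending entry is a `NaN`, which is a `float` and is one common way a float ends up among strings, though the actual cause in the question is not confirmed.

```python
# Hypothetical stand-in for X_train: three sentences plus one missing value.
# float("nan") is a float, so calling .split() on it raises AttributeError.
X_train = ["Hello all, how are you doing?", "Hi all, wassup", float("nan"), "Namaste"]

# Locate every entry that is not a string, together with its position.
non_strings = [(i, s) for i, s in enumerate(X_train) if not isinstance(s, str)]
print(non_strings)

# Count words only for entries that really are strings.
num_words = [len(s.split()) for s in X_train if isinstance(s, str)]
print(num_words)
```

Printing the positions first shows which rows are affected before deciding whether to drop them or convert them.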

1 Answer


The code works for the given example, but it seems there is a float somewhere in the 28-item `X_train` list, so I suggest converting each sentence to a string before splitting:

    seq = [len(str(sentence).split()) for sentence in X_train]
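
One side effect of the `str()` conversion is worth checking before plotting: a float entry is turned into text and then counted as a word. The sketch below uses a hypothetical three-item sample (not the original data) and assumes the float is a `NaN`; it compares the `str()` approach with an alternative that gives non-string entries a count of zero.

```python
# Hypothetical sample; the middle entry stands in for a missing value.
X_train = ["Bonjour, ca va", float("nan"), "Privet, kak dela?"]

# str() turns NaN into the text "nan", which splits into a single "word".
with_str = [len(str(s).split()) for s in X_train]

# Alternative: count zero words for anything that is not a string.
zero_for_missing = [len(s.split()) if isinstance(s, str) else 0 for s in X_train]

print(with_str)
print(zero_for_missing)
```

Which variant is appropriate depends on whether the missing entries should appear in the histogram at all.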

Ahmed Sabry
  • Thanks. I had to use str twice - both in str(X_train) and like you mentioned. But it worked. Thanks again. – K C Jun 09 '21 at 18:47