I have to write a function count_words() that takes a list of strings and returns the number of distinct words in that list as an int. The list looks like this:

List = ['twas', 'brillig', 'and', 'the', 'slithy', 'toves', 'did', 'gyre',
        'and', 'gimble', 'in', 'the', 'wabe', 'all', 'mimsy']

I have tried doing it using this code:

def count_words(url): #this is the first line of the code but it was not included with the lines below for some reason.

    mylist = function(url)  #the function function(url) reads through the url and returns all words from the website in a list of strings. 
    counts = 0
    for i in mylist:
        if i not in mylist:
            counts = counts + 1
        else:
            continue
        return counts

From here I do not know what to do. I am getting an error for the line that says 'for i in mylist' and I don't know how to fix it. I am still a beginner, so very basic answers will do. I do not mind if I have to change my whole code. The only thing I cannot change is the 'mylist = function(url)' line, because that part works and we have to include it.

The error i get back is:

 Traceback (most recent call last):
    File "<web session>", line 1, in <module>
    File "/home/karanyos/foc/proj1-karanyos/karanyos.py", line 24, in count_words
       for i in mylist:
 TypeError: 'NoneType' object is not iterable

Thanks in advance,

Keely

Alexey Frunze
Keely Aranyos
  • Why did you edit your answer to make your code incorrectly formatted, *after I formatted it correctly*? – Marcin Apr 10 '12 at 11:40
  • What is the error message, then? Always copy the complete error message if you want to ask why your program is considered erroneous. – gefei Apr 10 '12 at 11:43
  • This is the error i get for that line: Traceback (most recent call last): File "", line 1, in File "/home/karanyos/foc/proj1-karanyos/karanyos.py", line 22, in count_words for word in mylist: TypeError: 'NoneType' object is not iterable – Keely Aranyos Apr 10 '12 at 11:46
  • then it seems that the call "function(url)" is in fact wrong ("function" is a very bad name of a function, BTW). I guess in this function there is no "return"? – gefei Apr 10 '12 at 11:48
  • Your error message means exactly what it says: `mylist` is a `NoneType` kind of thing, i.e. it is `None` (which is not `iterable`, i.e. usable as a thing to iterate over with a `for` loop). That is because `function(url)` returned `None`. You could have trivially verified this for yourself with `print mylist` or similar, and seen that you don't actually have what you expected. – Karl Knechtel Apr 10 '12 at 11:51
  • @user1180720: Keely hasn't shown us the full code. She's omitted the definition of `function()` because it's not relevant here, which is generally a good thing - it makes it more clear which part she's having trouble with. – Li-aung Yip Apr 10 '12 at 11:51
  • i changed the name of function() and there is a return isnt there?? isnt it return counts?? – Keely Aranyos Apr 10 '12 at 11:51
  • @KeelyAranyos Look at this: http://stackoverflow.com/posts/10087933/revisions Your code was not formatted; I formatted it; you broke it; Li-aung Yip corrected it again. – Marcin Apr 10 '12 at 11:52
  • @KeelyAranyos: your `count_words()` function returns `counts`, but your `function()` function isn't returning anything. Quite likely this is because there's a bug in `function()` or because you provided a bad `url` (maybe the `url` is 404). Without seeing what's inside `function()` we can't say more. – Li-aung Yip Apr 10 '12 at 11:55
  • when i put the url our lecturer gave us into function() it gives me back a list of strings. each string is a word as the url we were given was a poem. the function also strips off all punctuation from each string and makes all letters lower case. when i try this function by its self its works fine. – Keely Aranyos Apr 10 '12 at 12:01
  • By 'gives me back' do you mean `function()` *returns* a list of strings, or *prints* a list of strings? As @KarlKnechtel has said, the error message implies that `function()` returns `None`. (Also, please tell your lecturer not to use `function` as a function name. It's horrible.) – Li-aung Yip Apr 10 '12 at 12:12
  • yes i realise that function is a terrible name and that was my name not my lecturer's but i changed it. when i print function() it prints the list of strings and then says None – Keely Aranyos Apr 10 '12 at 12:15
  • That indicates to me that `function()` is not actually *returning* anything - it's just printing the words to the screen and then throwing them away. – Li-aung Yip Apr 10 '12 at 12:25
  • "when i print function() it prints the list of strings and then says None": a very clear sign that function only prints the list of strings, and does not return anything. You do need a "return" statement in your "function" function. – gefei Apr 10 '12 at 12:27
  • i changed the print statement to a return statement and now when i print the function() it returns the list of strings – Keely Aranyos Apr 10 '12 at 12:35
  • and now your count_words is supposedly executable without error---although its logic is wrong. but for that just consult the answers below – gefei Apr 10 '12 at 12:42
  • when i put in count_words() it returns 0 – Keely Aranyos Apr 10 '12 at 12:47
  • yes, that's a logic problem of your code. just study the answers to fix it – gefei Apr 10 '12 at 13:01
  • i have looked at the answers and they dont really help me, i dont understand them – Keely Aranyos Apr 10 '12 at 13:27
  • Hint: consider what your loop is *doing*. Step through it with a debugger if you have to. Then consider the logical truth value of `'the' in ['the','jabberwocky',...]`. Finally, explain why your current logic never reaches the line `count = count + 1`. – Li-aung Yip Apr 10 '12 at 13:41
  • i changed if i not in mylist: to if i in mylist == False thinking that might work but then it still will not work, im really confused and cannot figure out how to change it so it works – Keely Aranyos Apr 10 '12 at 13:48
  • i had a look at the collections module and kind of got how it worked but when i tried to use it within my code it did not work. also my final answer needs to be a int. the int being the number of distinct words in it – Keely Aranyos Apr 10 '12 at 13:52
  • `x == False` and `not(x)` have exactly the same logical meaning. Changing the syntax of the logical test doesn't change the fact your underlying logic is wrong. Again: see my edited answer. – Li-aung Yip Apr 10 '12 at 13:52
  • then im not sure how to change it so it goes through the list properly – Keely Aranyos Apr 10 '12 at 14:00
  • My advice is this: 1) Think about how you would do it on paper. 2) Write down all the steps you would take to do it using pen and paper. 3) **Then** try and implement it in code. Also see the fourth and final edit to my answer - I think I've given you enough hints. You're on your own now! – Li-aung Yip Apr 10 '12 at 14:08
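
A note on the `if i in mylist == False` attempt from the comments above: Python chains comparison operators, and `in` is one of them, so that expression is read as `(i in mylist) and (mylist == False)`, not as `(i in mylist) == False`. The sketch below (Python 3 syntax) uses a made-up `words` list and made-up variable names purely for illustration:

```python
# Hypothetical sample data, not the real word list from the assignment.
words = ['the', 'jabberwock', 'the']
word = 'the'
missing = 'snark'

negated_test = word not in words             # False: 'the' is in the list
chained_test = word in words == False        # (word in words) and (words == False)
grouped_test = (word in words) == False      # same meaning as `word not in words`

chained_missing = missing in words == False    # first half is already False
grouped_missing = (missing in words) == False  # True: 'snark' is not in the list

print(negated_test, chained_test, grouped_test)
print(chained_missing, grouped_missing)
```

Because a list never equals `False`, the chained form comes out False whether or not the word is present, which is one more reason the counter stayed at zero.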

4 Answers

3

Hint: use the collections module.

As for your code, some additional hints on style and other matters:

  • Don't use the word function as the name for a function. It is not a reserved word in Python, but it is so generic that it says nothing about what the function actually does.
  • Don't use single-letter names (i) for loop variables** - use a descriptive name. Here for word in mylist: would be appropriate.
  • Your code has a logic error - every word you take out of mylist is, by definition, in mylist, so word not in mylist is always False and counts will never get past zero.

** Sidenote: single-letter variable names are bad style because they provide no information about what the variable means, or what it's supposed to contain. I personally only consider n, m, p and i, j, k to be acceptable loop variable names in mathematical code, and only then when used in the same way mathematicians use n,m,p i,j,k. This is for historical reasons.


A hint towards finding your logic error:

# Relevant part of your code (with the test written as `in` rather than `not in`)
my_list = ['a','b','c','d']
for item in my_list:
    if item in my_list:
        print("item %s in list" % item)
    else:
        print("item %s not in list" % item)

The output is:

item a in list
item b in list
item c in list
item d in list

This is because the code above is a tautology: You're taking a value from a list, and then immediately asking if that value occurs in that list. The answer is always going to be "yes".

This is not really the logical test you wanted. What you want is some way of keeping track of which words you've already seen, so that each new word is compared against those instead of against the whole list. Or possibly you just need a magical piece of code which will keep track of all the unique words for you? (Hint: look in the collections module.)
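
One way the "keep track of what you've seen" idea could look, as a minimal sketch rather than the only possible answer. The names `count_distinct`, `seen` and `sample` are invented for this example, and the function takes the word list directly instead of a URL:

```python
def count_distinct(words):
    """Return how many different words appear in `words`."""
    seen = []                    # every word met so far, stored once each
    for word in words:
        if word not in seen:     # compare against `seen`, not against `words`
            seen.append(word)
    return len(seen)             # the return sits after the loop, not inside it


# The sample list from the question.
sample = ['twas', 'brillig', 'and', 'the', 'slithy', 'toves', 'did', 'gyre',
          'and', 'gimble', 'in', 'the', 'wabe', 'all', 'mimsy']
print(count_distinct(sample))  # 13
```

Note the position of the return statement: if it is indented inside the for loop, the function stops after looking at the first word only.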

Generally speaking, you would also be well served by learning how to use a debugger. This will let you see into the intermediate states of the program as it executes. Spyder is a Python IDE with pdb debugger integration (and a lot of other nice features.) Check it out.


Edit 4: You mention that you tried using the collections module - good on you! - but that the output was unsuitable because you "need to return an int".

Meditate on this:

>>> import collections
>>> my_string = "abc aabc ccab a acbbbaa"
>>> my_counter = collections.Counter(my_string)
>>> my_counter
Counter({'a': 8, 'b': 6, 'c': 5, ' ': 4}) 
>>> my_counter.keys() # Get a list of unique things in the counter
['a', ' ', 'c', 'b']
>>> 

Do you know how to determine how many things are in a list?
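
For reference, a small sketch of where that question leads, written for Python 3. It assumes the Counter is built from a list of words rather than from a string (the string example above counts single characters); the names `words` and `word_counter` are made up for illustration:

```python
import collections

# Shortened, hypothetical sample; the real list comes from function(url).
words = ['twas', 'brillig', 'and', 'the', 'and', 'the']
word_counter = collections.Counter(words)

# len() works on the list of keys, and on the Counter itself.
distinct_via_keys = len(list(word_counter.keys()))
distinct_direct = len(word_counter)

print(distinct_via_keys, distinct_direct)  # 4 4
```

Either value is a plain int, which is what the assignment asks for.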

Hint 2: You can see the attributes of an object by calling dir() on it. If you don't know what you are allowed to do to an object, or what methods you can call on an object, do this to find out:

>>> dir(my_counter)
['__add__', '__and__', '__class__', '__cmp__', '__contains__', '__delattr__',
 '__delitem__', '__dict__', '__doc__', '__eq__', '__format__', '__ge__',
 '__getattribute__', '__getitem__', '__gt__', '__hash__', '__init__',
 '__iter__', '__le__', '__len__', '__lt__', '__missing__', '__module__',
 '__ne__', '__new__', '__or__', '__reduce__', '__reduce_ex__', '__repr__',
 '__setattr__', '__setitem__', '__sizeof__', '__str__', '__sub__',
 '__subclasshook__', '__weakref__', 'clear', 'copy', 'elements', 'fromkeys',
 'get', 'has_key', 'items', 'iteritems', 'iterkeys', 'itervalues', 'keys',
 'most_common', 'pop', 'popitem', 'setdefault', 'subtract', 'update', 'values',
 'viewitems', 'viewkeys', 'viewvalues']
Li-aung Yip
  • ok i changed the (i) to word and used the True thing but the same error still happens – Keely Aranyos Apr 10 '12 at 11:48
  • And what is that error message? When you say "I get an error", **post the error message**; this helps us figure out what's going on. (Don't post the error message in comments; instead, edit your original post to add the new information.) – Li-aung Yip Apr 10 '12 at 11:50
  • the error message is above in the answer above but if u want me to post it here let me know – Keely Aranyos Apr 10 '12 at 12:02
  • For future reference - the correct way to post new information is to **edit your original question.** The reasons for this include: 1) it's easy to lose information in the comments, 2) long discussions in the comments are hard to keep track of, and 3) future answers (and readers!) will find the question more informative if all the information is in one place. – Li-aung Yip Apr 10 '12 at 12:15
  • ok will do next time :) and put the error i got in the question. – Keely Aranyos Apr 10 '12 at 12:17
2

(1) The collections library has a class which allows you to do just this.

(2) If you want to implement this functionality yourself, just use a set, and take its len.
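
A minimal sketch of option (2), keeping the required mylist = function(url) line. The function() defined here is only a stand-in that ignores its argument and returns the sample list from the question; the asker's real helper and the placeholder URL are assumptions for the sake of a runnable example:

```python
def function(url):
    # Stand-in for the asker's real helper (assumption): the real one
    # downloads the page at `url`; this one just returns a fixed list.
    return ['twas', 'brillig', 'and', 'the', 'slithy', 'toves', 'did', 'gyre',
            'and', 'gimble', 'in', 'the', 'wabe', 'all', 'mimsy']


def count_words(url):
    mylist = function(url)      # the line that must stay as it is
    return len(set(mylist))     # a set keeps one copy of each word


print(count_words("http://example.invalid/poem.txt"))  # 13
```

This only works once function() really returns the list; if it prints the list instead, mylist is None and set(None) raises a TypeError.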

Marcin
1

Back to basics,

If you are using an IDE (say IDLE), learn how to debug your code. You can start getting your hands dirty with pdb.

Sometimes just logging with simple print statement would be enough to figure out the root cause.

  1. What is the value of the variable mylist just after calling function(url)
  2. What does the error message say? Do you see something like TypeError: 'NoneType' object is not iterable?
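
The two checks above can be sketched as follows (Python 3 syntax). The helpers `show_words` and `get_words` are hypothetical; they only illustrate the difference between a function that prints its result and one that returns it:

```python
def show_words(text):
    print(text.split())      # displays the list, then implicitly returns None


def get_words(text):
    return text.split()      # hands the list back to the caller


printed = show_words("twas brillig and the slithy toves")
returned = get_words("twas brillig and the slithy toves")

# Inspect what each call actually gave back before looping over it.
print(type(printed).__name__)    # NoneType
print(type(returned).__name__)   # list

try:
    for word in printed:         # looping over None fails
        pass
except TypeError as error:
    print("TypeError:", error)
```

Seeing the list on screen followed by None is the tell-tale sign of the first kind of function.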

Solving your problem: people coming from other languages often overlook the data structures and libraries that Python provides.

Do you know there is a built-in type called set, which keeps only the unique items from a list that contains duplicates? Do you know there is a Python built-in function len, which returns the length of an object?

If you still have trouble getting this done, please start over with An Informal Introduction to Python.

Abhijit
  • the value of the variable mylist returns a list of strings. That is the exact error im seeing. – Keely Aranyos Apr 10 '12 at 12:04
  • @KeelyAranyos: You didn't state the exact error message. The Error Message may look like ` Traceback (most recent call last): File "?????>", line ?, in for i in mylist: TypeError: Some Error Message`. And Please mention in your question, what you see when you do `print function(url)` – Abhijit Apr 10 '12 at 12:07
  • we were given the url http://ww2.cs.mu.oz.au/~tim/jabberwocky.txt when i put that into function() like this: function("http://ww2.cs.mu.oz.au/~tim/jabberwocky.txt") I got back: ['twas', 'brillig', 'and', 'the', 'slithy', 'toves', 'did', 'gyre', 'and', 'gimble', 'in', 'the', 'wabe', 'all', 'mimsy', 'were', 'the', 'borogoves', 'and', 'the', 'mome', 'raths', 'outgrabe', '"beware', 'the', 'jabberwock', 'my', 'son', 'the', 'jaws', 'that', 'bite', 'the', 'claws', 'that', 'catch', 'beware', 'the', 'jubjub', 'bird', 'and', 'shun', 'the', 'frumious'] #the list is a lot longer but it would not fit here – Keely Aranyos Apr 10 '12 at 12:10
  • so? whats the O/P of `print function('ww2.cs.mu.oz.au/~tim/jabberwocky.txt')` – Abhijit Apr 10 '12 at 12:11
  • check my post about your last one – Keely Aranyos Apr 10 '12 at 12:16
  • and what happened when you do `print count_words('ww2.cs.mu.oz.au/~tim/jabberwocky.txt')`. Please mention what ever you see in your console. – Abhijit Apr 10 '12 at 12:17
  • it returns the lsit of strings and then Traceback (most recent call last): File "", line 1, in File "/home/karanyos/foc/proj1-karanyos/karanyos.py", line 21, in count_words for word in mylist: TypeError: 'NoneType' object is not iterable – Keely Aranyos Apr 10 '12 at 12:29
  • Looks like `function('ww2.cs.mu.oz.au/~tim/jabberwocky.txt')` is not returning the list of words but rather printing it on the screen. Ask your lecturer to change the function so that it returns the list rather than printing it on the screen – Abhijit Apr 10 '12 at 12:34
  • i changed it and now when i put in function('ww2.cs.mu.oz.au/~tim/jabberwocky.txt') it returns the list of functions but when i do count_words("ww2.cs.mu.oz.au/~tim/jabberwocky.txt") it returns 0 – Keely Aranyos Apr 10 '12 at 12:39
0
import collections

s = collections.Counter(['twas', 'brillig', 'and', 'the', 'slithy', 'toves', 'did', 'gyre', 'and', 'gimble', 'in', 'the', 'wabe', 'all', 'mimsy'])

This would give

Counter({'and': 2, 'the': 2, 'slithy': 1, 'brillig': 1, 'gyre': 1, 'gimble': 1, 'did': 1, 'in': 1, 'all': 1, 'toves': 1, 'mimsy': 1, 'twas': 1, 'wabe': 1})

Each key of s is one distinct word, so you can get your result from here by counting the keys:

>>> count = 0
>>> for a in s:
...     count = count + 1
...
>>> print(count)
13
Zain Khan
  • this would work but i need the function to return an int, for this list for example i need it to return the number 13(i think thats how many distinct words there are) – Keely Aranyos Apr 10 '12 at 11:50
  • when i try using collections.counter it gives me an error: Traceback (most recent call last): File "", line 1, in AttributeError: 'module' object has no attribute 'Counter' I have imported it already though – Keely Aranyos Apr 10 '12 at 13:24
  • collections.Counter() is the command if you are using import collections – Zain Khan Apr 10 '12 at 14:27
  • yeah i have been using that but when it try it gives me an error: >>> collections.Counter(mylist) Traceback (most recent call last): File "", line 1, in AttributeError: 'module' object has no attribute 'Counter' – Keely Aranyos Apr 11 '12 at 00:57
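
A possible explanation for the AttributeError in the last comments: collections.Counter was added in Python 2.7, so on an older interpreter the collections module imports fine but has no Counter attribute. That is an inference from the error text, not something the thread confirms. A sketch that copes with either case, using a made-up helper name `tally`:

```python
import collections


def tally(words):
    """Return a dict mapping each word to how often it occurs."""
    counter_class = getattr(collections, 'Counter', None)
    if counter_class is not None:
        return dict(counter_class(words))    # Counter is available
    counts = {}                              # fallback for old interpreters
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    return counts


counts = tally(['and', 'the', 'and', 'wabe'])
print(len(counts))  # 3
```

The number of distinct words is the number of keys in the result, whichever branch produced it.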