
I know it's impolite to include this much code in a question, but this is all necessary to explain the error. Apologies!

I'm writing a Twitter bot in Python (using Twython), which is simply supposed to follow anybody who follows my account.

It reads in two text files, one of friends (people I'm following) and one of followers, using this code:

followers = []
friends = []
followers_old = []
friends_old = []

with open('followers_old.txt') as fo:
    followers_old=[L[:-1] for L in fo.readlines()]

with open('friends_old.txt') as fr:
    friends_old=[L[:-1] for L in fr.readlines()]

It then downloads the data from Twitter in chunks of 200. For each chunk, it appends any members that are not already in the list; as soon as it meets one that is already in the list, it skips it and stops downloading further chunks. This is done with the following code:

while(next_cursor):
    get_followers = twitter.get_followers_list(screen_name=username,count=200,cursor=next_cursor)
    time.sleep(60)
    for follower in get_followers["users"]:
        if follower not in followers_old:
            followers.append(follower["screen_name"].encode("utf-8"))
            next_cursor = get_followers["next_cursor"]
        else:
            break

The above downloads followers. An identical bit of code does the same for friends.

Then, it should find members of 'followers' who are not members of 'friends', and follow them, using the following bit of code:

for fol in followers:
    if fol not in friends:
        twitter.create_friendship(fol)

Here is where I get an error. Twitter responds saying that I am trying to follow somebody I'm already following. I don't see how this can happen, given the 'if fol not in friends' line.

For those interested:

It finishes by appending the new followers and friends to the original text files read in at the start:

(For followers):

for fol in followers:
    fo.write("%s\n" % fol)

And then it does the same with friends.

Sorry for such a long question, I'd really appreciate the help.

Thanks, Alex.

EDIT:

Since there seems to be a bit of confusion caused by how I've summarised the question, here is the complete code:

twitter = Twython(CONSUMER_KEY,CONSUMER_SECRET,ACCESS_KEY,ACCESS_SECRET)

followers = [] 
friends = []
followers_old = []
friends_old = []

with open('followers_old.txt') as fo:
    followers_old=[L[:-1] for L in fo.readlines()]

with open('friends_old.txt') as fr:
    friends_old=[L[:-1] for L in fr.readlines()]

username = 'XXX'

next_cursor = -1
next_cursor_1 = -1

while(next_cursor):
    get_followers = twitter.get_followers_list(screen_name=username,count=200,cursor=next_cursor)
    time.sleep(60)
    for follower in get_followers["users"]:
        if follower not in followers_old:
            followers.append(follower["screen_name"].encode("utf-8"))
            next_cursor = get_followers["next_cursor"]
        else:
            break

while(next_cursor_1):
    get_friends = twitter.get_friends_list(screen_name=username,count=200,cursor=next_cursor)
    time.sleep(60)
    for friend in get_friends["users"]:
        if friend not in friends_old:
            friends.append(friend["screen_name"].encode("utf-8"))
            next_cursor_1 = get_friends["next_cursor_1"]
        else:
            break

for fol in followers:
    if fol not in friends:
        twitter.create_friendship(fol)

for fol in followers:
    fo.write("%s\n" % fol)

for fri in friends:
    fr.write("%s\n" % fri)

Based on the answers so far, I think the encoding is probably an issue. The files are initially blank, the script populates them entirely on its own, and updates them each time it runs.

I hope this makes it clearer, apologies for the ambiguity.

Alex
  • First, why are you using `readlines` here? `for L in fo.readlines()` does the same thing as `for L in fo`, except that it reads the entire file into memory and splits it into a big list in memory before looping, instead of reading things efficiently and feeding you a line at a time. – abarnert Oct 13 '14 at 23:03
  • Are you checking against the right list here: `if fol not in friends:`? I can see where you've filled the `friends_old` list, but in the code samples you've posted you don't seem to fill `friends`. – Marius Oct 13 '14 at 23:04
  • One thing: I can imagine that `L[:-1]` is there to remove the trailing newline. Instead of that, do `L.strip()` ;-) Oh, and do what @abarnert says! – Ricardo Cárdenes Oct 13 '14 at 23:05
  • Also, you almost certainly want sets here, not lists; otherwise, you're searching the list exhaustively for every follower, which is not only slower, but also more complicated. (With a set, you don't even need the search.) – abarnert Oct 13 '14 at 23:05
  • Also, you seem to be mixing Unicode and 8-bit values. Why is `followers` UTF-8-encoded `bytes`, while `followers_old` is Unicode? Or… is it actually Unicode? If this is Python 2.x, you could easily screw things up by treating 8-bit data as if it were Unicode or vice-versa. For example, `'åbcdé' in [u'åbcdé']` is False. – abarnert Oct 13 '14 at 23:09
  • @Marius For the sake of making the question shorter, I omitted it, but there is an identical bit of code that fills friends. – Alex Oct 13 '14 at 23:14
  • If you want us to debug this, you are going to have to show us what's in each of these lists (ideally from a stripped-down run with only about 3 or 4 of them instead of hundreds), by `print` or `logging` or by running in the debugger or whatever you prefer. – abarnert Oct 13 '14 at 23:16
  • Also, please tell us whether this is Python 2.x or 3.x, and what platform you're on. – abarnert Oct 13 '14 at 23:19
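
The file-reading suggestions in the comments above (iterate over the file directly, strip the line ending instead of slicing, collect the names into a set) could be sketched roughly as follows. The helper name `load_names` and the UTF-8 file encoding are assumptions for illustration, not part of the original script.

```python
import io


def load_names(path):
    """Return the screen names stored one per line in `path` as a set.

    Hypothetical helper; assumes the file was written as UTF-8.
    """
    names = set()
    # io.open decodes to Unicode text on both Python 2 and Python 3.
    with io.open(path, encoding="utf-8") as f:
        for line in f:  # iterates line by line, no readlines() needed
            name = line.strip()  # drops the newline and stray whitespace
            if name:  # skip blank lines
                names.add(name)
    return names
```

With this, `followers_old = load_names('followers_old.txt')` would replace the list comprehension over `readlines()`.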

3 Answers


There are at least two problems in your code, and I don't know which one is causing your error (if not both). I'll explain one here; see this answer for the other.

if follower not in followers_old:
    followers.append(follower["screen_name"].encode("utf-8"))

It looks like some of your lists hold UTF-8 byte strings, while others hold Unicode strings. In Python 2.x, you can sometimes get away with mixing and matching, as long as you stick to strings that happen to be the same in your default charset and UTF-8 (generally, that means pure ASCII), but it will break as soon as you violate that.

For example:

>>> friends = [u'åbcdé']
>>> follower = 'åbcdé'
>>> follower in friends
False

So, you will try to follow people who are already your friends because you will think nobody is already a friend, unless their name is pure ASCII.
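
One way to avoid the mismatch is to decode everything to Unicode text as soon as it enters the program, and to compare only text. A minimal sketch, assuming the byte strings are UTF-8; `to_text` is a hypothetical helper, not a Twython function:

```python
def to_text(value, encoding="utf-8"):
    """Return `value` as Unicode text, decoding it if it is a byte string."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Mixed input: one UTF-8 byte string and one Unicode string.
friends = [to_text(name) for name in [b"\xc3\xa5bcd\xc3\xa9", u"bob"]]
follower = to_text(b"\xc3\xa5bcd\xc3\xa9")  # UTF-8 bytes for u'åbcdé'

# Both sides are now text, so the membership test behaves as expected.
already_friend = follower in friends
```

If the names are kept as Unicode throughout, the `.encode("utf-8")` call in the download loop would probably not be needed at all; encoding can then happen only when writing the files.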

abarnert

If you want to find the elements in followers that are not in friends, use sets:

followers = ["foo","bar","foobar"]
friends = ["foo","foo1","bar1"]
print(set(followers).difference(friends))

set(['foobar', 'bar'])
Padraic Cunningham
  • Great for efficiency, but probably not the cause of the issue OP is having? – Marius Oct 13 '14 at 23:09
  • @Marius, without seeing more code it is impossible to tell but if the OP has two lists then the above code should work 100 percent. – Padraic Cunningham Oct 13 '14 at 23:12
  • @PadraicCunningham: And if the OP has two lists, his own code should also work 100 percent. The sets are more efficient, and simpler, but the overall logic is the same, so the result can't be any different, so this can't be the problem. – abarnert Oct 13 '14 at 23:16
  • @abarnert, well there is nothing in the code to diagnose what could be wrong so the OP may as well know how to efficiently compare two lists – Padraic Cunningham Oct 13 '14 at 23:19
  • @PadraicCunningham: A comment on how to improve his code is not an answer, it's a comment. – abarnert Oct 13 '14 at 23:20
  • @abarnert, I am not going to post a complete example as a comment, comments are for commenting not lines of code. I could argue that your answer is based on a guessing game as there is not enough detail provided to actually know what the problem is – Padraic Cunningham Oct 13 '14 at 23:22
  • @PadraicCunningham: My answer is an attempt to answer the question and solve the problem. If I've guessed wrong, it may be the wrong answer. But your answer is not an answer at all; there's no way it could possibly solve the problem. – abarnert Oct 13 '14 at 23:26

There's another problem in your code, in addition to the Unicode problem.

  • followers_old is the followers from the previous run.
  • followers is only the new followers that weren't there last time.
  • friends_old is the friends from the previous run.
  • friends is only the new friends that weren't there last time.

So, when you do this:

for fol in followers:
    if fol not in friends:
        twitter.create_friendship(fol)

You're only skipping new followers who are also new friends. But you also need to skip new followers who are already old friends. In other words, you need:

    if fol not in friends and fol not in friends_old:

Or, better, you need a single list that combines the two. Even better if it's a set rather than a list, and even better if you just iterate over the set difference instead of checking each one by one:

for fol in set(followers) - set(friends) - set(friends_old):
    twitter.create_friendship(fol)
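
The same idea as a small, self-contained sketch with the Twitter call left out. The function name `names_to_follow` and the sample data are made up for illustration:

```python
def names_to_follow(followers, friends, friends_old):
    """Return followers who are neither new friends nor old friends."""
    return set(followers) - set(friends) - set(friends_old)


followers = ["alice", "bob", "carol"]  # new followers from this run
friends = ["bob"]                      # new friends from this run
friends_old = ["alice"]                # friends saved by the previous run

# Only "carol" is left: "bob" is a new friend, "alice" an old one.
pending = names_to_follow(followers, friends, friends_old)
```

Each name in `pending` could then be passed to `twitter.create_friendship` as in the loop above. This only works if all three collections hold the same string type, which is the issue covered in the other answer.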
abarnert