There are a lot of posts about parsing Twitter JSON, but none that I have seen solved my problem.

This is my code:

import json

file = open('tweet', 'r')
tweet = file.read()
#{"geo":null,"text":"Lmao!! what time? I dont finish evening cleaning till 5 RT \u201c@some_user: football anyone?.....i wanna have a kickabout :(\u201d"}
#{"geo":null,"text":"Lmao!! what time? I dont finish evening cleaning till 5 RT @some_user: football anyone?.....i wanna have a kickabout :("}
def parseStreamingTweet(tweet):
    try:
        singleTweetJson = json.loads(tweet)
        for index in singleTweetJson:
            if index == 'text':
                print "text : ", singleTweetJson[index]
    except ValueError as e:
        print "Error ", tweet
        # Print the exception instance, not the ValueError class itself
        print e
        return

parseStreamingTweet(tweet)

This is a test program. Tweets arrive in a stream, so for testing I saved a single tweet to a file and parsed that. The two comments in the code show an edited part of the Twitter feed.

Can anyone tell me how to parse tweets that contain Unicode escapes? The first tweet in the comments contains Unicode escapes (\u201c and \u201d) and the second one does not. The first one produces an error; once I remove the Unicode escapes, parsing succeeds. What is the solution?

Curiousity

1 Answer

I think your code works. The error is probably a UnicodeEncodeError, raised when you try to print the unicode value to the terminal; I'm guessing you are running the script in a terminal that is not Unicode-aware. If you instead printed the repr of the unicode value (or wrote it to an output file), it would probably work:

print "text : ", repr(singleTweetJson[index])
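
As a rough illustration of the same idea, here is a minimal sketch written in Python 3 syntax (the question itself uses Python 2). The sample string is a shortened version of the tweet from the question, and `to_ascii_safe` is a hypothetical helper name, not part of any library. It shows that `json.loads` decodes the `\u201c` escapes without trouble, and that the decoded text can be turned into an ASCII-only form before printing:

```python
import json

# Shortened sample based on the tweet in the question; the doubled
# backslashes keep the \u escapes literal so that json.loads decodes them.
raw = '{"geo":null,"text":"RT \\u201c@some_user: football anyone?\\u201d"}'

# Parsing succeeds: the escapes become real curly-quote characters.
text = json.loads(raw)["text"]


def to_ascii_safe(value):
    """Hypothetical helper: replace non-ASCII characters with backslash escapes."""
    return value.encode("ascii", "backslashreplace").decode("ascii")


# Safe to print even on a terminal that cannot display curly quotes.
print("text :", to_ascii_safe(text))
```

Unlike `repr`, this keeps the output free of surrounding quote characters, which may be preferable when the text is written to a log.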

Also, it's generally bad practice to hide specific exceptions and error messages behind generic catch-all ones.
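
One way this can bite here: in Python, `UnicodeEncodeError` is a subclass of `ValueError`, so an `except ValueError` wrapped around both the parse and the print will also catch an encoding failure and make it look like a JSON error. The sketch below (Python 3 syntax, with hypothetical function names `parse_tweet_text` and `encode_failure_name`) keeps the `try` block around the parse only, and demonstrates the subclass relationship:

```python
import json


def parse_tweet_text(raw):
    """Return the 'text' field of a tweet, or None if raw is not valid JSON."""
    try:
        tweet = json.loads(raw)
    except ValueError as exc:
        # Only JSON decoding happens inside the try block, so this
        # message really does mean the input could not be parsed.
        print("Invalid JSON:", exc)
        return None
    return tweet.get("text")


def encode_failure_name(text):
    """Return the name of the exception raised by ASCII encoding, or None."""
    try:
        text.encode("ascii")
    except ValueError as exc:
        # A failed encode lands here too, because UnicodeEncodeError
        # derives from ValueError.
        return type(exc).__name__
    return None


print(parse_tweet_text('{"text": "hi"}'))
print(encode_failure_name("\u201c"))
```

With the two concerns separated, an output problem surfaces as its own exception instead of being reported as a parse failure.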

Preet Kukreti
  • Thanks, this worked! However, I was printing in a Unicode-capable terminal (I am using the NetBeans IDE, which supports Unicode), and the message was printed from the exception handler rather than from the 'text' field. Anyway, it worked, thanks! – Curiousity Mar 10 '12 at 16:15