I'm trying to create a nested dictionary with the following format:

{person1:
         {tweet1 that person1 wrote: times that tweet was retweeted,
          tweet2 that person1 wrote: times that tweet was retweeted},
 person2:
         {tweet1 that person2 wrote: times that tweet was retweeted, ...},
 ...
 }

I'm trying to create it from the following data structures. The following are truncated versions of the real ones.

 rt_sources =[u'SaleskyKATU', u'johnfaye', u'@anisabartes']
 retweets = [[], 
  [u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT',u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT'], []]
 annotated_retweets = {u'Stay safe #nyc #sandy http://t.co/TisObxxT':26}
 ''' 
     Key is a tweet from set(retweets) 
     Value is the frequency of each key in retweets
 '''

 for_Nick = {person:dict(tweet_record,[annotated_tweets[tr] for tr in tweet_record]) 
                                    for person,tweet_record in zip(rt_sources,retweets)}

Neither this SO question nor this one seems to apply.

mac389
  • Please give actual example data and actual desired output. – Janne Karila Nov 27 '12 at 13:08
  • Why the downvote without a suggestion as to how I can improve the question? – mac389 Nov 27 '12 at 14:19
  • @JanneKarila Thanks. I edited my answer to fix the `SyntaxError`. – mac389 Nov 27 '12 at 14:27
  • What happens with the current code? Does it raise an exception? If so, please include a traceback. If not, what is happening instead? Since you haven't provided any example data, it's impossible to test your code. – Blckknght Nov 27 '12 at 14:39

3 Answers

It seems that a "person" and a "tweet" are each going to be objects with their own data and functions. You can capture that association by wrapping each one in a class. For example:

class tweet(object):
    def __init__(self, text):
        self.text = text
        self.retweets = 0
    def retweet(self):
        self.retweets += 1
    def __repr__(self):
        return "(%i)" % (self.retweets)
    def __hash__(self):
        return hash(self.text)

class person(object):
    def __init__(self, name):
        self.name = name
        self.tweets = dict()

    def __repr__(self):
        return "%s : %s" % (self.name, self.tweets)

    def new_tweet(self, text):
        self.tweets[text] = tweet(text)

    def retweet(self, text):
        self.tweets[text].retweet()

M = person("mac389")
M.new_tweet('foo')
M.new_tweet('bar')
M.retweet('foo')
M.retweet('foo')

print(M)

Would give:

mac389 : {'foo': (2), 'bar': (0)}

The advantage here is twofold. First, new data associated with a person or tweet can be added in an obvious and logical way. Second, you've created a clean interface (even if you're the only one using it!) that will make life easier in the long run.
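
As a rough sketch of how these classes might be fed from the question's lists: the two classes are repeated so the block runs on its own, and the loop, the `people` dictionary, and the shortened sample data are assumptions added for illustration. Note that this counts how often a tweet appears in each person's list rather than looking the count up in `annotated_retweets`.

```python
class tweet(object):
    def __init__(self, text):
        self.text = text
        self.retweets = 0

    def retweet(self):
        self.retweets += 1

    def __repr__(self):
        return "(%i)" % self.retweets


class person(object):
    def __init__(self, name):
        self.name = name
        self.tweets = dict()

    def __repr__(self):
        return "%s : %s" % (self.name, self.tweets)

    def new_tweet(self, text):
        self.tweets[text] = tweet(text)

    def retweet(self, text):
        self.tweets[text].retweet()


# Shortened stand-ins for the question's rt_sources and retweets (assumed data).
rt_sources = ['SaleskyKATU', 'johnfaye', '@anisabartes']
text = 'Stay safe #nyc #sandy http://t.co/TisObxxT'
retweets = [[], [text, text, text], []]

people = {}  # hypothetical registry: name -> person
for name, tweet_record in zip(rt_sources, retweets):
    p = people.setdefault(name, person(name))
    for t in tweet_record:
        if t not in p.tweets:  # first sighting: register the tweet
            p.new_tweet(t)
        p.retweet(t)           # every occurrence counts as one retweet

print(people['johnfaye'])
```

People with an empty list still get a `person` entry, just with no tweets attached.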

Hooked

Explicit is better than implicit, as the Zen of Python says.

for_Nick = {}
for person, tweets in zip(rt_sources, retweets):
    if person not in for_Nick:
        for_Nick[person] = {}
        for tweet in set(tweets):
            frequency = annotated_retweets[tweet]
            for_Nick[person][tweet] = frequency
    else:  # Somehow person already in dictionary <-- Shouldn't happen
        for tweet in tweets:
            if tweet in for_Nick[person]:
                current_frequency = for_Nick[person][tweet]
                incoming_frequency = annotated_retweets[tweet]
                for_Nick[person][tweet] = current_frequency + incoming_frequency
            else:  # Person is already there but he said something new
                frequency = annotated_retweets[tweet]
                for_Nick[person][tweet] = frequency

Perhaps there are more elegant forms though.
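
One possibly more compact form, offered only as a sketch: `collections.defaultdict` can create the inner dictionary on first access, which removes the explicit "is this person already there" branch. The sample data is shortened, and the sketch de-duplicates tweets in every case, which is an assumption about the intent rather than an exact copy of the loop above.

```python
from collections import defaultdict

# Shortened stand-ins for the question's data (assumed for illustration).
rt_sources = ['SaleskyKATU', 'johnfaye', '@anisabartes']
text = 'Stay safe #nyc #sandy http://t.co/TisObxxT'
retweets = [[], [text, text, text], []]
annotated_retweets = {text: 26}

for_Nick = defaultdict(dict)
for person, tweets in zip(rt_sources, retweets):
    per_person = for_Nick[person]  # creates an empty dict on first access
    for tweet in set(tweets):
        # Add to any count already stored for this person and tweet.
        per_person[tweet] = per_person.get(tweet, 0) + annotated_retweets[tweet]

for_Nick = dict(for_Nick)  # back to a plain dict once it is built
```

Because `for_Nick[person]` is touched for every person, people with no retweets still appear with an empty inner dictionary.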

mac389

This might be the dict comprehension you were trying to construct:

for_Nick = {person: 
               {tr: annotated_retweets[tr]
                for tr in set(tweet_record)} 
            for person, tweet_record in zip(rt_sources,retweets)}

You tried to pass a list of keys and a list of values to the dict constructor, which instead expects a list (or other iterable) of key-value pairs.
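
To make the difference concrete, here is a small sketch with made-up `keys` and `values` lists (not the question's data). Passing the two lists as separate arguments fails, while pairing them up with `zip` first, or using a dict comprehension, gives the mapping:

```python
keys = ['a', 'b', 'c']
values = [1, 2, 3]

# Two positional arguments: dict() rejects this with a TypeError.
error = None
try:
    dict(keys, values)
except TypeError as exc:
    error = exc

# One iterable of (key, value) pairs: this is what dict() expects.
paired = dict(zip(keys, values))

# The equivalent dict comprehension.
same = {k: v for k, v in zip(keys, values)}
```

Both `paired` and `same` hold the same mapping; the comprehension form is the one that nests naturally, as in the answer above.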

Janne Karila