I have a script (running Python 3.5) that uses the Google API and Newspaper. It searches Google for articles about sleep, and then I iterate over the resulting URLs with Newspaper. All I ask Newspaper to do is return the list of keywords for each article, which I get from article.keywords.

from newspaper import Article
import google  # module providing the google.search() call used below

for url in google.search('sleep', num=2, stop=1):
    article = Article(url)
    article.download()
    article.parse()
    article.nlp()
    print(article.keywords)

The keywords that are returned (for a given article) look like this:

['education', 'nights', 'start', 'pill', 'supplement', 'research', 'national', 'sleep', 'sleeping', 'trouble', 'using', 'taking']

But I want to create a dictionary containing ALL of the keywords for all the results, that is, the keywords for each article being iterated over. How would I do that?

Vadim Kotov

2 Answers

Assuming the dictionary key should be the article URL:

keywords = {}
for url in google.search('sleep', num=2, stop=1):
    article = Article(url)      
    article.download() 
    article.parse()
    article.nlp()  

    keywords[url] = article.keywords

print(keywords)

Or, if you want to have a list of all the keywords from all the articles:

keywords = []
for url in google.search('sleep', num=2, stop=1):
    article = Article(url)      
    article.download() 
    article.parse()
    article.nlp()  

    keywords += article.keywords

print(keywords)

alecxe

To avoid inserting the same keyword more than once (almost the same as the other answer), check for membership before appending:

keywords = []
for url in google.search('sleep', num=2, stop=1):
    article = Article(url)
    article.download()
    article.parse()
    article.nlp()
    for kw in article.keywords:
        if kw not in keywords:
            keywords.append(kw)

Or better yet, use a set instead of a list.
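
A minimal sketch of the set version, kept separate from the download loop so it runs on its own. The helper name `collect_unique_keywords` and the sample lists are hypothetical; in the real script each inner list would be one `article.keywords`.

```python
def collect_unique_keywords(keyword_lists):
    """Merge several keyword lists into one set; duplicates collapse automatically."""
    unique = set()
    for kws in keyword_lists:
        # update() adds every item of the list, like calling add() once per keyword
        unique.update(kws)
    return unique


# Hypothetical sample data standing in for article.keywords from two articles
sample = [['sleep', 'pill', 'research'], ['sleep', 'nights', 'research']]
keywords = collect_unique_keywords(sample)
print(sorted(keywords))  # sorted only to get a stable display order
```

Inside the original loop, the equivalent would be to start with `keywords = set()` and call `keywords.update(article.keywords)` for each article.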

Sci Prog
  • What's the benefit of a set over a list? –  Feb 24 '16 at 17:35
  • With a set, you do not have to check whether the element is already present: you can call the `add()` method with the same element multiple times. The disadvantage is that the order of the elements is arbitrary (i.e. not the order in which they were added), unlike a list. – Sci Prog Feb 25 '16 at 00:10
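
If the order of first appearance matters, one common pattern is to keep a list for the order and a set for the membership test. This is only a sketch: the helper name `dedupe_keep_order` and the sample lists are hypothetical, standing in for the `article.keywords` lists from the question.

```python
def dedupe_keep_order(keyword_lists):
    """Return each keyword once, in the order it was first seen."""
    seen = set()    # fast membership test
    ordered = []    # preserves first-seen order
    for kws in keyword_lists:
        for kw in kws:
            if kw not in seen:
                seen.add(kw)
                ordered.append(kw)
    return ordered


# Hypothetical sample data standing in for article.keywords from two articles
sample = [['sleep', 'pill', 'research'], ['sleep', 'nights', 'research']]
print(dedupe_keep_order(sample))  # ['sleep', 'pill', 'research', 'nights']
```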