When I run
import newspaper
paper = newspaper.build('http://cnn.com', memoize_articles=False)
print(len(paper.articles))
I see that newspaper found 902 articles on http://cnn.com, which seems like very few to me, considering that CNN publishes many articles per day and has been publishing articles online for many years. Are these really all the articles there are on http://cnn.com? If not, is there any way I can find the URLs of the rest of the articles too?