
I am a newbie when it comes to R, but I would really like to know how to scrape multiple tables (whose dimensions I don't know in advance) from a site like:

https://en.wikipedia.org/wiki/World_population

Just to be specific, here is what the code looks like in Python:

from bs4 import BeautifulSoup
import urllib2

url1 = "https://en.wikipedia.org/wiki/World_population"
page = urllib2.urlopen(url1)
soup = BeautifulSoup(page)

# find the first sortable wikitable, then walk its rows and cells
table1 = soup.find("table", {'class': 'wikitable sortable'})

for row in table1.find_all('tr'):
    for column in row.find_all('td'):
        a = column.get_text().strip()
        print a
KingMaker
    Welcome to SO! There are _scores_ of examples for this on SO for R. This is likely to get closed as a dup or "too broad" unless you have some R code to show that is not working. – hrbrmstr Oct 24 '15 at 09:52

1 Answer

In R,

library(XML)

u <- "https://en.wikipedia.org/wiki/World_population"  # page to scrape
b <- basename(u)          # local file name to save the page under
download.file(u, b)       # fetch the page into the working directory
L <- readHTMLTable(b)     # parse every <table> on the page into a data frame

L is now a list of the 29 tables in u, each as an R data frame.
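To sanity-check the result and pull out a particular table, you can inspect the list by size or position (a sketch — the index of the table you actually want may differ from the one shown):

```r
length(L)        # how many tables were parsed
sapply(L, ncol)  # column count of each table, useful for spotting the one you want
pop <- L[[1]]    # pick a table by position (index 1 here is just an assumption)
head(pop)        # first few rows of that data frame
```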

G. Grothendieck