I'm trying to scrape data from a Korean baseball league website, storing players' stats for each year.
http://www.koreabaseball.com/Record/Player/HitterDetail/Daily.aspx?playerId=79215 (The page is in Korean, but I only need the numbers in the table below, so that shouldn't matter.)
If I pick a year from the dropdown box at the upper right of the region showing the player's daily stats, the page automatically reloads with that year's data.
Here is what I've tried:
library(httr)
library(rvest)
url <- "http://www.koreabaseball.com/Record/Player/HitterDetail/Daily.aspx?playerId=76249"
baseball <- POST(url,
                 body = list("ctl00$ctl00$ctl00$cphContents$cphContents$cphContents$ddlYear" = "2017"),
                 encode = "form")
page_2017 <- read_html(content(baseball, as="text", encoding="UTF-8"))
table <- html_nodes(page_2017, "tbody > tr > td")
table_text <- html_text(table)
record <- as.data.frame(matrix(table_text, ncol = 17, byrow = TRUE))
The problem is that I always get the same 2017 data, shown below, even when I pass a different year in the POST body.
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16 V17
1 03.31 한화 0.333 3 0 1 0 0 0 1 0 0 0 0 2 0 0.333
2 04.01 한화 0.000 4 0 0 0 0 0 0 0 0 0 0 3 0 0.143
3 04.02 한화 0.333 6 0 2 0 0 0 1 0 0 0 0 1 0 0.231
4 04.04 kt 0.400 5 0 2 1 0 0 1 0 0 0 0 1 0 0.278
I would really appreciate any help with this. A general solution for pages driven by dropdown boxes would be best, but one specific to this site would also be welcome.
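One direction that may be worth checking, which I have not verified against this site: the `.aspx` URL suggests an ASP.NET WebForms page, and such pages often ignore a POST unless the hidden fields from the original page (typically `__VIEWSTATE` and `__EVENTVALIDATION`) are sent back together with the dropdown value. The sketch below only shows how such a request body could be assembled. The helper names `collect_hidden_fields` and `build_postback_body` are hypothetical, the demo HTML is made up, and the short name `ddlYear` stands in for the full control name.

```r
library(rvest)

# Collect every hidden <input> on a page as a named list (name -> value).
# On ASP.NET WebForms pages this usually includes __VIEWSTATE and
# __EVENTVALIDATION (an assumption about this site, not a confirmed fact).
collect_hidden_fields <- function(page) {
  inputs <- html_nodes(page, "input[type='hidden']")
  fields <- as.list(html_attr(inputs, "value", default = ""))
  names(fields) <- html_attr(inputs, "name")
  # Drop inputs that have no name attribute.
  fields[!is.na(names(fields))]
}

# Build a postback body: the hidden fields plus the dropdown selection.
# __EVENTTARGET names the control that triggered the postback.
build_postback_body <- function(page, dropdown_name, value) {
  body <- collect_hidden_fields(page)
  body[["__EVENTTARGET"]] <- dropdown_name
  body[[dropdown_name]] <- value
  body
}

# Offline demonstration on a made-up page (not the real site's HTML).
demo_page <- read_html('<form>
  <input type="hidden" name="__VIEWSTATE" value="abc123">
  <input type="hidden" name="__EVENTVALIDATION" value="xyz789">
  <select name="ddlYear"><option value="2016">2016</option></select>
</form>')
demo_body <- build_postback_body(demo_page, "ddlYear", "2016")
str(demo_body)
```

Against the real page, the idea would be to first fetch it with `GET(url)`, parse that response with `read_html()`, and then call `POST(url, body = build_postback_body(page, "ctl00$ctl00$ctl00$cphContents$cphContents$cphContents$ddlYear", "2016"), encode = "form")`. Whether the server accepts this depends on details I have not confirmed, such as cookies or additional required fields.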