I have the following dataframe:
library(rvest)
library(XML)
library(tidyr)
library(zoo)
library(chron)
library(lubridate)
library(stringr)
page.201702050atl = read_html("http://www.pro-football-reference.com/boxscores/201702050atl.htm")
comments.201702050atl = page.201702050atl %>% html_nodes(xpath = "//comment()")
pbp.201702050atl = comments.201702050atl[45] %>% html_text() %>% read_html() %>% html_node("#pbp") %>% html_table()
colnames(pbp.201702050atl) = c('Quarter', 'Time', 'Down', 'ToGo', 'Location', 'Detail', 'Away.Score', 'Home.Score', 'EPB', 'EPA', 'Win.pct')
pbp.201702050atl.a = pbp.201702050atl[-union(which(pbp.201702050atl$Quarter == '1st Quarter'), which(pbp.201702050atl$Quarter == 'Quarter')), ]
pbp.201702050atl.b = pbp.201702050atl.a[-union(which(pbp.201702050atl.a$Quarter == '2nd Quarter'), which(pbp.201702050atl.a$Quarter == '3rd Quarter')), ]
pbp.201702050atl.c = pbp.201702050atl.b[-union(which(pbp.201702050atl.b$Quarter == '4th Quarter'), which(pbp.201702050atl.b$Quarter == 'Overtime')), ]
pbp.201702050atl.d = pbp.201702050atl.c[-which(pbp.201702050atl.c$Quarter == 'End of Overtime'), ]
I would like to make a new dataframe that splits pbp.201702050atl.d$Location into two columns so that the character elements compose one and the numeric elements compose the other, like so:
V1 V2
1 "ATL" "35"
2 "NWE" "25"
3 "NWE" "34"
4 "NWE" "34"
5 "NWE" "34"
6 "NWE" "34"
7 "ATL" "34"
8 "ATL" "34"
9 "ATL" "34"
10 "" "50"
...
To do this, I've written:
Location.201702050atl = as.data.frame(str_split_fixed(as.character(pbp.201702050atl.d$Location), boundary("word"), n = 2))
While close to what I desire, this function results in:
V1 V2
1 "ATL" "35"
2 "NWE" "25"
3 "NWE" "34"
4 "NWE" "34"
5 "NWE" "34"
6 "NWE" "34"
7 "ATL" "34"
8 "ATL" "34"
9 "ATL" "34"
10 "50" ""
...
Notice Location.201702050atl[10,]. This function only places characters in Location.201702050atl$V2 if, for that row, the original column consists of two sets of characters seperated by a space. Instead, I would like to place similar (text) characters in Location.201702050atl$V1 and similar (numeric) characters in Location.201702050atl$V2. How do I split the elements of one column according to the natural format of its characters when the entire column, practically, has to be formatted identically, regardles of the natural format of its composing characters? Your help is much appreciated, thank you.