I am reading a data frame from an online CSV file, but the person who created the file accidentally entered some numbers into a column that should contain only city names. Sample of the cities.data table:
City        Population  Foo   Bar
Seattle     10          foo1  bar1
98125       20          foo2  bar2
Kent 98042  30          foo3  bar3
98042 Kent  30          foo4  bar4
Desired output after removing rows with only numbers in the city column:
City        Population  Foo   Bar
Seattle     10          foo1  bar1
Kent 98042  30          foo3  bar3
98042 Kent  30          foo4  bar4
I want to remove the rows with ONLY numbers in the City column. "Kent 98042" and "98042 Kent" are both okay since they contain the city name, but since 98125 is not a city, that row should be removed.
I can't use is.numeric because the numbers are read as strings from the CSV file. I tried using a regex:
cities.data <- cities.data[!grepl("[0-9]+", cities.data$City), ]
But this deletes rows with any numbers rather than just the ones containing only numbers, e.g.
City        Population  Foo   Bar
Seattle     10          foo1  bar1
"Kent 98042" was deleted even though I wanted to keep that row.
Suggestions? Please and thanks!
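One possible approach, offered as a sketch rather than a definitive answer: anchor the pattern with ^ and $ so that it matches only values made up entirely of digits, then keep the rows that do not match. The data frame below is reconstructed from the sample above and assumes the column is named City and is read as character; the names only.digits and cities.clean are hypothetical, chosen for illustration.

```r
# Sample data reconstructed from the question (assumed column names and types).
# stringsAsFactors = FALSE keeps City as a character vector.
cities.data <- data.frame(
  City = c("Seattle", "98125", "Kent 98042", "98042 Kent"),
  Population = c(10, 20, 30, 30),
  Foo = c("foo1", "foo2", "foo3", "foo4"),
  Bar = c("bar1", "bar2", "bar3", "bar4"),
  stringsAsFactors = FALSE
)

# ^ and $ anchor the pattern to the whole string, so only values
# consisting entirely of digits match ("98125"), not "Kent 98042".
only.digits <- grepl("^[0-9]+$", cities.data$City)

# Keep the rows that are NOT digits-only. The trailing comma makes this
# a row selection rather than a column selection.
cities.clean <- cities.data[!only.digits, ]
print(cities.clean)
```

The unanchored pattern "[0-9]+" matches any value that contains a digit anywhere, which is why "Kent 98042" was dropped. If the City values might carry leading or trailing whitespace, applying trimws() to the column before matching may help.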