
My data is structured as:

df <- data.frame(Athlete = c('02 Paul Jones', '02 Paul Jones', '02 Paul Jones', '02 Paul Jones',
                             '02 Paul Jones', '02 Paul Jones', '02 Paul Jones', '02 Paul Jones',
                             '01 Joe Smith', '01 Joe Smith', '01 Joe Smith', '01 Joe Smith',
                             '01 Joe Smith', '01 Joe Smith', '01 Joe Smith', '01 Joe Smith'),
                 Period = c('P1', 'P1', 'P1', 'P1',
                            'P2', 'P2', 'P2', 'P2',
                            'P1', 'P1', 'P1', 'P1',
                            'P2', 'P2', 'P2', 'P2'))
# Make `Athlete` column a character
df$Athlete <- as.character(df$Athlete)

How do I extract the first and last name of each athlete while keeping the space between them? I do not want the leading space included either. For example, "Paul Jones", not " Paul Jones".

user2716568

2 Answers


Remove everything except letters ([:alpha:]) and whitespace ([:space:]) using POSIX character classes, then strip the leading whitespace that remains.

df$Athlete <- as.character(df$Athlete)  # convert factor to character

df$Athlete <- gsub("[^[:alpha:][:space:]]", '', df$Athlete) 
df$Athlete <- gsub("^[[:space:]]+", '', df$Athlete )  # removing leading spaces

head(df)
#      Athlete Period
# 1 Paul Jones     P1
# 2 Paul Jones     P1
# 3 Paul Jones     P1
# 4 Paul Jones     P1
# 5 Paul Jones     P2
# 6 Paul Jones     P2
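
As a quick check, here is a minimal sketch of the same two steps applied to a small vector, followed by a conversion to a factor to confirm that the levels carry no leading space. The vector `athlete` and the names `cleaned` and `athlete_levels` are made up for illustration and are not part of the original data frame.

```r
# Hypothetical sample vector mirroring the question's Athlete column
athlete <- c("02 Paul Jones", "01 Joe Smith", "02 Paul Jones")

# Step 1: drop every character that is not a letter or whitespace
cleaned <- gsub("[^[:alpha:][:space:]]", "", athlete)

# Step 2: strip the whitespace left behind at the start of each string
cleaned <- gsub("^[[:space:]]+", "", cleaned)

# Convert to a factor and inspect the levels (sorted alphabetically)
athlete_levels <- levels(factor(cleaned))
print(athlete_levels)
```

Note that the first pattern deletes non-letter characters anywhere in the string, so a name containing an apostrophe or hyphen (for example "O'Neil") would lose that character as well.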
Sathish
  • If I convert this back to a factor, the levels still show a blank space: `> levels(df$Athlete) [1] " Joe Smith" " Paul Jones"` – user2716568 Mar 22 '17 at 07:26

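We can use sub to match one or more digits ([0-9]+) followed by one or more whitespace characters (\\s+) at the start (^) of the string and replace the match with "". Because the pattern is anchored at the start, the rest of the name is left untouched.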

df$Athlete <- sub("^[0-9]+\\s+", "", df$Athlete)
df
#      Athlete Period
#1  Paul Jones     P1
#2  Paul Jones     P1
#3  Paul Jones     P1
#4  Paul Jones     P1
#5  Paul Jones     P2
#6  Paul Jones     P2
#7  Paul Jones     P2
#8  Paul Jones     P2
#9   Joe Smith     P1
#10  Joe Smith     P1
#11  Joe Smith     P1
#12  Joe Smith     P1
#13  Joe Smith     P2
#14  Joe Smith     P2
#15  Joe Smith     P2
#16  Joe Smith     P2
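
To illustrate the effect of the anchor, here is a small sketch on a made-up vector (the names `athlete` and `stripped` and the extra sample values are assumptions, not part of the question's data). Only the numeric prefix at the start is removed; punctuation and digits elsewhere in the string are kept.

```r
# Hypothetical sample values, including a name with an apostrophe
# and a string that has no numeric prefix at all
athlete <- c("02 Paul Jones", "01 Joe Smith", "10 Mary O'Neil", "Ann Lee 3")

# Remove leading digits plus the whitespace that follows them
stripped <- sub("^[0-9]+\\s+", "", athlete)
print(stripped)
```

Strings that do not start with digits followed by whitespace, such as the last element, pass through unchanged.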
akrun