1

I have a data frame with firstname and lastname columns, and I want to permute them, but ONLY for the rows that have values. There are many empty fields, and I don't want to reorder them in a way that ever leaves a firstname value without a lastname value. Example:

number <- c(1, 2, 3, 4, 5)
firstname <- c('', 'Eddie', 'Edward', '', 'Eduardo')
lastname <- c('', 'Vedder', 'Van Halen', '', 'Norton')
permtest <- data.frame(number, firstname, lastname)
permtest
  number firstname  lastname
1      1                    
2      2     Eddie    Vedder
3      3    Edward Van Halen
4      4                    
5      5   Eduardo    Norton

Expected results would be:

  • Eddie Norton
  • Edward Vedder
  • Eduardo Van Halen

But not:

  • Eddie _____

or:

  • ______ Van Halen

I tried the transform function but it didn't work:

permtest2 <- transform( permtest2, firstname = sample(firstname,lastname) )

3 Answers

1

What helps is shuffling only the nonempty entries:

permtest$lastname[permtest$lastname != ''] <- sample(permtest$lastname[permtest$lastname != ''])
permtest
#   number firstname  lastname
# 1      1                    
# 2      2     Eddie Van Halen
# 3      3    Edward    Vedder
# 4      4                    
# 5      5   Eduardo    Norton
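
A slightly more defensive variant of the same idea, as a sketch: it computes the set of filled rows once, also treats NA as empty, and shuffles by position. The names `filled` and `idx` and the `set.seed()` call are assumptions for illustration, not part of the one-liner above. (The `transform()` attempt in the question fails because the second argument of `sample()` is the sample size, not a second vector.)

```r
# Rebuild the example data from the question.
permtest <- data.frame(
  number    = 1:5,
  firstname = c("", "Eddie", "Edward", "", "Eduardo"),
  lastname  = c("", "Vedder", "Van Halen", "", "Norton"),
  stringsAsFactors = FALSE
)

# Rows where both names are present (NA is treated as empty).
filled <- !is.na(permtest$firstname) & permtest$firstname != "" &
          !is.na(permtest$lastname)  & permtest$lastname  != ""
idx <- which(filled)

set.seed(1)  # optional, only to make the shuffle reproducible

# Permute the positions of the filled rows, then write the
# last names back in that new order. Blank rows are never touched.
permtest$lastname[idx] <- permtest$lastname[idx[sample.int(length(idx))]]
permtest
```

Rows 1 and 4 stay blank, and every first name still has a last name. Note that a random permutation can leave some or all rows in their original position.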
Julius Vainora
0

One of many ways to do this:

permtest[permtest == ''] = NA

library(dplyr)

permtest %>% 
  dplyr::filter(!is.na(lastname), !is.na(firstname)) %>% 
  dplyr::mutate(val = paste(firstname, lastname)) %>% 
  dplyr::pull(val)

[1] "Eddie Vedder"     "Edward Van Halen" "Eduardo Norton"  
astrofunkswag
  • I'm sure this is a valid answer too, but I received this (probably unrelated) error: Error: 'pull' is not an exported object from 'namespace:dplyr' – Jerid C. Fortney Dec 12 '18 at 14:51
  • Did you mean `dat[is.na(dat)] = ''`? That's what I used along with Julius' code (above) after getting an error and, together, it seems to work. – Jerid C. Fortney Dec 12 '18 at 16:14
  • `pull` is definitely a [dplyr function](https://www.rdocumentation.org/packages/dplyr/versions/0.7.8/topics/pull). I'm not sure why that error would be occurring. – astrofunkswag Dec 12 '18 at 18:38
  • That first line of code converts `''` to `NA` so I can filter it out with the `!is.na` command. The comment you added does the reverse and converts `NA` to `''`, but you don't initially have any `NA` in your data frame, so that won't do anything. – astrofunkswag Dec 12 '18 at 18:44
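
For anyone hitting the same `pull` error: `pull()` appears to have been added in dplyr 0.7.0, so an older installed version is one possible cause. As a sketch, the same filtered output can be produced in base R with no package dependency; `keep` and `full_names` are names made up for this example.

```r
# Rebuild the example data from the question.
permtest <- data.frame(
  number    = 1:5,
  firstname = c("", "Eddie", "Edward", "", "Eduardo"),
  lastname  = c("", "Vedder", "Van Halen", "", "Norton"),
  stringsAsFactors = FALSE
)

# Keep only rows where both names are non-empty (NA counts as empty).
keep <- !is.na(permtest$firstname) & permtest$firstname != "" &
        !is.na(permtest$lastname)  & permtest$lastname  != ""

# Paste the surviving first and last names together, row by row.
full_names <- paste(permtest$firstname[keep], permtest$lastname[keep])
full_names
```

Like the dplyr pipeline above, this only lists the existing name pairs; it does not shuffle them.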
0

Using the tidyverse, you could do:

library(tidyverse)
library(stringr)

permtest2 <- permtest %>%
  mutate(Nfname = str_length(firstname)) %>%
  filter(Nfname > 0) %>%
  mutate(lastname = sample(lastname, size = length(lastname))) %>%
  select(-Nfname)
Derek Corcoran
  • I'm new at R, but I feel like this changes the dimensions of permtest2. I still need it to be 5 rows in length (keep the rows with blanks). – Jerid C. Fortney Dec 12 '18 at 13:54
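
One way to keep all five rows in a dplyr pipeline, as a sketch: instead of filtering, flag the filled rows and shuffle the last names only within them. The helper column `filled` and the seed are assumptions for illustration; `replace()` is base R.

```r
library(dplyr)

# Rebuild the example data from the question.
permtest <- data.frame(
  number    = 1:5,
  firstname = c("", "Eddie", "Edward", "", "Eduardo"),
  lastname  = c("", "Vedder", "Van Halen", "", "Norton"),
  stringsAsFactors = FALSE
)

set.seed(123)  # optional, only to make the shuffle reproducible

permtest2 <- permtest %>%
  mutate(
    # TRUE for rows that have both a first and a last name
    filled   = firstname != "" & lastname != "",
    # overwrite only the filled positions with a shuffled copy of themselves
    lastname = replace(lastname, filled, sample(lastname[filled]))
  ) %>%
  select(-filled)

permtest2
```

Because no rows are filtered out, `permtest2` has the same dimensions as `permtest`, and the blank rows stay where they were.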