When I use tidyr's gather() function to reshape my data frame, I lose the row names of the original data frame.

This is the output of my RStudio console:

> DF <- as.data.frame((freethrows/Games), row.names = rownames(Games), col.names = colnames(Games))
> head(DF)
                   2005     2006     2007     2008     2009     2010     2011     2012     2013     2014
KobeBryant     8.700000 8.662338 7.597561 5.890244 6.013699 5.890244 6.568966 6.730769 3.000000 5.600000
JoeJohnson     3.182927 4.122807 3.853659 3.784810 2.894737 2.708333 2.633333 1.833333 2.012658 1.762500
LeBronJames    7.607595 6.269231 7.320000 7.333333 7.802632 6.367089 6.241935 5.302632 5.701299 5.434783
CarmeloAnthony 7.162500 7.061538 6.025974 5.621212 7.362319 6.584416 5.363636 6.343284 5.961039 4.725000
DwightHoward   4.341463 4.756098 6.451220 6.379747 5.890244 7.000000 5.203704 4.671053 4.915493 3.487805
ChrisBosh      6.771429 6.710145 7.044776 6.545455 6.714286 4.987013 4.017544 3.256757 2.822785 4.068182
> DF_gathered <- DF %>%
+   gather('2005', '2006', '2007', '2008', '2009', '2010', '2011', '2012', '2013', '2014', key = 'year', value = 'freeThrowsPerGame')
> head(DF_gathered)
  year freeThrowsPerGame
1 2005          8.700000
2 2005          3.182927
3 2005          7.607595
4 2005          7.162500
5 2005          4.341463
6 2005          6.771429
> 

After piping DF into gather(), I expected the row names to remain.

aynber
  • Please provide enough code so others can better understand or reproduce the problem. – Community Jan 25 '23 at 13:19
  • Turn the row names into a column with `rownames_to_column()` – M Aurélio Jan 25 '23 at 13:22
  • The tidyverse does not think row names are a good idea so most functions will ignore or drop them. The tidyverse believes strongly that all data should be in a proper column. If you don't agree, then I just wanted to warn you that you will find yourself fighting with these functions often. – MrFlick Jan 25 '23 at 14:34

2 Answers

1

Row names must be unique, so they cannot survive gather(): in the long format each player appears once per year. I therefore used rownames_to_column() before gather() to move the row names into the first column. For the same reason, column_to_rownames("player") cannot be applied after gather(), because the player values are no longer unique.
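A minimal sketch of that approach, using only two players and two seasons copied from the head(DF) output in the question. The column name "player" is an illustrative choice, and the small DF is rebuilt by hand here rather than from the original freethrows and Games objects, which are not shown.

```r
library(tibble)  # rownames_to_column()
library(tidyr)   # gather(); also provides %>%

# Two players and two seasons copied from the head(DF) output in the question
DF <- data.frame(
  `2005` = c(8.700000, 3.182927),
  `2006` = c(8.662338, 4.122807),
  row.names = c("KobeBryant", "JoeJohnson"),
  check.names = FALSE  # keep "2005" and "2006" as column names
)

DF_gathered <- DF %>%
  rownames_to_column("player") %>%  # row names become a regular column
  gather(key = "year", value = "freeThrowsPerGame", -player)

DF_gathered

# Each player now appears once per year, so "player" holds duplicates
# and can no longer serve as row names.
anyDuplicated(DF_gathered$player) > 0
```

The long result keeps the player information as an ordinary column, which is the form most tidyverse functions expect.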

Thanks, @m-aurélio.

0

gather() is superseded; use pivot_longer() instead. You also need to turn the row names into a column first, otherwise pivot_longer() will drop them (as you saw). Note that tibble::column_to_rownames("player") only works while the player values are unique, so it will fail on the long result; it becomes usable again only once the data has one row per player.

You can do:

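library(tidyr)  # pivot_longer(); also provides %>%

DF %>%
  tibble::rownames_to_column("player") %>%  # keep the row names as a column
  pivot_longer(-player, names_to = "year")  # every column except player goes long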

# A tibble: 60 × 3
   player     year  value
   <chr>      <chr> <dbl>
 1 KobeBryant 2005   8.7 
 2 KobeBryant 2006   8.66
 3 KobeBryant 2007   7.60
 4 KobeBryant 2008   5.89
 5 KobeBryant 2009   6.01
 6 KobeBryant 2010   5.89
 7 KobeBryant 2011   6.57
 8 KobeBryant 2012   6.73
 9 KobeBryant 2013   3   
10 KobeBryant 2014   5.6 
# … with 50 more rows
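If named value columns, a numeric year, or row names on a wide table are wanted, something like the following may work. It is a sketch on a two-player subset copied from the question's head(DF) output; the name "freeThrowsPerGame" is taken from the question, and names_transform assumes a reasonably recent tidyr (1.1.0 or later).

```r
library(tibble)  # rownames_to_column(), column_to_rownames()
library(tidyr)   # pivot_longer(), pivot_wider(); also provides %>%

# Two players and two seasons copied from the head(DF) output in the question
DF <- data.frame(
  `2005` = c(8.700000, 3.182927),
  `2006` = c(8.662338, 4.122807),
  row.names = c("KobeBryant", "JoeJohnson"),
  check.names = FALSE  # keep "2005" and "2006" as column names
)

long <- DF %>%
  rownames_to_column("player") %>%
  pivot_longer(
    -player,
    names_to = "year",
    values_to = "freeThrowsPerGame",
    names_transform = list(year = as.integer)  # store year as integer, not character
  )

# Reshaping back to one row per player makes "player" unique again,
# so it can be turned back into row names.
wide <- long %>%
  pivot_wider(names_from = year, values_from = freeThrowsPerGame) %>%
  column_to_rownames("player")

rownames(wide)
```

The round trip is only meant to show when column_to_rownames() is applicable; for most analysis and plotting the long form with a player column is the more convenient one.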
Maël