
When I use the summary() function in R, it gives the correct output for continuous numeric variables, but it does not give me frequency tables for character variables. My colleague runs the same code on the same data sets and gets the frequency tables. I have the latest version of R. What can I do to get summary() to work for me? (Note: it never works, no matter which data set I use, even data sets from R tutorials.) This is the output I get:

summary(chem)
#>   plantid            species             deerpr             forest         
#> Length:104         Length:104         Length:104         Length:104        
#> Class :character   Class :character   Class :character   Class :character  
#> Mode  :character   Mode  :character   Mode  :character   Mode  :character  

This is the output she gets:

summary(chem)
#>    plantid   species       deerpr          forest         
#> B1     : 1   fagr:102   higher:45   baldpate  :34       
#> B10    : 1              lower :57   curlis    :10          
#> B100   : 1                          eames     :13                      
#> B12    : 1                          herrontown:23                   
#> B13    : 1                          rosedale  :22                                            
#> B15    : 1                                                                                   
#> (Other):96                                                                                 
Edo
    Please add your code and an example dataset so we can try to reproduce your results. – pdw Nov 26 '20 at 16:00

2 Answers


I suspect the two of you are getting different results because, starting in R version 4.0.0, the default for stringsAsFactors changed to FALSE. As a result, when you read the data from a .csv or other text file, string columns are loaded as character vectors. Your colleague, in contrast, is probably running a version earlier than 4.0.0, where those columns are automatically converted to factors.

You can change this while reading in the data with stringsAsFactors = TRUE. For example:

data <- read.csv("data.csv", stringsAsFactors = TRUE)
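
To see the difference without the original file, here is a minimal self-contained sketch. The toy data frame and the temporary file are made up for illustration and are not the asker's chem data; the point is only to compare the two settings of stringsAsFactors side by side.

```r
# Hypothetical toy data, written to a temporary CSV so the example is self-contained
toy <- data.frame(
  species = c("fagr", "fagr", "fagr"),
  deerpr  = c("higher", "lower", "lower"),
  height  = c(1.2, 3.4, 2.2)
)
path <- tempfile(fileext = ".csv")
write.csv(toy, path, row.names = FALSE)

# Read the file both ways explicitly, so the result does not depend on the R version
as_chr <- read.csv(path, stringsAsFactors = FALSE)
as_fct <- read.csv(path, stringsAsFactors = TRUE)

class(as_chr$deerpr)  # "character"
class(as_fct$deerpr)  # "factor"

summary(as_chr)  # Length / Class / Mode for the string columns
summary(as_fct)  # counts per level for the string columns
```

Setting the argument explicitly, rather than relying on the default, should make the same script behave identically on both machines.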
Ian Campbell

If you do this:

chem[] <- lapply(chem, as.factor)
summary(chem)

you will get the same result as your colleague.

As someone suggested in the comments (for some reason I can no longer see that comment), you and your colleague are probably running two different R versions, or are otherwise reading the data in two different ways.

As a result, all of your colleague's strings are read as factors, while yours stay as character vectors.
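
A quick way to check both possibilities is to look at the running R version and at the class of each column. This is only a sketch: df_demo is a hypothetical stand-in, so substitute your own data frame (for example chem) when trying it.

```r
# Which R is running? stringsAsFactors defaults to FALSE from 4.0.0 onward
R.version.string
getRversion() >= "4.0.0"

# Hypothetical stand-in for the real data; replace df_demo with chem
df_demo <- data.frame(
  species = c("fagr", "fagr"),
  n       = c(1, 2),
  stringsAsFactors = FALSE
)

# First class of every column: "character" columns get no frequency table
col_classes <- vapply(df_demo, function(x) class(x)[1], character(1))
col_classes
```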

Just convert all character variables to factors.

You can also do it this way, in case you have other columns that you don't want to transform to factor:

chem <- purrr::modify_if(chem, is.character, as.factor)
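
If you would rather not depend on purrr, the same selective conversion can be sketched in base R. The data frame chem_demo below is a made-up stand-in for chem, with one numeric column added to show that it is left untouched.

```r
# Hypothetical stand-in for chem: two character columns and one numeric column
chem_demo <- data.frame(
  plantid = c("B1", "B10", "B100"),
  forest  = c("eames", "curlis", "eames"),
  value   = c(0.5, 1.5, 2.5),
  stringsAsFactors = FALSE
)

# Find the character columns and convert only those to factors
is_chr <- vapply(chem_demo, is.character, logical(1))
chem_demo[is_chr] <- lapply(chem_demo[is_chr], as.factor)

summary(chem_demo)  # factor columns now show counts per level
```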
Edo