I am working with a data frame in R and want to know whether there is a way to build a single summary table from the summary() function. By default, summary() prints a separate block of statistics for each variable. I want one table with a row per variable and a column per statistic, so that the labels (min, max, median, ...) are not repeated for every variable.

I can also get a summary of wages for African Americans and for non-African Americans, and I want to combine both with the summary of the other variables already in the data frame. Any suggestions on how to accomplish this? I can do this easily in Stata, but I am having trouble doing it in R.

Basically, how can I combine all three summaries into one nice-looking table? Thanks.

nonblkwage <- subset(data, black == 0, select = c(wage))
blackwage <- subset(data, black == 1, select = c(wage))
summary(nonblkwage)
summary(blackwage)
summary(data[,c("wage","KWW","educ","exper","black","urban","lwage")])

This is an example of the summary function output:

     wage       
Min.   : 115.0  
1st Qu.: 702.5  
Median : 938.0  
Mean   : 990.6  
3rd Qu.:1200.0  
Max.   :3078.0  

The output I want would look something like this (the numbers are fake):

variable mean median min max  obs std.dev.
wage     4     4      0   10  50   30
educ     8     3      8   39  50   20
exper    10    29     2   60  30   8
...
...

I want to do this either with a function that takes the output of the summary function and turns it into a table, or with a function that takes columns of data, computes the corresponding summary statistics, and puts them into a table.
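A minimal base-R sketch of the second approach (a function that takes columns, computes each statistic, and returns one table). The function name summary_table and the toy data frame are made up for illustration; they are not part of the data set above.

```r
# Hypothetical helper: one row per variable, one column per statistic.
summary_table <- function(df, vars = names(df)) {
  stats <- sapply(df[vars], function(x) {
    c(mean    = mean(x, na.rm = TRUE),
      median  = median(x, na.rm = TRUE),
      min     = min(x, na.rm = TRUE),
      max     = max(x, na.rm = TRUE),
      obs     = sum(!is.na(x)),
      std.dev = sd(x, na.rm = TRUE))
  })
  # sapply() puts the statistics in rows, so transpose to get variables in rows
  as.data.frame(t(stats))
}

# Toy data (made-up numbers), only to show the shape of the result
toy <- data.frame(wage = c(100, 200, 300, 400),
                  educ = c(12, 12, 16, 18))
result <- summary_table(toy, c("wage", "educ"))
print(round(result, 4))
```

On the real data this would presumably be called as summary_table(data, c("wage", "KWW", "educ", "exper", "black", "urban", "lwage")), assuming all of those columns are numeric.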

******** Solution update **********

One more update for anyone in the same situation. It might not be the most efficient way of doing this, but it did the trick. I used the following code to get the table below:

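library(fBasics)  # provides basicStats()

# Wage column for each group, renamed so the two are distinguishable in the table
nonblkwage <- subset(data, black == 0, select = c(wage))
colnames(nonblkwage) <- c("nonblkwage")
blackwage <- subset(data, black == 1, select = c(wage))
colnames(blackwage) <- c("blackwage")

# basic_stat_table is not defined in this snippet; it is assumed to be the
# basicStats() output for the full data frame, computed earlier
trimmed_basic_stat_table <- subset(basic_stat_table,
                                   select = c(wage, KWW, educ, exper, black, urban, lwage))
trimmed_basic_stat_table2 <- cbind(trimmed_basic_stat_table,
                                   basicStats(blackwage), basicStats(nonblkwage))
# Drop the rows that are not wanted in the final table
trimmed_basic_stat_table3 <- trimmed_basic_stat_table2[-c(2, 5, 6, 9:13, 15:16), ]
final_summ_table <- round(trimmed_basic_stat_table3, 4)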




              wage      KWW     educ    exper    black    urban    lwage                                               
nobs     935.0000 935.0000 935.0000 935.0000 935.0000 935.0000 935.0000 
Minimum  115.0000  12.0000   9.0000   1.0000   0.0000   0.0000   4.7449 
Maximum 3078.0000  56.0000  18.0000  23.0000   1.0000   1.0000   8.0320
Mean     957.9455  35.7444  13.4684  11.5636   0.1283   0.7176   6.7790 
Median   905.0000  37.0000  12.0000  11.0000   0.0000   1.0000   6.8079 
Stdev    404.3608   7.6388   2.1967   4.3746   0.3346   0.4504   0.4211 

        blackwage nonblkwage
nobs     120.0000   815.0000
Minimum  260.0000   115.0000
Maximum 1874.0000  3078.0000
Mean     735.8417   990.6479
Median   683.5000   938.0000
Stdev    295.9309   408.0027
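For comparison, a base-R sketch that builds the same kind of table without dropping rows by index: the statistics are named up front, and the group columns are bound on afterwards. The helper six_stats and the toy data are hypothetical and only illustrate the shape; this is not the code that produced the tables above.

```r
# Hypothetical helper returning the six statistics used in the tables above
six_stats <- function(x) {
  c(nobs    = sum(!is.na(x)),
    Minimum = min(x, na.rm = TRUE),
    Maximum = max(x, na.rm = TRUE),
    Mean    = mean(x, na.rm = TRUE),
    Median  = median(x, na.rm = TRUE),
    Stdev   = sd(x, na.rm = TRUE))
}

# Toy data (made-up numbers) with a 0/1 group indicator
toy <- data.frame(wage  = c(100, 200, 300, 400, 500),
                  educ  = c(12, 12, 16, 16, 18),
                  black = c(0, 1, 0, 0, 1))

# One column per variable, statistics in rows
all_vars <- sapply(toy[c("wage", "educ")], six_stats)

# One extra column per group, computed from the wage column only
by_group <- cbind(blackwage  = six_stats(toy$wage[toy$black == 1]),
                  nonblkwage = six_stats(toy$wage[toy$black == 0]))

combined <- round(cbind(all_vars, by_group), 4)
print(combined)
```

Because every column comes from the same helper, the rows line up by construction and no row indices need to be hard-coded.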
dmanmeds