I'm new to R and don't know all the basic concepts yet. The task is to produce one merged table from multiple response sets. I am trying to do this using the expss library and a loop.

This is the code in R without a loop (works fine):

#libraries
library(haven)  # read_sav()
library(expss)  # tab_* functions, mdset(), %to%

#path
df.path = "C:/dataset.sav"

#dataset load
df = read_sav(df.path)

#table
table_undropped1 = df %>%
  tab_cells(mdset(q20s1i1 %to% q20s1i8)) %>%
  tab_total_row_position("none") %>%
  tab_stat_cpct() %>%
  tab_pivot()

There are 10 multiple response sets, so I need to create 10 tables in the manner shown above. Then I transpose those tables and merge them. To simplify the code (and learn something new), I decided to produce the tables using a loop. However, nothing works. I've looked for a solution, and I think the closest to a correct one is:

#this generates a message: '1' not found
for(i in 1:10) {
  assign(paste0("table_undropped",i),1) = df %>%
    tab_cells(mdset(assign(paste0("q20s",i,"i1"),1) %to% assign(paste0("q20s",i,"i8"),1)))
    tab_total_row_position("none") %>%
    tab_stat_cpct() %>%
    tab_pivot()
}

Still, it causes the error described in the comment above the code.

Alternatively, an SPSS macro for that would be (shown only to better express the problem, because I have to avoid SPSS):

define macro1 (x = !tokens (1)
/y = !tokens (1))

!do !i = !x !to !y.

mrsets
/mdgroup name = !concat($SET_,!i)
variables = !concat("q20s",!i,"i1") to !concat("q20s",!i,"i8")
value = 1.

ctables
/table !concat($SET_,!i) [colpct.responses.count pct40.0].

!doend
!enddefine.

*** MACRO CALL.
macro1 x = 1 y = 10.

In other words, I am looking for a working substitute for !concat() in R.

Soren V. Raben

1 Answer

%to% is not suited for parametric variable selection. There is a set of special functions for parametric variable selection and assignment. One of them is mdset_t:

for(i in 1:10) {
    table_name = paste0("table_undropped",i) 
    ..$table_name = df %>%
        tab_cells(mdset_t("q20s{i}i{1:8}")) %>% # expressions in the curly brackets will be evaluated and substituted 
        tab_total_row_position("none") %>%
        tab_stat_cpct() %>%
        tab_pivot()
}
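
As a side note on the !concat() part of the question, here is a minimal base R sketch of building names from a loop index. It assumes the template "q20s{i}i{1:8}" stands for the eight variables q20s<i>i1 to q20s<i>i8 of set i, and it does not use expss at all; the names set_vars and set_vars_fmt are made up for illustration.

```r
# Base R sketch: building variable names from a loop index,
# roughly what !concat() does in an SPSS macro.
i = 2L

# paste0() concatenates without a separator and is vectorised,
# so 1:8 yields eight names at once.
set_vars = paste0("q20s", i, "i", 1:8)
print(set_vars)

# sprintf() builds the same names from a format string.
set_vars_fmt = sprintf("q20s%di%d", i, 1:8)
print(identical(set_vars, set_vars_fmt))
```

Both calls only build character strings; turning those strings into a variable selection is what functions such as mdset_t take care of.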

However, it is not good practice to store all tables as separate variables in the global environment. A better approach is to save all tables in a list:

all_tables = lapply(1:10, function(i)
                    df %>%
                        tab_cells(mdset_t("q20s{i}i{1:8}")) %>% 
                        tab_total_row_position("none") %>%
                        tab_stat_cpct() %>%
                        tab_pivot()
                    )
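
To illustrate why a list tends to be easier to work with, here is a small base R sketch. It uses tiny data frames as stand-ins for real expss tables (an assumption made only so the example runs without a dataset), and the element names are optional.

```r
# Stand-in "tables": small data frames instead of real expss tables (assumption).
all_tables = lapply(1:3, function(i) data.frame(set = i, pct = i * 10))

# Optional: name the elements so they can be looked up like separate variables.
names(all_tables) = paste0("table_undropped", seq_along(all_tables))

print(all_tables[[2]])                    # access by position
print(all_tables[["table_undropped2"]])   # access by name

# The whole collection can be passed to a function in one call.
print(sapply(all_tables, nrow))
```

With separate global variables, each of these steps would need get() and paste0() inside a loop.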

UPDATE. Generally speaking, there is no need to merge. You can do all your work with tab_*:

my_big_table = df %>%
    tab_total_row_position("none")

for(i in 1:10) {
    my_big_table = my_big_table %>%
        tab_cells(mdset_t("q20s{i}i{1:8}")) %>% # expressions in the curly brackets will be evaluated and substituted 
        tab_stat_cpct() 
}

my_big_table = my_big_table %>%
    tab_pivot(stat_position = "inside_columns") # here we say that we need to combine subtables horizontally
Gregory Demin
  • Thank you, Gregory! I'll try your code soon and let you know whether it works. By the way, why is it not good practice to store all tables as separate variables? – Soren V. Raben Feb 20 '20 at 15:56
  • @Konrad In most cases it is inconvenient. Compare `all_tables[[i]]` with `get(paste0("table_undropped", i))`. It is also difficult to pass all your tables to a function: `my_proc_fun(all_tables)` vs ??? – Gregory Demin Feb 20 '20 at 16:01
  • Thank you. I've tried your code and it works perfectly. Can I ask you a question about the next step (merging)? – Soren V. Raben Feb 20 '20 at 16:22
  • To merge all 10 tables into one I wrote this: `table_merged = all_tables[[1]] %merge% all_tables[[2]] %merge% all_tables[[3]] %merge% all_tables[[4]] %merge% all_tables[[5]] %merge% all_tables[[6]] %merge% all_tables[[7]] %merge% all_tables[[8]] %merge% all_tables[[9]] %merge% all_tables[[10]]` Do you have any idea how to make it neater? – Soren V. Raben Feb 20 '20 at 16:46
  • Okay, I've managed it finally. It's not sophisticated yet simple and works: `table1.merged = all.tables1[[1]] %merge% all.tables2[[2]] for(i in 3:9) { table1.merged = table1.merged %merge% all.tables1[[i + 1]] }` – Soren V. Raben Feb 20 '20 at 19:12
  • Impressive, thanks. Sadly, I don't understand much of it :D. I assume that the `%>%` operator in particular does the trick, but I don't know exactly how it works. Still, as far as I understand, your algorithm is: 1. conversion of a data frame into an `expss` table 2. describing/'cutting' this data frame by declaring which variables should be included in a table (making subtables?) 3. finally *cutting* the whole converted data frame with the command `tab_pivot()`. Is this what you meant? – Soren V. Raben Feb 20 '20 at 21:02
  • @Konrad 1. `%>%` is very simple. It just places its left side as the first argument of the function on its right side. So `df %>% tab_total_row_position("none")` is identical to `tab_total_row_position(df, "none")`. A long chain of `%>%` is just a more pleasant-looking way of writing many nested function calls. 2. Each `tab_stat_*` calculates an elementary subtable, which is stored inside the resulting object. Finally, we use `tab_pivot` to combine these subtables. By default `tab_pivot` stacks them vertically, and the argument `stat_position = "inside_columns"` tells it to stack the subtables horizontally. – Gregory Demin Feb 20 '20 at 21:51
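
On the question of merging a whole list of tables more neatly: one common base R idiom is Reduce(), which folds a two-argument function over a list. The sketch below uses plain data frames and base merge() as stand-ins (an assumption, so that it runs without expss); the names tables and merged are made up for illustration.

```r
# Stand-in tables (assumption): three data frames sharing a key column.
tables = lapply(1:3, function(i) {
    out = data.frame(row_labels = c("a", "b"))
    out[[paste0("set", i)]] = c(i, i * 2)
    out
})

# Reduce() applies a two-argument function cumulatively, i.e.
# merge(merge(tables[[1]], tables[[2]]), tables[[3]]).
merged = Reduce(function(x, y) merge(x, y, by = "row_labels"), tables)
print(merged)

# With expss loaded, the analogous call would presumably be:
# table_merged = Reduce(`%merge%`, all_tables)
```

This avoids both the long chain of `%merge%` and the hand-written loop, and it keeps working if the number of tables changes.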