5

I am trying to display multiple dataframes next to each other to compare certain entries. However, they have a different number of rows and I want each data frame to be in the exact same order. I tried to use cbind which did not work because of the different number of rows. I used merge to bind two dfs together and then merge them again, however they change order when I do that and it seems inefficient to merge two dfs when I have more than 5 in total.

Examp:

df <-  data.frame(v=1:5, x=sample(LETTERS[1:5],5))
df 
  v x
1 1 E
2 2 B
3 3 D
4 4 C
5 5 A

df2 <- data.frame(m=7:10, n=sample(LETTERS[6:9],4))
df2
   m n
1  7 G
2  8 I
3  9 F
4 10 H

Then I ordered df2

df2 <- df2[order(df2$m, decreasing = TRUE),]
df2
   m n
4 10 F
3  9 I
2  8 H
1  7 G

Expected output:

  v x m n
1 1 E 10 F
2 2 B 9 I
3 3 D 8 H
4 4 C 7 G
5 5 A NA NA

As I said, I have more than two dfs and the order of the dfs should be remained. Any help will be greatly appreciated!

Linda Espey
  • 145
  • 5

4 Answers4

5

Base R approach :

Put the dataframes in a list, get the dataframe with maximum number of rows, append NA's to data which have less number of rows and cbind.

list_df <- list(df, df2)
n_r <- seq_len(max(sapply(list_df, nrow)))
result <- do.call(cbind, lapply(list_df, `[`, n_r, ))
result

#  v x  m    n
#1 1 C 10    F
#2 2 B  9    H
#3 3 E  8    G
#4 4 D  7    I
#5 5 A NA <NA>
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213
2

Edit: In case there are multiple df. Do this

  • Create a list of all dfs except one say first one
  • use purrr::reduce to join all these together
  • pass first df in .init argument.
df2 <- data.frame(m=7:10, n=sample(LETTERS[6:9],4))
df <-  data.frame(v=1:5, x=sample(LETTERS[1:5],5))
df3 <- data.frame(bb = 101:110, cc = sample(letters, 10))


reduce(list(df2, df3), .init = df %>% mutate(id = row_number()) , ~full_join(.x, .y %>% mutate(id = row_number()), by = "id" )) %>%
  select(-id)

    v    x  m    n  bb cc
1   1    A 10    I 101  u
2   2    C  9    H 102  v
3   3    D  8    G 103  n
4   4    E  7    F 104  w
5   5    B NA <NA> 105  s
6  NA <NA> NA <NA> 106  y
7  NA <NA> NA <NA> 107  g
8  NA <NA> NA <NA> 108  i
9  NA <NA> NA <NA> 109  p
10 NA <NA> NA <NA> 110  h

Earlier Answer: Create a dummy column id in both dfs and use full_join

full_join(df %>% mutate(id = row_number()), df2 %>% mutate(id = row_number()), by = "id") %>%
  select(-id)

  v x  m    n
1 1 A 10    I
2 2 C  9    H
3 3 D  8    G
4 4 E  7    F
5 5 B NA <NA>

Results are different from as expected becuase of different random number seed


Or in BaseR

merge(transform(df, id = seq_len(nrow(df))), transform(df2, id = seq_len(nrow(df2))), all = T)

  id v x  m    n
1  1 1 A 10    I
2  2 2 C  9    H
3  3 3 D  8    G
4  4 4 E  7    F
5  5 5 B NA <NA>

Remove extra column simply by subsetting []

merge(transform(df, id = seq_len(nrow(df))), transform(df2, id = seq_len(nrow(df2))), all = T)[-1]

  v x  m    n
1 1 A 10    I
2 2 C  9    H
3 3 D  8    G
4 4 E  7    F
5 5 B NA <NA>
AnilGoyal
  • 25,297
  • 4
  • 27
  • 45
  • I tried that already, however I still have the problem that I have multiple data frames that I want to put next to each other. And merge does not work as soon as I put more than two data frames in the command. Merging two by a time seems very inefficient. I just wondered whether there is a quicker way to do it – Linda Espey Apr 22 '21 at 06:46
  • See edited answer. If there is a column named `id` in any of the dataframes to be joined, use some another name say `dummy` which is not present in any of the dataframes. – AnilGoyal Apr 22 '21 at 06:56
0
library(plyr)
combined <- rbind.fill(df[c("v", "x")], df2[c("m", "n")])

Is that what you want?

Arduan
  • 253
  • 1
  • 11
0

Another base R approach, but with merge, you need to:

  • add the sort parameter to make sure the result doesn't get sorted
  • remove the row names from the data frames
  • add all parameter to make sure all rows are used,
  • the [-1] is to get rid of the row names column added by merge

Example:

set.seed(123)
df1 <-  data.frame(v = 1:5, 
                   x = sample(LETTERS[1:5], 5))
df1 
#>   v x
#> 1 1 A
#> 2 2 B
#> 3 3 D
#> 4 4 C
#> 5 5 E

df2 <- data.frame(m = 7:10, 
                  n = sample(LETTERS[6:9], 4))
df2
#>    m n
#> 1  7 G
#> 2  8 H
#> 3  9 I
#> 4 10 F

df2 <- df2[order(df2$m, decreasing = TRUE),]
df2
#>    m n
#> 4 10 F
#> 3  9 I
#> 2  8 H
#> 1  7 G

merge(data.frame(df1, row.names = NULL),
      data.frame(df2, row.names = NULL),
      by = 0,
      all = TRUE,
      sort = FALSE)[-1]
#>   v x  m    n
#> 1 1 A 10    F
#> 2 2 B  9    I
#> 3 3 D  8    H
#> 4 4 C  7    G
#> 5 5 E NA <NA>

If you need more then 2 dataframes you can use Reduce.

df3 <-  data.frame(a = 1:7, 
                   z = sample(LETTERS[1:7], 7))

Reduce(function(x,y) merge(x = x, y = y, by = 0, all = TRUE, sort = FALSE)[-1], 
       list(data.frame(df1, row.names = NULL), 
            data.frame(df2, row.names = NULL),
            data.frame(df3, row.names = NULL)))
#>    v    x  m    n a z
#> 1  1    C 10    I 1 F
#> 2  2    B  9    F 2 G
#> 3  3    E  8    H 3 A
#> 4  4    D  7    G 4 B
#> 5  5    A NA <NA> 5 C
#> 6 NA <NA> NA <NA> 6 D
#> 7 NA <NA> NA <NA> 7 E


Created on 2021-04-22 by the reprex package (v2.0.0)
ricoderks
  • 1,619
  • 9
  • 13