-1

I'm using R and need merge data with different lenghts

Following this dataset

> means2012
 # A tibble: 232 x 2
   exporter    eci
   <fct>     <dbl>
 1 ABW       0.235
 2 AFG      -0.850
 3 AGO      -1.40 
 4 AIA       1.34 
 5 ALB      -0.480
 6 AND       1.22 
 7 ANS       0.662
 8 ARE       0.289
 9 ARG       0.176
 10 ARM       0.490
 # ... with 222 more rows

> means2013
 # A tibble: 234 x 2
    exporter     eci
    <fct>      <dbl>
  1 ABW       0.534 
  2 AFG      -0.834 
  3 AGO      -1.26  
  4 AIA       1.47  
  5 ALB      -0.498 
  6 AND       1.13  
  7 ANS       0.616 
  8 ARE       0.267 
  9 ARG       0.127 
 10 ARM       0.0616
 # ... with 224 more rows


> str(means2012)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame':   232 obs. of  2 variables:
 $ exporter: Factor w/ 242 levels "ABW","AFG","AGO",..: 1 2 3 4 5 6 7 9 10 11 ...
 $ eci     : num  0.235 -0.85 -1.404 1.337 -0.48 ...
> str(means2013)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame':   234 obs. of  2 variables:
 $ exporter: Factor w/ 242 levels "ABW","AFG","AGO",..: 1 2 3 4 5 6 7 9 10 11 ...
 $ eci     : num  0.534 -0.834 -1.263 1.471 -0.498 ...

Note that 2 tibble has different lenghts. "Exporter" are countries.

Is there any way to merge both tibble, looking to the factors (Exporter) and fill the missing it with "na"?

It doesn't matter if is a tibble, dataframe, or other kind.

like this:

tibble 1
a 5
b 10
c 15
d 25

tibble 2
a 7
c 23
d 20

merged one:
a 5  7 
b 10 na
c 15 23
d 25 20
Adilson V Casula
  • 164
  • 1
  • 16
  • `left_join` will help. – MKR Feb 17 '18 at 19:41
  • MKR, got same error using the other dplyr solution: **Error: `by` must be a (named) character vector, list, or NULL for natural joins (not recommended in production code), not list** – Adilson V Casula Feb 17 '18 at 19:44
  • You should provide `by = exporter`. May be you should use `full_join` if you want to including missing values from both tibbles. – MKR Feb 17 '18 at 19:50

2 Answers2

1

using merge with parameter all set to TRUE:

tibble1 <- read.table(text="
x y
a 5
b 10
c 15
d 25",header=TRUE,stringsAsFactors=FALSE)

tibble2 <- read.table(text="
x z
a 7
c 23
d 20",header=TRUE,stringsAsFactors=FALSE)


merge(tibble1,tibble2,all=TRUE)

  x  y  z
1 a  5  7
2 b 10 NA
3 c 15 23
4 d 25 20

Or dplyr::full_join(tibble1,tibble2) for the same effect

moodymudskipper
  • 46,417
  • 11
  • 121
  • 167
0

You could rename the colums to join them, and get NA where the other value is missing.

library(tidyverse)

means2012 %>% 
  rename(eci2012 = eci) %>% 
  full_join(means2013 %>% 
              rename(eci2013 = eci))

But a tidier approach would be to add a year column, keep the column eci as is and just bind the rows together.

means2012 %>% 
  mutate(year = 2012) %>% 
  bind_rows(means2013 %>% 
              mutate(year = 2013))
pasipasi
  • 1,176
  • 10
  • 8