I have two very similar dataframes that I am trying to rbind together, but I am running into an issue. I used dput() to grab 3 columns (one of which is problematic) and 10 rows from each dataframe.

str1 = structure(list(
  period_type = c("half", "half", "half", "half", "half",
                  "half", "half", "half", "half", "half"),
  period_number = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
  clock = structure(c("72000", "70800", "69720", "69600", "69480",
                      "68280", "67200", "66780", "65160", "65160"),
                    class = c("hms", "difftime"), units = "secs")),
  row.names = c(NA, 10L), class = "data.frame")

str2 = structure(list(
  period_type = c("half", "half", "half", "half", "half",
                  "half", "half", "half", "half", "half"),
  period_number = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
  clock = structure(c(72000, 71640, 70140, 70020, 69720,
                      69720, 69720, 69720, 69300, 67860),
                    class = c("hms", "difftime"), units = "secs")),
  row.names = c(NA, 10L), class = "data.frame")

> head(plyr::rbind.fill(str1, str2))
  period_type period_number      clock
1        half             1 NA:NA:NANA
2        half             1 NA:NA:NANA
3        half             1 NA:NA:NANA
4        half             1 NA:NA:NANA
5        half             1 NA:NA:NANA
6        half             1 NA:NA:NANA

When I call rbind.fill, the clock column turns into NA:NA:NANA, which is frustrating. When I check the classes of the clock column in each dataframe, they appear to be the same:

> class(str1$clock)
[1] "hms"      "difftime"
> class(str2$clock)
[1] "hms"      "difftime"

...however, what dput() has fortunately revealed is that the values in the clock vector are strings in str1 and numbers in str2. I did not create these demo dataframes from scratch; they are taken from my full dataframes, so this is clearly a difference in the clock column between the two.

How can I fix either of these so that the column types are consistent? Thanks in advance!

Canovice
  • Can you check your sample data? I get an `unexpected end of input` error with some very strange formatting of the data. – Maurits Evers May 02 '19 at 23:49
  • Can anybody else verify this? The `dput` of the data is not the cleanest and runs off to the right, so please make sure you've copied and pasted the entirety of the dput. Can you double check this, @MauritsEvers? – Canovice May 03 '19 at 01:30
  • @MauritsEvers I tried it and the data appeared fine on my end – Canovice May 03 '19 at 01:31
  • I see; indeed it seems I must've missed the last line. It's working now (better formatted data would've still helped). I assume `hms` is from package `hms`? – Maurits Evers May 03 '19 at 01:46

1 Answer

This doesn't really explain why plyr::rbind.fill fails, but the following does work:

library(hms)
do.call(rbind, list(str1, str2))
#   period_type period_number    clock
#1         half             1 20:00:00
#2         half             1 19:40:00
#3         half             1 19:22:00
#4         half             1 19:20:00
#5         half             1 19:18:00
#6         half             1 18:58:00
#7         half             1 18:40:00
#8         half             1 18:33:00
#9         half             1 18:06:00
#10        half             1 18:06:00
#11        half             1 20:00:00
#12        half             1 19:54:00
#13        half             1 19:29:00
#14        half             1 19:27:00
#15        half             1 19:22:00
#16        half             1 19:22:00
#17        half             1 19:22:00
#18        half             1 19:22:00
#19        half             1 19:15:00
#20        half             1 18:51:00
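
As for the mismatch itself: class() only reports the class attribute, so it cannot show that one clock column is stored as character and the other as double; typeof() can. Below is a minimal base-R sketch of one way to check and repair the storage type, assuming the strings always hold plain seconds. The helper fix_clock and the toy dataframes df_chr and df_num are hypothetical names for illustration, not part of the question's data. Whether this alone is enough to make rbind.fill behave on the full data is an assumption worth checking.

```r
# Toy stand-ins for str1 / str2 (hypothetical names, two rows each)
df_chr <- data.frame(period_number = c(1L, 1L))
df_chr$clock <- structure(c("72000", "70800"),
                          class = c("hms", "difftime"), units = "secs")

df_num <- data.frame(period_number = c(1L, 1L))
df_num$clock <- structure(c(72000, 71640),
                          class = c("hms", "difftime"), units = "secs")

# class() agrees for both columns, but the underlying storage differs
typeof(df_chr$clock)  # "character"
typeof(df_num$clock)  # "double"

# Hypothetical helper: strip the class, coerce the raw values to numeric
# seconds, then re-attach the same class and units the columns already carry.
# unclass() comes first so the coercion works on the stored values directly
# instead of going through difftime methods.
fix_clock <- function(x) {
  secs <- as.numeric(unclass(x))
  structure(secs, class = c("hms", "difftime"), units = "secs")
}

df_chr$clock <- fix_clock(df_chr$clock)
typeof(df_chr$clock)  # "double"
```

With the hms package loaded, hms::hms(seconds = secs) could be used inside the helper in place of the manual structure() call.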
Maurits Evers
  • Thanks. I am using `rbind.fill` rather than `rbind` because there is no guarantee that my two dataframes will have the same columns – Canovice May 03 '19 at 02:38
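
If the columns may differ, one possible workaround is to pad each dataframe with the columns it lacks and then use base rbind, which matches columns by name. This is only a sketch under stated assumptions: rbind_union is a hypothetical helper (not a plyr or base function), it assumes both inputs have at least one row, and it has not been checked against the hms column from the question.

```r
# Hypothetical helper: rbind two data frames over the union of their columns,
# filling columns that are missing on either side with NA.
rbind_union <- function(a, b) {
  all_cols <- union(names(a), names(b))
  for (col in setdiff(all_cols, names(a))) a[[col]] <- NA
  for (col in setdiff(all_cols, names(b))) b[[col]] <- NA
  # Put both frames in the same column order before binding
  rbind(a[all_cols], b[all_cols])
}

x <- data.frame(a = 1:2, b = c("x", "y"))
y <- data.frame(a = 3L, c = TRUE)
out <- rbind_union(x, y)
```

Here out has three rows and the columns a, b and c, with NA wherever a column was absent from the source dataframe.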