
I have a tibble, df:

> df
# A tibble: 4 x 5
    profile Sepal.Length Sepal.Width Petal.Length Petal.Width
      <chr>        <dbl>       <dbl>        <dbl>       <dbl>
1 Profile 1       -1.011       0.850       -1.301      -1.251
2 Profile 2        0.542      -0.389        0.662       0.673
3 Profile 3       -0.376      -0.967        0.115       0.038
4 Profile 4        1.502       0.158        1.277       1.239

When I use `tidyr::gather()`, as follows:

tidyr::gather(df, var, val, -profile)

The following warning is returned:

Warning message: attributes are not identical across measure variables; they will be dropped 

I did some searching (and checked to see whether df has any attributes that might be causing the issue), but can't understand why this warning is being printed.

df <- structure(list(profile = c("Profile 1", "Profile 2", "Profile 3", 
"Profile 4"), Sepal.Length = structure(c(-1.011, 0.542, -0.376, 
1.502), .Dim = c(150L, 1L), "`scaled:center`" = 5.84333333333333, "`scaled:scale`" = 0.828066127977863), 
    Sepal.Width = structure(c(0.85, -0.389, -0.967, 0.158), .Dim = c(150L, 
    1L), "`scaled:center`" = 3.05733333333333, "`scaled:scale`" = 0.435866284936698), 
    Petal.Length = structure(c(-1.301, 0.662, 0.115, 1.277), .Dim = c(150L, 
    1L), "`scaled:center`" = 3.758, "`scaled:scale`" = 1.76529823325947), 
    Petal.Width = structure(c(-1.251, 0.673, 0.038, 1.239), .Dim = c(150L, 
    1L), "`scaled:center`" = 1.19933333333333, "`scaled:scale`" = 0.762237668960347)), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -4L), .Names = c("profile", 
"Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width"))

EDIT:

When I print df, it looks fine:

> df
# A tibble: 2 x 5
    profile Sepal.Length Sepal.Width Petal.Length Petal.Width
      <chr>        <dbl>       <dbl>        <dbl>       <dbl>
1 Profile 1       -1.011       0.850       -1.301      -1.251
2 Profile 2        0.506      -0.425        0.650       0.625

However, when I run dput(df), and then run the code that it output (the same code as above), the error identified by @neilfws is returned.

Joshua Rosenberg
  • I think there's an issue with your example data: "Error in attributes(.Data) <- c(attributes(.Data), attrib) : dims [product 150] do not match the length of object [4]" – neilfws Aug 23 '17 at 00:44
  • Hm, that's beguiling. When I print the data (example data), it prints fine (see edit above). But, when I then use dput(df), and then execute that output, the same error you found is returned. – Joshua Rosenberg Aug 23 '17 at 01:00
  • The original iris data is 150 rows; maybe you used a subset to create the example data but the 150 has carried over somehow? – neilfws Aug 23 '17 at 01:03
  • Hm, I aggregated the data using `dplyr::group_by()` and `dplyr::summarize()`. – Joshua Rosenberg Aug 23 '17 at 13:05

1 Answer


I came across this recently and have some thoughts that may be useful to others for future reference.

I agree with @neilfws that the error raised when recreating the example data comes from `.Dim = c(150L, 1L)` in the column attributes: it refers to the 150 rows of the iris data, but each column in this subset has only 4 values.
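As a rough illustration of that mismatch, here is a minimal base R sketch. The object names `res` and `ok` are made up for this example, and the values are just the Sepal.Length column from the question; it only shows that `structure()` refuses a `.Dim` whose product differs from the vector length.

```r
# Hypothetical minimal reproduction: 4 values, but a .Dim that claims 150 rows.
res <- tryCatch(
  structure(c(-1.011, 0.542, -0.376, 1.502), .Dim = c(150L, 1L)),
  error = function(e) conditionMessage(e)  # capture the error message as text
)
print(res)

# With a .Dim that matches the 4 values, the same call succeeds.
ok <- structure(c(-1.011, 0.542, -0.376, 1.502), .Dim = c(4L, 1L))
print(dim(ok))
```

So the `dput()` output in the question would presumably need `.Dim = c(4L, 1L)` (or no `.Dim` at all) to be reloadable.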

However, the warning that was returned (also in the post title):

Warning message: attributes are not identical across measure variables; they will be dropped

refers to attributes being dropped when gather combines the data.frame columns, because those attributes are not identical across the columns. The df in this post does have attributes, which come from the columns having been scaled:

R> str(df)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame':   4 obs. of  5 variables:
 $ profile     : chr  "Profile 1" "Profile 2" "Profile 3" "Profile 4"
 $ Sepal.Length: num [1:4, 1] -1.011 0.542 -0.376 1.502
  ..- attr(*, "`scaled:center`")= num 5.84
  ..- attr(*, "`scaled:scale`")= num 0.828
 $ Sepal.Width : num [1:4, 1] 0.85 -0.389 -0.967 0.158
  ..- attr(*, "`scaled:center`")= num 3.06
  ..- attr(*, "`scaled:scale`")= num 0.436
 $ Petal.Length: num [1:4, 1] -1.301 0.662 0.115 1.277
  ..- attr(*, "`scaled:center`")= num 3.76
  ..- attr(*, "`scaled:scale`")= num 1.77
 $ Petal.Width : num [1:4, 1] -1.251 0.673 0.038 1.239
  ..- attr(*, "`scaled:center`")= num 1.2
  ..- attr(*, "`scaled:scale`")= num 0.762

But this is only a warning: gather still works, but the attr information is lost because it differs across the variables being combined.
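If the attributes are not needed, one option is to strip them before reshaping. The following is only a sketch under assumptions: `df_scaled`, `df_clean`, and the `id` column are hypothetical names, and `scale()` on two iris columns stands in for however the original data were scaled. The tidyr call is guarded so the base R part runs even without tidyr installed.

```r
# Build a small data frame whose measure columns carry scale() attributes.
# (Illustrative data; not the exact df from the question.)
df_scaled <- data.frame(id = seq_len(nrow(iris)))
df_scaled$Sepal.Length <- scale(iris$Sepal.Length)
df_scaled$Sepal.Width  <- scale(iris$Sepal.Width)

# scale() returns a one-column matrix carrying "scaled:center" and
# "scaled:scale" attributes, whose values differ from column to column.
attr_names <- names(attributes(df_scaled$Sepal.Length))
print(attr_names)

# as.numeric() drops all attributes (including dim), leaving plain vectors.
df_clean <- df_scaled
for (col in c("Sepal.Length", "Sepal.Width")) {
  df_clean[[col]] <- as.numeric(df_clean[[col]])
}

# With no attributes left on the measure columns, gather has nothing to drop.
if (requireNamespace("tidyr", quietly = TRUE)) {
  long <- tidyr::gather(df_clean, var, val, -id)
  print(head(long))
}
```

Note that this discards the centering and scaling values, so keep them separately if you need to back-transform later.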

If you have a similar data.frame without these attr - for example:

structure(list(profile = structure(1:4, .Label = c("Profile 1", 
"Profile 2", "Profile 3", "Profile 4"), class = "factor"), Sepal.Length = c(-1.011, 
0.542, -0.376, 1.502), Sepal.Width = c(0.85, -0.389, -0.967, 
0.158), Petal.Length = c(-1.301, 0.662, 0.115, 1.277), Petal.Width = c(-1.251, 
0.673, 0.038, 1.239)), class = "data.frame", row.names = c(NA, 
-4L))

then gather will work without any warning.

Also, as of recent tidyr releases, pivot_longer is recommended over gather, since gather is no longer under active development.

library(tidyr)

pivot_longer(df, cols = -profile, names_to = "var", values_to = "val")
Ben