
My question is whether an object in R, saved to binary format with the save function, can produce a different file when saved from different (but recent) versions of R.
I ask because I have a script that performs some calculations and saves its results to a file. When I reproduced the same calculations later, I decided to compare the two files using

 diff --binary -s mv3p.Rdata mv3p.Rdata.backup

To my surprise, the two files are different. However, when I analyse their contents in R, they are identical.
The new version is 3.3.1. I believe the older file was created by R 3.3.0, but it could also have been 3.2.x; I am not 100% sure. I called save with only the object I wanted to save and the filename as arguments.
So my questions are: Is it normal for the same object to be written differently by different versions of R? Is this documented somewhere? How can I be sure of reproducing exactly the same file? What can it depend on (R version, OS, processor architecture, etc.)?
Please note: I am NOT asking whether files can be read by another version of R, and I am NOT asking about very old R versions.

user2759511
  • Are you sure the objects created are identical? It's impossible for us to tell whether anything in the version updates would change how the objects are stored, since you didn't give a reproducible example. – Dason Jan 31 '17 at 18:38
  • 99.9% sure they are identical. There is just one object in each (checked with verbose), and when I compared them using == I got a data frame with nothing but TRUE. – user2759511 Jan 31 '17 at 19:16
  • About the reproducible example: I can't provide one, because the older version was on an older computer that is now gone. But are you implying that the files should have been equal, that is, that OS, R version, etc. shouldn't affect the file? Give me that answer with a source and I'll be happy to accept it. – user2759511 Jan 31 '17 at 19:18
  • thc gave what I believe is the correct answer but I asked because it's not unheard of for an object's representation to change between package versions. Without knowing what code you used to create the objects it was impossible to rule that out. – Dason Jan 31 '17 at 21:23

1 Answer


R data files also record the version of R that wrote them, which is one reason the files may differ. The format is documented here: http://biostat.mc.vanderbilt.edu/wiki/Main/RBinaryFormat
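
A minimal sketch of how one might inspect that header, assuming the default binary (XDR) layout: a 5-byte magic string, a 2-byte format marker, then three big-endian integers (format version, writing R version, minimum reading R version). The helper name `rdata_header` is hypothetical, not part of R.

```r
# Hypothetical helper: read the header fields of a file written by save().
# Assumes the default binary (XDR) format; gzfile() also reads uncompressed files.
rdata_header <- function(path) {
  con <- gzfile(path, "rb")
  on.exit(close(con))
  magic <- rawToChar(readBin(con, "raw", n = 5))   # magic string plus newline
  fmt   <- rawToChar(readBin(con, "raw", n = 2))   # format marker plus newline
  ints  <- readBin(con, "integer", n = 3, size = 4, endian = "big")
  # R versions are packed as major * 65536 + minor * 256 + patch
  unpack <- function(v) {
    sprintf("%d.%d.%d", v %/% 65536L, (v %/% 256L) %% 256L, v %% 256L)
  }
  list(
    magic          = sub("\n", "", magic, fixed = TRUE),
    format_version = ints[1],
    written_by     = unpack(ints[2]),
    min_reader     = unpack(ints[3])
  )
}

# Demo on a throwaway file
f <- tempfile(fileext = ".Rdata")
x <- data.frame(a = 1:3)
save(x, file = f)
hdr <- rdata_header(f)
print(hdr)
```

Running this on both `mv3p.Rdata` and `mv3p.Rdata.backup` should show whether `written_by` differs between the two files.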

Also, you can use save(..., ascii = TRUE) to see the difference in plain text.
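
As a rough sketch of that comparison, the following reports which lines differ between two files written with `ascii = TRUE`. The helper `differing_lines` and the demo data are made up for illustration; with real files, each one would be written by the R version under test.

```r
# Hypothetical helper: indices of lines that differ between two ASCII save files.
differing_lines <- function(file_a, file_b) {
  a <- readLines(file_a)
  b <- readLines(file_b)
  n <- max(length(a), length(b))
  length(a) <- n   # pad the shorter vector with NA
  length(b) <- n
  which(is.na(a) | is.na(b) | a != b)
}

f1 <- tempfile(); f2 <- tempfile(); f3 <- tempfile()

x <- data.frame(v = c(1, 2, 3))
save(x, file = f1, ascii = TRUE)
save(x, file = f2, ascii = TRUE)   # same object, same R session

x$v[2] <- 5                        # change one value
save(x, file = f3, ascii = TRUE)

same    <- differing_lines(f1, f2)
changed <- differing_lines(f1, f3)
print(same)
print(changed)
```

If the only differing lines fall in the first few header lines, the objects themselves were serialized identically and the difference comes from the recorded version information.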

thc