7

As the Documentation says, "DumpSave writes out definitions in a binary format that is optimized for input by Mathematica." Is there a way to convert a Mathematica binary dump file back to the list of definitions without evaluating them? Import["file.mx","HeldExpression"] does not work...

Alexey Popkov
  • A possible workaround is that you could `DumpSave` all of the "*Values" then load them back in... Or you could start a new, temporary context, then run `Get["file.mx"]` and examine all of the definitions in that context. – Simon Aug 05 '11 at 12:00
  • @Simon `"file.mx"` can create its own context(s) and add additional definitions in any of the existing contexts. And even worse, it can add or partially change definitions for existing symbols. So it is probably very hard to recover its definitions just by comparison of two states of the system. – Alexey Popkov Aug 05 '11 at 12:07
  • True. And the first option I gave isn't very satisfactory. Let's hope someone has some better ideas / understanding than me! – Simon Aug 05 '11 at 12:52
  • @Simon The one thing I pin my hopes on is that the dump-file format used by *Mathematica* is not unique to it, and that it just uses some standard method of creating dump files. So it is probably possible to decode such files if someone knows this standard (if it exists, of course, but I strongly suspect that it does). – Alexey Popkov Aug 05 '11 at 12:58
  • What do you need the ".mx" files for? Would ".wdx" files be an acceptable [alternative to DumpSave](https://groups.google.com/forum/#!topic/comp.soft-sys.math.mathematica/Cj9gWpJtDBY)? – Simon Aug 05 '11 at 13:21
  • A bit of googling led me to the perl script: [Mathematica Disassembler](http://www.steike.com/code/mathematica-decompiler/). It didn't work for me (maybe 'cause I'm running 64bit linux and the script is for x86 .mx files). Maybe you'll have better luck. – Simon Aug 05 '11 at 13:25
  • @Simon You probably tried this script with dump files created under a 64-bit system. But you can easily find a lot of 32-bit dump files in your *Mathematica* installation directory. For example, look in an appropriate subdirectory of the `FileNameJoin[{$InstallationDirectory,"SystemFiles","Kernel","SystemResources"}]` directory. I have no Python installed and no experience with it, so I cannot check myself now. But maybe this script is an appropriate solution. – Alexey Popkov Aug 06 '11 at 05:42
  • I just tried the *perl* script on a couple of the 32-bit ".mx" files in the directory you mentioned. It still didn't work... But still, maybe you can learn enough from studying it that you can port the solution to a language you're more comfortable with - even Mma itself. – Simon Aug 06 '11 at 06:29
  • @Simon I would need to learn *perl* for this (I have never used this language). The general explanations in the comments do not give enough clues to understand how something similar may be coded in MMa. – Alexey Popkov Aug 06 '11 at 08:24
  • Perl can be quite terse - there's a reason it always wins [code golf](http://stackoverflow.com/questions/tagged/code-golf?sort=votes&pagesize=15). – Simon Aug 06 '11 at 09:23

2 Answers

6

DumpSave stores values associated with the symbol, i.e. OwnValues, DownValues, UpValues, SubValues, DefaultValues, NValues, FormatValues.

All the evaluation was done in the Mathematica session, and DumpSave then saved the result of it.

These values are stored in an internal format. Reading an MX file only creates symbols and populates them with these values by reading this internal format back, bypassing the evaluator.
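A minimal sketch of this behavior, assuming a fresh kernel and a writable `$TemporaryDirectory` (the file names `f_demo.mx` and `f_demo.m` are placeholders invented for this illustration): the same immediate definition is written once with `DumpSave` and once with `Save`, then each file is read back after `x` has been given a global value.

```mathematica
(* File names are hypothetical placeholders for this sketch. *)
mxFile = FileNameJoin[{$TemporaryDirectory, "f_demo.mx"}];
txtFile = FileNameJoin[{$TemporaryDirectory, "f_demo.m"}];
If[FileExistsQ[txtFile], DeleteFile[txtFile]];  (* Save appends to an existing file *)

Clear[f, x];
f[x_Real] = x^2 + 1;   (* the right-hand side is evaluated here, once *)
DumpSave[mxFile, f];   (* binary internal representation *)
Save[txtFile, f];      (* textual definition *)

x = 1.0;               (* a global value that could interfere on reload *)

Clear[f];
Get[mxFile];
fromMX = f[3.0];       (* the stored rule is read back as it was *)

Clear[f];
Get[txtFile];
fromText = f[3.0];     (* the saved text is parsed and evaluated again, now with x = 1. *)

{fromMX, fromText}
```

If the description above holds, `fromMX` should be the value the original definition gives, while `fromText` should reflect the global value of `x` at load time.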

Maybe you could share the problem that prompted you to ask this question.


[EDIT] Clarifying the issue raised by Alexey: MX files save the internal representation of symbol definitions. It appears that Mathematica internally keeps track of whether a definition was made with Set or SetDelayed:
f[x_Real] := x^2 + 1
DumpSave[FileNameJoin[{$HomeDirectory, "Desktop", "set_delayed.mx"}], 
  f];
Remove[f]
f[x_Real] = x^2 + 1;
DumpSave[FileNameJoin[{$HomeDirectory, "Desktop", "set.mx"}], f];
setBytes = 
  Import[FileNameJoin[{$HomeDirectory, "Desktop", "set.mx"}], "Byte"];
setDelayedBytes = 
  Import[FileNameJoin[{$HomeDirectory, "Desktop", "set_delayed.mx"}], 
   "Byte"];

One can then use SequenceAlignment[setBytes, setDelayedBytes] to see the difference. I do not know why it is done that way, but my point stands: all evaluation of values constructed using Set was already done in the Mathematica session before DumpSave saved them. When an MX file is read, the internal representation is read back into the Mathematica session, and no evaluation of the loaded definitions is actually performed.
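The raw output of `SequenceAlignment` on two long byte lists is mostly shared runs, so it may help to keep only the places where the lists differ. A small sketch, where `byteDifferences` is a hypothetical helper name and the toy lists stand in for the real byte data:

```mathematica
(* byteDifferences is a hypothetical helper, not a built-in function. *)
(* SequenceAlignment returns shared runs as flat lists and differing runs as pairs {a, b}; *)
(* Cases keeps only the pairs. *)
byteDifferences[a_List, b_List] := Cases[SequenceAlignment[a, b], {_List, _List}];

(* Toy input standing in for two byte lists that differ in one position. *)
byteDifferences[{1, 2, 3, 5}, {1, 2, 4, 5}]
```

Applied to the variables defined above, this would be called as `byteDifferences[setBytes, setDelayedBytes]`.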

Sasha
  • A simple experiment shows that dump files restores immediate definitions too, not converting them to delayed definitions: `f[x_Real]=x^2+1;DumpSave["f.mx",f];Clear[f];< – Alexey Popkov Aug 05 '11 at 17:06
  • A little illustration for the statement "the rules themselves do run through the evaluator": compare `Clear[f];f[x_Real]=x^2+1;DumpSave["f.mx",f];Clear[f];f=a;< – Alexey Popkov Aug 09 '11 at 07:31
  • What is completely non-trivial here is that immediate definitions like `f[x_Real]=x^2+1;` are restored as immediate, not delayed, definitions. So the emphasized statement in the citation (emphasis added) "`DumpSave` stores values associated with the symbol, i.e. `OwnValues`, `DownValues`, `UpValues`, `SubValues`, `DefaultValues`, `NValues`, `FormatValues`. *These values are stored as delayed rules*" is obviously wrong. – Alexey Popkov Aug 09 '11 at 07:33
  • @Alexey Yes, it is literally wrong, which you could also see by making a byte-wise comparison of the generated MX files. But I fail to see what evaluation you are trying to prevent; unless you clarify this, I am at a loss as to what to add. I think it is intentional that reverse engineering of an MX file is not straightforward, since lots of Mathematica's own code is stored that way. What you want amounts to reverse engineering the content of an MX file. – Sasha Aug 09 '11 at 14:36
  • It seems that using `DumpSave` and `DumpGet` is the only way to save and **exactly** restore original definitions for symbols with a guarantee. Even `Save` does not provide such functionality, since it stores definitions in the standard form of `Set` and `SetDelayed` definitions, which are evaluated again when the exported file is read in. In this way, these definitions may be changed by existing definitions for the symbols involved. For example, setting `x=1` will break a later restore of the definition `f=x`, etc. – Alexey Popkov Aug 09 '11 at 15:15
  • And `...Values` do not solve the problem, since they give all definitions only in delayed form and do not allow one to distinguish between immediate and delayed definitions (but do allow one to create immediate definitions). – Alexey Popkov Aug 09 '11 at 15:17
  • So the real problem with restoring definitions is that we have no simple way to know whether a definition returned by `...Values` is a delayed or an immediate one. But `.mx` files obviously contain such information. – Alexey Popkov Aug 09 '11 at 15:23
  • Notice that `Save` saves a text file, and Mathematica must parse and evaluate these rules to convert them into internal format, while `DumpSave` saves internal structures, which are read back bypassing the evaluator. Because of this, setting `x` to any value does not affect what is read back. I get the correct value of `10.0` back after evaluating `x = 1.0; Get[ FileNameJoin[{$HomeDirectory, "Desktop", "set.mx"}]]; f[3.0]`. So, I am still not seeing any issue. Can you post the code which exhibits the purported problem? – Sasha Aug 09 '11 at 15:41
  • BTW, your last comment contradicts what you wrote in the answer: "the rules themselves do run through the evaluator". – Alexey Popkov Aug 09 '11 at 16:04
  • The problem is that we cannot read `.mx` files and have no readable alternative to this format, although we obviously could have one! If `Save` saved its definitions in the form of assignments to `..Values`, it would be almost what is needed. But it would be even better to be able to read `.mx` files in some way. – Alexey Popkov Aug 09 '11 at 16:07
  • Generally, the problem is: how to exactly restore original definitions for symbols with a guarantee? `Save` and `...Values` do not allow this in a straightforward way, and `.mx` files are machine-dependent and not human-readable. I do not see a reason why it should not be possible. Just an oversight, isn't it? – Alexey Popkov Aug 09 '11 at 16:23
2

You can assign Rules instead of RuleDelayed's to DownValues, which is equivalent to the immediate definitions. The right-hand side of the assignment stays unevaluated and is copied literally, so the command corresponding to

Clear[f]; 
f[x_Real] = x^2 + 1;
DumpSave["f.mx", f];
Clear[f];
f = a;
<< f.mx;
Definition[f]

would be

Clear[f];
f = a;
DownValues[f] := {f[x_Real] -> x^2 + 1}
Definition[f]

f = a

f[x_Real] = x^2+1

(cf. your example of Clear[f]; f = a; f[x_Real] = x^2 + 1; Definition[f], which does not work and assigns a rule for a[x_Real] instead). This is robust to prior assignments to x as well.


Edit: It is not robust to side effects of the right-hand side, as an example in the comments below shows. To assign a downvalue avoiding any evaluation one can use the undocumented System`Private`ValueList like in the following:

Clear[f];
f := Print["f is evaluated!"];
DownValues[f] := System`Private`ValueList[f[x_Real] -> Print["definition is evaluated!"]];

(no output)


Note that the assignment got seemingly converted to delayed rules:

DownValues[f]

{HoldPattern[f[x_Real]] :> x^2 + 1}

but Definition (and Save) show that the distinction from a := has internally been kept. I don't know why DownValues don't display the truth.
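The mismatch described here can be looked at side by side. A minimal sketch, assuming a fresh kernel in which `f`, `g` and `x` carry no other definitions (`g` is a second symbol introduced only for this comparison):

```mathematica
Clear[f, g, x];
f[x_Real] = x^2 + 1;    (* immediate definition *)
g[x_Real] := x^2 + 1;   (* delayed definition *)

(* Both lists are expected to be reported with RuleDelayed. *)
DownValues[f]
DownValues[g]

(* Definition is described above as keeping the distinction between = and := *)
Definition[f]
Definition[g]
```

This only displays what each function reports; it does not explain where the kernel stores the distinction.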

To answer the original question, you would probably do best by importing the dump file and exporting the relevant symbols using Save; then, if you expect the result to be loaded into a kernel tainted by prior definitions, convert the assignments into assignments to DownValues as above programmatically. It might be easier, though, to scope the variables in a private context before the export, which is what the system files do to prevent collisions.

The Vee
  • I have just checked with *Mathematica* 5.2, 8.0.4 and 10.3.1 and found that with a `SetDelayed` assignment to `DownValues` the right-hand side is nevertheless evaluated. Please try: `Clear[f]; f:=Print["f is evaluated!"]; DownValues[f] := {f[x_Real] -> Print["definition is evaluated!"]};` (it prints `"f is evaluated!"` and `"definition is evaluated!"`). – Alexey Popkov Mar 03 '16 at 13:36
  • Thanks for finding this! It actually forms an example of difference between assigning a `List` and a ``System`Private`ValueList`` to `DownValues` I could not locate in my recent self-answer elsewhere, http://mathematica.stackexchange.com/a/108974/6041. So, with that in mind, I think this is a solution: ``Clear[f]; f := Print["f is evaluated!"]; DownValues[f] := System`Private`ValueList[ f[x_Real] -> Print["definition is evaluated!"]];`` – The Vee Mar 03 '16 at 14:00
  • Good finding! Please correct your answer so as not to mislead the future readers. Another approach is to use undocumented [``Language`ExtendedDefinition`` and ``Language`DefinitionList``](http://mathematica.stackexchange.com/q/89607/280) in the same way. – Alexey Popkov Mar 03 '16 at 14:23