
I am running some tests with the Python/C API to understand how it works and how to use it properly. My goal is to create a C++ wrapper that allows me to run Python scripts from C++ code. I cannot use external binding libraries (like Boost.Python or Cython). Everything is going fine except for one thing. Currently, I load scripts using PyImport_Import():

PyObject* py_module = PyImport_Import(py_module_name); // imports the *.py file
// do something, call functions, save results
Py_DECREF(py_module);

In the release version, however, the scripts have to be distributed in a proprietary binary format and loaded into memory on startup. I looked for hints on how to achieve this, with no result. Basically, I need to do something like this:

FILE* file = fopen("scripts.bin", "rb");
char* c_buff = (char*)malloc(...);
fread(c_buff, ..., file);
PyObject* py_module = CreatePyObjectFromBinaryData(c_buff, ...); // hypothetical function

Could someone suggest possible solutions? I thought about using the marshalling functionality like this:

FILE* file = fopen("scripts.bin", "wb");
PyMarshal_WriteObjectToFile(py_module, file, Py_MARSHAL_VERSION);

However, this doesn't seem to work. In fact, I am not sure which objects can be marshalled this way, because the documentation doesn't say anything about it.
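For reference, the marshal format handles code objects and basic built-in types, but not module objects, which is a likely reason the call above fails. Below is a minimal Python sketch of a build-side packing step, assuming the scripts are packed by a separate tool before release. The names `pack_script` and `load_packed` are hypothetical and only illustrate the idea.

```python
import marshal
import types


def pack_script(source: str, filename: str) -> bytes:
    """Compile source text and serialize the resulting code object."""
    code = compile(source, filename, "exec")
    return marshal.dumps(code)


def load_packed(name: str, blob: bytes) -> types.ModuleType:
    """Rebuild a module from bytes produced by pack_script (hypothetical helper)."""
    code = marshal.loads(blob)
    module = types.ModuleType(name)
    exec(code, module.__dict__)  # run the module body in the new namespace
    return module


blob = pack_script("def multiply(a, b):\n    return a * b\n", "multiply.py")
module = load_packed("multiply", blob)
print(module.multiply(6, 7))

# A module object itself is rejected by marshal.
try:
    marshal.dumps(module)
except ValueError as exc:
    print("cannot marshal a module:", exc)
```

On the C side, the counterparts of `load_packed` would be PyMarshal_ReadObjectFromString followed by PyImport_ExecCodeModule. Note that the marshal format is specific to the Python version that produced it, so the packing tool and the embedded interpreter should match.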

Optional question: I have all the *.py files in my binary folder. On startup, after PyImport_Import(), they are implicitly compiled to bytecode (*.pyc) in __pycache__. I know bytecode files (*.pyc or *.pyo) can be created using the compileall module. Is it possible to create a PyObject containing module data from the contents of such a file?

Mateusz Grzejek
  • Can you just ship the python scripts? – Yakk - Adam Nevraumont Jan 11 '14 at 21:33
  • It was one of the initial requirements I was told of. I am responsible for 'reconnaissance': what the possible solutions are and how complex they are. If a solution for this problem is not easy to achieve, we won't force it and will simply ship the app with 'raw' scripts. That is much easier: Py_CompileStringExFlags and PyEval_EvalCodeEx, and that's it. The question is: is it possible to serialize the code object returned by Py_CompileStringExFlags and use its content instead of the raw file, which is compiled every time the application starts? – Mateusz Grzejek Jan 11 '14 at 22:39
  • @Mateusz> ...that is compiled every time application starts. <-- well that is not entirely true. If I understand you correctly and you are referring to .pyc python bytecode, it is generated once --and you can even force offline generation from python itself. Given that bytecode can be "easily" decompiled, why not stick with that? A simple deployment script can copy/generate .py files and force startup .pyc generation. – MariusSiuram Aug 07 '14 at 06:24

1 Answer


Is it acceptable to just ship the .pyc files? Try something like this:

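int size;
unsigned char *python_code;
PyObject *codeobj, *mainobj;
size = load_file("multiply.pyc", &python_code); // load_file: your own helper that reads the whole file
Py_Initialize();
// Skip the .pyc header: 8 bytes up to Python 3.2, 12 in 3.3 to 3.6, 16 from 3.7 on.
codeobj = PyMarshal_ReadObjectFromString((const char *)python_code + 8, size - 8);
mainobj = codeobj ? PyImport_ExecCodeModule("multiply", codeobj) : NULL;
Py_XDECREF(mainobj);
Py_XDECREF(codeobj);
Py_Finalize();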

See this discussion: https://groups.google.com/forum/#!topic/comp.lang.python/zhIe_Aa2Ih8 where Carsten Haese says:

A pyc file contains the following:

1) An 8 byte header containing a magic number.
2) A "marshal" serialization of the code object.

So, in order to transform those contents into a code object, you need to skip the 8 byte header and unmarshal the rest.
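The same skip-and-unmarshal step can be tried from Python before porting it to the C API. This is a minimal sketch, not a definitive loader. It assumes CPython 3.7 or newer, where the header is 16 bytes rather than the 8 bytes of older interpreters, and `load_pyc` is a hypothetical helper name.

```python
import importlib.util
import marshal
import os
import py_compile
import tempfile
import types

# Assumption: CPython 3.7 or newer, where the header is 16 bytes
# (magic, flags, then mtime and source size, or a source hash).
# Python up to 3.2 used 8 bytes, and 3.3 to 3.6 used 12.
HEADER_SIZE = 16


def load_pyc(name: str, path: str) -> types.ModuleType:
    """Read a .pyc file, skip its header, and execute the code object."""
    with open(path, "rb") as f:
        data = f.read()
    # The first four bytes identify the bytecode version.
    if data[:4] != importlib.util.MAGIC_NUMBER:
        raise ImportError("bytecode was built by a different Python version")
    code = marshal.loads(data[HEADER_SIZE:])
    module = types.ModuleType(name)
    exec(code, module.__dict__)
    return module


# Build a throwaway .pyc so the sketch is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "multiply.py")
    with open(src, "w") as f:
        f.write("def multiply(a, b):\n    return a * b\n")
    pyc = py_compile.compile(src, cfile=os.path.join(tmp, "multiply.pyc"))
    module = load_pyc("multiply", pyc)

print(module.multiply(3, 4))
```

Checking the magic number first gives a clear error when the file was produced by a different interpreter version, instead of a confusing failure inside the unmarshalling step.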

Failing that, use encrypted zip files containing the .pyc files? (You would need to embed the password in your C++ code.)

I think truly secure stuff is really difficult (impossible?) if you don't trust the client.

demented hedgehog