An error when converting Python binary data to C++ binary data in Boost.Python

Question

I need to pass binary data from Python to C++ using boost::python. The data may come from an image or a text file. However, an error occurs when the data comes from an image file. The following is an example.

C++

#include <boost/python.hpp>
#include <boost/python/module.hpp>
#include <boost/python/def.hpp>
#include <cstring>
#include <fstream>
#include <iostream>

using namespace boost::python;
void greet(char *name,char *ss)
{
    std::ofstream fout(name,std::ios::binary);
    std::cout<< "This length is:" << strlen(ss) <<std::endl;
    fout.write(ss,strlen(ss));
    fout.close();
    return;
}

BOOST_PYTHON_MODULE(ctopy)
{
    def("greet",greet);
}

Python

import ctopy
# This case works.
f=open("t1.txt","rb")
ctopy.greet("t2.txt",f.read())
f.close()

# This case fails: the output file "p2.jpg" contains no data.
f2=open("p1.jpg","rb")
ctopy.greet("p2.jpg",f2.read()) # Error: "p2.jpg" ends up as an empty file.
f2.close()

How can I pass an image's binary data to C++?

Your question isn't clear at all, what **exactly** are you trying to accomplish, and what error happens? Edit your question with the required information. — StoryTeller - Unslander Monica, Feb 11 '13 at 14:55
@StoryTeller, I have added more detail. The last part of the Python code creates a bad file. — simon, Feb 11 '13 at 15:15

Tanner Sansbury · Answer 1 · 2013-02-12T19:24:30.230

The encoding of a binary file generally depends on factors other than the programming language, such as the type of file, operating system, etc. For example, on POSIX, a text-file contains characters organized into zero or more lines, and it is represented the same way in both C++ and Python. Both languages just need to use the proper encoding for the given format. In this case, there is no special process in converting a Python binary stream to a C++ binary stream, as it is a raw byte stream in both languages.

There are two issues with the approach in the original code:

strlen() should be used to determine the length of a null-terminated character string. If the binary file contains bytes with a value of \0, then strlen() will not return the entire size of the data.
The size of the data is lost because char* is used instead of std::string. Consider using std::string as it provides both the size via size() and permits null characters within the string itself. An alternative solution is to explicitly provide the size of the data alongside the data.
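To make the two issues above concrete, here is a minimal sketch in plain C++ with no Boost.Python involved. The function names and the five-byte sample are hypothetical and chosen only for illustration. It compares the length reported by strlen() with the length tracked by std::string for a buffer that contains an embedded null byte.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical sample: five bytes with a '\0' at index 2. The explicit
// length stops the constructor from truncating at the null byte.
std::string make_sample()
{
  return std::string("ab\0cd", 5);
}

// Length as seen through a bare char*: counting stops at the first '\0'.
std::size_t length_via_strlen(const std::string& data)
{
  return std::strlen(data.c_str());
}

// Length as tracked by std::string: embedded '\0' bytes are counted.
std::size_t length_via_size(const std::string& data)
{
  return data.size();
}
```

For the sample, length_via_strlen() reports 2 while length_via_size() reports 5. Image formats routinely contain zero bytes, so a char* plus strlen() drops everything from the first zero byte onward, whereas a plain text file usually contains none and survives intact.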

Here is a complete Boost.Python example:

#include <boost/python.hpp>
#include <fstream>
#include <iostream>

void write_file(const std::string& name, const std::string& data)
{
  std::cout << "This length is: " << data.size() << std::endl;
  std::ofstream fout(name.c_str(), std::ios::binary);
  fout.write(data.data(), data.size());
  fout.close();
}

BOOST_PYTHON_MODULE(example)
{
  using namespace boost::python;
  def("write", &write_file);
}

And the example Python code (test.py):

import example

with open("t1.txt", "rb") as f:
    example.write("t2.txt", f.read())

with open("p1.png", "rb") as f:
    example.write("p2.png", f.read())

And the usage, where I download an image and create a simple text file, then create copies of both with the above code:

[twsansbury]$ wget http://www.boost.org/style-v2/css_0/get-boost.png -O p1.png >> /dev/null 2>&1
[twsansbury]$ echo "this is a test" > t1.txt
[twsansbury]$ ./test.py 
This length is: 15
This length is: 27054
[twsansbury]$ md5sum t1.txt t2.txt
e19c1283c925b3206685ff522acfe3e6  t1.txt
e19c1283c925b3206685ff522acfe3e6  t2.txt
[twsansbury]$ md5sum p1.png p2.png
fe6240bff9b29a90308ef0e2ba959173  p1.png
fe6240bff9b29a90308ef0e2ba959173  p2.png

The md5 checksums match, indicating the file contents are the same.

Thank you. But only one answer can be accepted, and @doomster answered this question first. — simon, Feb 12 '13 at 03:18

score 0 · Accepted Answer · answered Feb 11 '13 at 21:23

Please provide the real code, after you have created a minimal example from it. Further, which Python version are you using? Anyhow, here are a few things that are wrong in the code you provided:

You should use const, as any C++ FAQ will tell you.
You are using strlen() on something which isn't even guaranteed to be zero-terminated but which can well contain zeroes in the middle.
You should use std::string, which doesn't barf if you have internal null chars.
Closing the file is unnecessary; that is done automatically in the destructor. Flushing and checking the stream state is much more useful. On failure, throw an exception.
Drop the trailing return, it doesn't hurt but it's unnecessary noise.
Read PEP 8.
Use a with-statement for reading the files.
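The C++ points in the list above could be combined roughly as follows. This is only a sketch under stated assumptions: the name write_checked and the error messages are hypothetical, and the Boost.Python registration is left out so that the function can be compiled on its own.

```cpp
#include <fstream>
#include <stdexcept>
#include <string>

// Writes `data` to `path` as raw bytes and throws if any step fails.
// The name write_checked is hypothetical and used only for illustration.
void write_checked(const std::string& path, const std::string& data)
{
  std::ofstream fout(path.c_str(), std::ios::binary);
  if (!fout)
    throw std::runtime_error("cannot open " + path);

  // data.size() counts every byte, including embedded '\0' values.
  fout.write(data.data(), static_cast<std::streamsize>(data.size()));
  fout.flush();
  if (!fout)
    throw std::runtime_error("write failed for " + path);
  // No explicit close(): the std::ofstream destructor closes the file.
}
```

Taking both arguments as const std::string& keeps the byte count with the data, and checking the stream after flush() surfaces failures that a silent close() would hide.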
