I am trying to wrap a C++ class that returns a string, using pybind11.

class SS {
  public:
    SS(const std::string& s) : data_(s.data()), size_(s.size()) {}

    // Return a pointer to the beginning of the referenced data
    const char* data() const { return data_; }

    const char* data_;
    size_t size_;
};

class PySS: public SS {
  public:
    PySS(const std::string &str): SS(str) {
      std::cout << "cons " << str << std::endl;        // prints: cons key1
      std::cout << "data " << SS::data() << std::endl; // prints: data key1

    }

    std::string data() {
      std::cout << "call data " << SS::data() << std::endl; // prints garbage bytes
      return std::string(SS::data());
    }
};

void init_slice(py::module & m) {
  py::class_<PySS>(m, "SS")
    .def(py::init<const std::string&>())
    .def("data", &PySS::data);
}

When I call it from Python:

s = SS('key1')
print (s.data())

it throws a Unicode error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xae in position 1: invalid start byte

Printing the string in the constructor shows the expected value, but printing it in the data() method shows garbage.

Any ideas?

[Edit]

Here is a minimal example that reproduces the issue:

class SS {
  public:
    SS(const std::string& s) : data_(s.data()) {}

    // Return a pointer to the beginning of the referenced data
    const char* data() const { return data_; }
    const std::string ToString() const {
      std::cout << std::string(data_) << std::endl;
      return std::string(data_);
    }

    const char* data_;
};

void init_slice(py::module & m) {
  py::class_<SS>(m, "Slice")
  .def(py::init<const std::string&>())
  .def("data", &SS::ToString);
}
alec.tu

1 Answer

Problem & solution

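There are several problems with your example. The most important one is that your pointers are dangling: they point into the std::string argument s of the SS constructor, which is a temporary that pybind11 creates from the Python str and destroys once the constructor returns. That is why the data prints correctly inside the constructor but is garbage afterwards.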

The solution is to copy s to a member variable of class SS as follows:

#include <string>
#include <iostream>
#include <pybind11/pybind11.h>

namespace py = pybind11;

class SS {
  public:
    SS(const std::string& s) : m_data(s) {}

    const char* data() const { return m_data.data(); }

    std::string m_data;
};

class PySS: public SS {
  public:
    PySS(const std::string& s): SS(s) {}

    std::string get() { return std::string(SS::data()); }
};

PYBIND11_MODULE(example, m)
{
  py::class_<PySS>(m, "SS")
    .def(py::init<const std::string&>())
    .def("get", &PySS::get);
}
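
The lifetime argument can be checked without building a Python module. The following is a minimal plain C++ sketch, not pybind11 code: OwningSS, make_from_temporary, and check_owning_copy are hypothetical names introduced only for this illustration, and the temporary std::string stands in for the one pybind11 creates from the Python str.

```cpp
#include <cassert>
#include <string>

// Owning variant of SS: it copies the argument, so data() stays valid
// for as long as the object itself is alive.
class OwningSS {
  public:
    explicit OwningSS(const std::string& s) : m_data(s) {}
    const char* data() const { return m_data.data(); }

  private:
    std::string m_data;
};

// Mimics the binding call: the argument is a temporary std::string that is
// destroyed as soon as the constructor has returned.
inline OwningSS make_from_temporary() {
  return OwningSS(std::string("key1"));
}

// The copy made in the constructor outlives the temporary.
inline void check_owning_copy() {
  OwningSS ss = make_from_temporary();
  assert(std::string(ss.data()) == "key1");
}
```

With the non-owning SS from the question, the same check would read a dangling pointer, which is undefined behavior.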

Two more remarks:

  • In your example the macro PYBIND11_MODULE was missing, which takes care of some general things to be able to import your module (see this example).
  • I would never declare the same function with two different meanings: your SS::data() returns a pointer, while PySS::data() returns a copy (a std::string). I thus renamed the latter to PySS::get() to make the distinction clear.

Work-around for third-party classes

Given that your class SS is outside of your control, I think that you can only work around the problem by wrapping it. For example:

#include <string>
#include <iostream>
#include <pybind11/pybind11.h>

namespace py = pybind11;

class SS {
  public:
    SS() = default;
    SS(const std::string& s) : data_(s.data()), size_(s.size()) {}
    const char* data() const { return data_; }

  private:
    const char* data_;
    size_t size_;
};

class PySS {
  public:
    // m_data is declared before m_SS, so it is initialized first
    PySS(const std::string& s) : m_data(s), m_SS(m_data) {}
    std::string get() { return std::string(m_SS.data()); }

  private:
    std::string m_data;
    SS m_SS;
};

PYBIND11_MODULE(example, m)
{
  py::class_<PySS>(m, "SS")
    .def(py::init<const std::string&>())
    .def("get", &PySS::get);
}
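
If deriving from SS is a requirement (for example, to pass the object to functions that take an SS), one possible alternative is the base-from-member idiom. This is a sketch under assumptions, not something tested against the real third-party library: StringHolder and check_base_from_member are hypothetical names, and SS here is a reduced stand-in for the class in the question. It relies on base classes being initialized in the order in which they are listed.

```cpp
#include <cassert>
#include <string>

// Stand-in for the third-party class: it only stores a pointer.
class SS {
  public:
    SS(const std::string& s) : data_(s.data()) {}
    const char* data() const { return data_; }

  private:
    const char* data_;
};

// Hypothetical helper base that owns the string SS will point into.
struct StringHolder {
  explicit StringHolder(const std::string& s) : m_data(s) {}
  std::string m_data;
};

// StringHolder is listed first, so m_data exists before SS is constructed.
class PySS : private StringHolder, public SS {
  public:
    explicit PySS(const std::string& s) : StringHolder(s), SS(m_data) {}

    // A copy would keep pointing into the original object's string.
    PySS(const PySS&) = delete;
    PySS& operator=(const PySS&) = delete;

    std::string get() const { return std::string(SS::data()); }
};

inline void check_base_from_member() {
  PySS s(std::string("key1"));  // the temporary is gone after this statement
  assert(s.get() == "key1");
}
```

Copying is disabled because a copied SS base would still point into the source object's m_data. The wrapper above has the same caveat if a PySS instance is ever copied.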
Tom de Geus
  • In fact, the class SS is a third-party library I want to bind with, so it's not modifiable. And I can not change char* to std::string. Is there any alternative solution? – alec.tu Apr 23 '19 at 07:47
  • @alec.tu Are you sure that your third-party library is constructed in this way? – Tom de Geus Apr 23 '19 at 08:40
  • yup. I am pretty sure. – alec.tu Apr 23 '19 at 09:04
  • @alec.tu Then I think you have no other option than to wrap the class. If you sub-class it like you did, the base class is always constructed first (before the child-class member variables), so you have no way to prevent the pointer from dangling. – Tom de Geus Apr 23 '19 at 12:18
  • @alec.tu I posted an example of how to wrap it. – Tom de Geus Apr 23 '19 at 12:23