data types used in C++ shared library API

Question

I am writing a C++ shared library (.so) whose main function will be to process features from images processed by OpenCV. However, the algorithm in this shared library is not specifically for vision -- it could receive measurements from RADAR, LiDAR, etc.

As I am architecting the library, I am trying to keep the OpenCV dependency out of it and instead use the more general Eigen3 matrix library.

My question is about the interface between application code and my library. I have a public method that will accept a list of position and velocity measurements of each feature from the application:

void add_measurements(std::vector< std::tuple<double, double> > pos, 
                      std::vector< std::tuple<double, double> > vel);

Is it best to leave the data structures on the API interface as primitive as possible, like above? Or should I force the application code to provide a std::vector<Eigen::Vector2d> for measurements?

Further, should I allow std::vector<cv::Point2f> from the application code and then just convert to Eigen3 or whatever internally? This one seems the least useful since the library will still depend on OpenCV.

Would a templated implementation (in the header of your library) be OK? (That would essentially make the .so useless.) And do you need to support both double and float points? Will you be the main user of your library, or can you force an arbitrary interface on the library's users? — chtz, May 09 '17 at 17:00
IMO, making the API "as primitive as possible" would mean `void add_measurements(const double* pos_data, const double* vel_data, size_t num_elements);` This would allow passing `std::vector` of `Eigen::Vector2d` as well as `std::tuple` as well as `cv::Point2d` (with some programming effort by the caller, but without copy-overhead at runtime) — chtz, May 09 '17 at 17:04
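
A minimal sketch of the pointer-based signature suggested in the comment above. It assumes an interleaved layout (x0, y0, x1, y1, ...) and that `num_elements` counts points rather than doubles; the comment specifies neither. `MeasurementStore`, `Measurement`, and `flat_buffer_demo` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical internal record; the library is free to store data however it likes.
struct Measurement {
    double px, py;  // position
    double vx, vy;  // velocity
};

// Hypothetical stand-in for the library's state.
class MeasurementStore {
public:
    // Assumed layout: pos_data = {x0, y0, x1, y1, ...}, vel_data likewise.
    // num_elements is assumed to be the number of points, not the number of doubles.
    void add_measurements(const double* pos_data, const double* vel_data,
                          std::size_t num_elements)
    {
        for (std::size_t i = 0; i < num_elements; ++i) {
            measurements_.push_back({pos_data[2 * i], pos_data[2 * i + 1],
                                     vel_data[2 * i], vel_data[2 * i + 1]});
        }
    }

    const std::vector<Measurement>& measurements() const { return measurements_; }

private:
    std::vector<Measurement> measurements_;
};

// Caller side: a flat std::vector<double> is passed without copying.
inline MeasurementStore flat_buffer_demo()
{
    std::vector<double> pos = {1.0, 2.0, 3.0, 4.0};  // two points
    std::vector<double> vel = {0.1, 0.2, 0.3, 0.4};  // two velocities
    MeasurementStore store;
    store.add_measurements(pos.data(), vel.data(), pos.size() / 2);
    return store;
}
```

Only built-in types cross the library boundary here, so the shared library needs neither Eigen nor OpenCV headers in its public interface.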

Galik · Accepted Answer · 2017-05-06T22:37:20.960

You can use generics to bridge different data conventions without sacrificing performance.

The downside is a possibly steeper learning curve for the interface.

First, rather than accepting vectors, you could accept iterators, which lets the user provide data in other containers such as arrays and lists.

#include <utility>
#include <vector>

template<typename AccessType, typename PosIter, typename VelIter>
void add_measurements(PosIter p1, PosIter p2, VelIter v1, VelIter v2)
{
    // instantiate type to access coordinates
    AccessType access;

    // process elements

    // Internal representation
    std::vector<std::pair<double, double>> positions;

    for(; p1 != p2; ++p1)
        positions.emplace_back(access.x(*p1), access.y(*p1));

    std::vector<std::pair<double, double>> velocities;

    // velocities go into their own container, not into positions
    for(; v1 != v2; ++v1)
        velocities.emplace_back(access.x(*v1), access.y(*v1));

    // do stuff with the data
}

Then, if they have a weird data type they want to use like this:

struct WeirdPositionType
{
    double ra;
    double dec;
};

They can create a type to access its internal point coordinates:

// class that knows how to access the
// internal "x/y" style data
struct WeirdPositionTypeAccessor
{
    double x(WeirdPositionType const& ct) const { return ct.ra; }
    double y(WeirdPositionType const& ct) const { return ct.dec; }
};

That then gets 'plugged in' to the generic function:

int main()
{
    // User's weird and wonderful data format
    std::vector<WeirdPositionType> ps = {{1.0, 2.2}, {3.2, 4.7}};
    std::vector<WeirdPositionType> vs = {{0.2, 0.2}, {9.1, 3.2}};

    // Plug in the correct Access type to pull the data out of your weird type
    add_measurements<WeirdPositionTypeAccessor>(ps.begin(), ps.end(), vs.begin(), vs.end());

    // ... etc
}

Of course, you can provide ready-made Access types for common point libraries such as OpenCV:

struct OpenCvPointAccess
{
    double x(cv::Point2d const& p) const { return p.x; }
    double y(cv::Point2d const& p) const { return p.y; }
};

Then the user can simply use that:

add_measurements<OpenCvPointAccess>(ps.begin(), ps.end(), vs.begin(), vs.end());
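
The same pattern could cover the `std::tuple<double, double>` points from the question. The sketch below is only an illustration: `TuplePointAccess`, `to_internal`, and `tuple_access_demo` are hypothetical names, and `to_internal` is a reduced version of the generic function that handles one range and returns the result so it can be inspected.

```cpp
#include <cassert>
#include <tuple>
#include <utility>
#include <vector>

// Accessor for std::tuple<double, double> points (hypothetical name).
struct TuplePointAccess {
    double x(std::tuple<double, double> const& p) const { return std::get<0>(p); }
    double y(std::tuple<double, double> const& p) const { return std::get<1>(p); }
};

// Reduced, hypothetical version of the generic function: converts one range
// to the internal pair representation and returns it.
template<typename AccessType, typename Iter>
std::vector<std::pair<double, double>> to_internal(Iter first, Iter last)
{
    AccessType access;
    std::vector<std::pair<double, double>> out;
    for (; first != last; ++first)
        out.emplace_back(access.x(*first), access.y(*first));
    return out;
}

inline std::vector<std::pair<double, double>> tuple_access_demo()
{
    std::vector<std::tuple<double, double>> pos;
    pos.emplace_back(1.0, 2.0);
    pos.emplace_back(3.0, 4.0);
    return to_internal<TuplePointAccess>(pos.begin(), pos.end());
}
```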

Other downsides are: your entire library has to exist in header files, code bloat, and inappropriate template parameters giving confusing bugs. Future versions of the standard should fix some of these (e.g. letting you say your iterator types must be iterators). — Davislor, May 06 '17 at 22:27
@Davislor There are bound to be trade-offs for *zero overhead* generics but I have not found *code bloat* to be such a problem as some might think given that you only create what you use. It just allows different programs to use different things without rewriting. As far as error messages go, `static_assert` has been here for years and goes a long way to providing very easy to understand errors so we don't need to wait for concepts for that reason anymore. — Galik, May 12 '17 at 09:25
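
One way the `static_assert` idea might look in C++11 is sketched below. The trait `is_point_access` and the types `SkyPoint`, `GoodAccess`, `BadAccess`, and `sum_xy` are hypothetical; they are not part of either answer. The trait uses SFINAE on a partial specialization to detect whether an accessor offers `x(point)` and `y(point)` with results convertible to `double`.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical detection trait: true when Access has const x(point) and
// y(point) members whose results convert to double.
template<typename Access, typename Point, typename = void>
struct is_point_access : std::false_type {};

template<typename Access, typename Point>
struct is_point_access<Access, Point,
    typename std::enable_if<
        std::is_convertible<
            decltype(std::declval<Access const&>().x(std::declval<Point const&>())),
            double>::value &&
        std::is_convertible<
            decltype(std::declval<Access const&>().y(std::declval<Point const&>())),
            double>::value
    >::type> : std::true_type {};

struct SkyPoint { double ra; double dec; };

struct GoodAccess {
    double x(SkyPoint const& p) const { return p.ra; }
    double y(SkyPoint const& p) const { return p.dec; }
};

struct BadAccess {  // no y(): the trait reports false
    double x(SkyPoint const& p) const { return p.ra; }
};

template<typename AccessType, typename Point>
double sum_xy(Point const& p)
{
    // Fails early with a readable message instead of a deep template error.
    static_assert(is_point_access<AccessType, Point>::value,
                  "AccessType must provide x(point) and y(point) returning double");
    AccessType access;
    return access.x(p) + access.y(p);
}

static_assert(is_point_access<GoodAccess, SkyPoint>::value, "GoodAccess is accepted");
static_assert(!is_point_access<BadAccess, SkyPoint>::value, "BadAccess is rejected");
```

Calling `sum_xy<BadAccess>(...)` would not compile. The compiler may still list follow-on errors, but the `static_assert` message states the actual requirement.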

Davislor · Answer 2 · 2017-05-06T22:37:01.293

Remember, you can overload your functions to support many kinds of container. If you think both would be useful, you don’t need to choose between them.

So the major considerations are the overhead of doing it each way, and whether you want to add a dependency on Eigen. If a future version of the library will have a different implementation, you don’t want to use a leaky abstraction.
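
As a rough illustration of the overloading approach, the sketch below offers two public overloads that funnel into one private routine. `Tracker`, `store`, `count`, and `overload_demo` are hypothetical names, and the choice of `std::tuple` and `std::pair` as the two accepted point types is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical library class; names are illustrative only.
class Tracker {
public:
    // Overload 1: the tuple-based signature from the question.
    void add_measurements(std::vector<std::tuple<double, double>> const& pos,
                          std::vector<std::tuple<double, double>> const& vel)
    {
        assert(pos.size() == vel.size());
        for (std::size_t i = 0; i < pos.size(); ++i)
            store(std::get<0>(pos[i]), std::get<1>(pos[i]),
                  std::get<0>(vel[i]), std::get<1>(vel[i]));
    }

    // Overload 2: the same data supplied as std::pair.
    void add_measurements(std::vector<std::pair<double, double>> const& pos,
                          std::vector<std::pair<double, double>> const& vel)
    {
        assert(pos.size() == vel.size());
        for (std::size_t i = 0; i < pos.size(); ++i)
            store(pos[i].first, pos[i].second, vel[i].first, vel[i].second);
    }

    std::size_t count() const { return px_.size(); }

private:
    // Single internal entry point shared by every overload.
    void store(double px, double py, double vx, double vy)
    {
        px_.push_back(px); py_.push_back(py);
        vx_.push_back(vx); vy_.push_back(vy);
    }

    std::vector<double> px_, py_, vx_, vy_;
};

inline std::size_t overload_demo()
{
    std::vector<std::tuple<double, double>> tp, tv;
    tp.emplace_back(1.0, 2.0);
    tv.emplace_back(0.1, 0.2);

    std::vector<std::pair<double, double>> pp = {{3.0, 4.0}, {5.0, 6.0}};
    std::vector<std::pair<double, double>> pv = {{0.3, 0.4}, {0.5, 0.6}};

    Tracker t;
    t.add_measurements(tp, tv);  // picks the tuple overload
    t.add_measurements(pp, pv);  // picks the pair overload
    return t.count();
}
```

Each overload copies element by element into the internal layout, which is the per-call overhead mentioned above.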

Another useful trick is to add a type alias, for example, inside a namespace:

using point2d = std::tuple<double, double>;

Which you can later change to:

using point2d = Eigen::Vector2d;

Or:

using point2d = cv::Point2f;

You can make these more opaque by wrapping them in a struct. If you do this, future changes will break compatibility with the previous ABI, but not the API.
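
A minimal sketch of such a wrapper, assuming a hypothetical namespace `tracklib` and accessor names `x()` and `y()`. The stored type is private, so callers compile against the accessors only.

```cpp
#include <cassert>
#include <tuple>

namespace tracklib {  // hypothetical namespace

class point2d {
public:
    point2d(double x, double y) : data_(x, y) {}

    double x() const { return std::get<0>(data_); }
    double y() const { return std::get<1>(data_); }

private:
    // Implementation detail. Swapping this for another point type later
    // changes the object layout (ABI) but not the accessors above (API).
    std::tuple<double, double> data_;
};

}  // namespace tracklib

// Caller code touches only the public accessors.
inline double squared_norm(tracklib::point2d const& p)
{
    return p.x() * p.x() + p.y() * p.y();
}
```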
