I want to read a plain binary file containing a number of unsigned 16-bit integers into an Eigen matrix, and I wrote a templated utility to do this. This is what the caller looks like:
Matrix<uint16_t, Dynamic, Dynamic> data;
int retval = read_data<Matrix<uint16_t, Dynamic, Dynamic>, uint16_t>(
argv[1], data);
And here's what read_data looks like:
template <typename Derived, typename Scalar> // Per @Jarod42, get rid of Scalar here (*)
int read_data(const char* const fname, MatrixBase<Derived>& data) {
// (*) If we don't have Scalar as a template, just uncomment this:
// typedef typename Derived::Scalar Scalar;
ifstream fin(fname, ios::binary);
if (!fin) {
return 2;
}
fin.seekg(0, fin.end);
long long bytes = fin.tellg();
if (bytes % sizeof(Scalar) != 0) {
// The available number of bytes won't fill an even number of Scalar
// values
return 3;
}
long long nscalars = bytes / sizeof(Scalar);
// See http://forum.kde.org/viewtopic.php?f=74&t=107551
MatrixBase<Derived>& data_edit = const_cast<MatrixBase<Derived>&>(data);
data_edit.derived().resize(nscalars, 1);
Scalar* buffer = new Scalar[nscalars]; // Switched to vector per @Casey
fin.seekg(0, fin.beg);
fin.read(reinterpret_cast<char*>(buffer), bytes);
if (!fin) {
// All data not read. fin.gcount() will indicate bytes read.
return 4;
}
for (long long idx = 0; idx < nscalars; ++idx) {
data_edit(idx) = buffer[idx];
}
return 0;
}
In brief,
- the file is opened,
- its size is obtained,
- an array is dynamically allocated to store all the data,
- the file is read into the array,
- the array's contents are copied into the matrix.
This is reasonable and it works (though I'm open to suggestions for improvements), but I think the function has one too many template parameters, and the function call in the caller is just too verbose. I think there should be a way to eliminate the second template parameter, which only serves to tell read_data the number of bytes per scalar (2 in the case of uint16_t), and which I believe should be inferable from the first template parameter.
Questions: Is there a way to eliminate the seemingly redundant second template parameter to read_data?
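The commented-out typedef in read_data above hints at one way to do this: take the scalar type from the matrix type itself. Below is a minimal sketch of that idea without Eigen. ToyMatrix and bytes_per_scalar are hypothetical names used only for illustration; the sketch assumes, as that typedef does, that the matrix type exposes its element type as a nested Scalar typedef.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for an Eigen matrix type. The only property that
// matters here is the nested typedef named Scalar.
template <typename T>
struct ToyMatrix {
    typedef T Scalar;
    std::vector<T> values;
};

// Only the container type is a template parameter. The element type is
// recovered from it, so the caller never has to spell it out.
template <typename Derived>
std::size_t bytes_per_scalar(const Derived&) {
    typedef typename Derived::Scalar Scalar;
    static_assert(std::is_arithmetic<Scalar>::value,
                  "this sketch only handles plain numeric scalars");
    return sizeof(Scalar);
}
```

With the same typedef inside read_data, the call in the caller should shrink to read_data(argv[1], data), since Derived can be deduced from the argument.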
Also, is my approach of passing in a matrix reference only to resize it inside read_data (using the verbose and confusing idiom of creating a modifiable reference to the matrix in order to resize it via derived()) the right way to proceed? I realize this dynamically allocates memory, which is fine, but I don't think it is doing anything wasteful. Is that correct?
(Discussion: Are there other improvements to this code one would like to see? I'm a C or Python numerical coder; in C, I'd just deal with void* arrays and pass an extra function argument telling the function the size of each scalar; with Python I'd just do numpy.fromfile('path/to/file.bin', dtype=numpy.uint16) and be done with it. But I'd like to do it right by Eigen and C++.)
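For comparison, here is a rough sketch of what a numpy.fromfile-like helper might look like using only the standard library. It follows the same steps as read_data (open, get the size, allocate, read) but returns a std::vector and throws instead of returning error codes. read_raw is a hypothetical name, not an Eigen or standard function, and the sketch assumes the file is stored in the machine's native byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <ios>
#include <stdexcept>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical helper: read a whole binary file into a std::vector<T>,
// interpreting the bytes in native byte order (an assumption of this sketch).
template <typename T>
std::vector<T> read_raw(const std::string& fname) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be safe to fill from raw bytes");

    // Open positioned at the end so tellg() reports the file size directly.
    std::ifstream fin(fname, std::ios::binary | std::ios::ate);
    if (!fin) {
        throw std::runtime_error("cannot open " + fname);
    }
    const std::streamoff end = fin.tellg();
    if (end < 0) {
        throw std::runtime_error("cannot determine size of " + fname);
    }
    const std::size_t bytes = static_cast<std::size_t>(end);
    if (bytes % sizeof(T) != 0) {
        // The file would leave a partially filled trailing element.
        throw std::runtime_error("size of " + fname +
                                 " is not a multiple of the element size");
    }

    // The vector owns the storage, so nothing leaks on an early exit.
    std::vector<T> buffer(bytes / sizeof(T));
    fin.seekg(0, std::ios::beg);
    fin.read(reinterpret_cast<char*>(buffer.data()),
             static_cast<std::streamsize>(bytes));
    if (!fin) {
        throw std::runtime_error("short read from " + fname);
    }
    assert(static_cast<std::size_t>(fin.gcount()) == bytes);
    return buffer;
}
```

A call such as read_raw<std::uint16_t>(argv[1]) would then give a vector that could, I believe, be copied into an Eigen matrix through Eigen::Map over buffer.data(), though I have not verified that part here.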
NB. I use matrices instead of vectors because I'll be resizing them into rectangular matrices later.
NB2. In "Fixed Sized Eigen types as parameters", the notion of templating the function on the scalar type is promoted. I am not averse to this approach, but I chose to pass read_data a matrix reference instead of making it return a Matrix object because I wanted integer return values indicating errors, though now I realize I ought to make those exceptions.
NB3. In "c++ check for nested typedef of a template parameter to get its scalar base type", an elaborate set of helper classes is used to, I think, achieve a similar effect for templated classes. I'm curious whether a simpler approach can be used here, for a templated function.
NB4. A simple improvement that I'm aware of is typedeffing Matrix<uint16_t, Dynamic, Dynamic> to reduce verbosity.