Creating a custom int that would enforce always being in a specified range; how to get past integer overflows?

Question

As in the title. As an exercise, I wanted to create an int that would enforce constraints on its value and would disallow being set to values outside its specified range.

Here is how I tried to approach this:

#include <cassert>
#include <cstdint>
#include <iostream>
using namespace std;

int main();

template<typename valtype, valtype minval, valtype maxval>
class ConstrainedValue
{
  valtype val;

  static bool checkval (valtype val)
  {
    return minval <= val && val <= maxval;
  }

public:
  ConstrainedValue() : val{minval} // so that we're able to write ConstrainedValue i;
  {
    assert(checkval(val));
  }

  ConstrainedValue(valtype val) : val{val}
  {
    assert(checkval(val));
  }

  ConstrainedValue &operator = (valtype val)
  {
    assert(checkval(val));
    this->val = val;
    return *this;
  }

  operator const valtype&() // Not needed here but can be; safe since it returns a const reference
  {
    return val;
  }

  friend ostream &operator << (ostream& out, const ConstrainedValue& v) // Needed because otherwise if valtype is char the output could be bad
  {
    out << +v.val;
    return out;
  }

  friend istream &operator >> (istream& in, ConstrainedValue& v) // this is horrible ugly; I'd love to know how to do it better
  {
    valtype hlp;
    auto hlp2 = +hlp;
    in >> hlp2;
    assert(checkval(hlp2));
    v.val = hlp2;
    return in;
  }
};

int main()
{
  typedef ConstrainedValue<uint_least8_t, 0, 100> ConstrainedInt;
  ConstrainedInt i;
  cin >> i;
  cout << i;
  return 0;
}

The problem is that... this is not working. If this custom integer is given values that overflow its underlying type it just sets erroneous values.

For example, let's assume that we have range constraints of [0; 100] and the underlying type is uint_least8_t, as in the example above. uint_least8_t evaluates to char or unsigned char, I'm not sure which. Let's try feeding this program with different values:

10
10

Nice. Works.

101
test: test.cpp:52: std::istream& operator>>(std::istream&, ConstrainedValue<unsigned int, 0u, 100u>&): Assertion `checkval(hlp2)' failed.
Aborted

Haha! Exactly what I wanted.

But:

257
1

Yeah. Overflow, truncate, wrong value, failed to check range correctly.

How to fix this problem?

Serge Ballesta · Answer 1 · 2015-07-17T06:45:32.280

I think that you have a specification problem, that unfortunately implementation did not automatically solve.

As soon as you write : ConstrainedValue(valtype val) : val{val} you lose any hope to be able to detect overflow, because the conversion to valtype happens before you code is called. Because if uint_least8_t is translated to unsigned char which seems to happen in your (and my) implementation, (uint_least8_t) 257 is 2.

To be able to detect overflow, you must use greater integral types in your constructor and operator = methods.

IMHO, you should use templated constructor, operator = and checkval :

template<typename valtype, valtype minval, valtype maxval>
class ConstrainedValue
{
  valtype val;

    template<typename T> static bool lt(valtype v, T other) {
        if (v <= 0) {
            if (other >= 0) return true;
            else return static_cast<long>(v) <= static_cast<long>(other);
        }
        else {
            if (other <= 0) return false;
            else  return static_cast<unsigned long>(v)
                <= static_cast<unsigned long>(other);
        }
    }


    template <typename T> static bool checkval (T val)
    {
        return lt(minval, val) && (! lt(maxval, val));
    }

public:
  ConstrainedValue() : val{minval} // so that we're able to write ConstrainedValue i;
  {
    assert(checkval(val));
  }

  template<typename T> ConstrainedValue(T val) : val{val}
  {
    assert(checkval(val));
  }

  template<typename T> ConstrainedValue &operator = (T val)
  {
    assert(checkval(val));
    this->val = val;
    return *this;
  }

  operator const valtype&() // Not needed here but can be; safe since it returns a const reference
  {
    return val;
  }

That way, the compiler will automatically choose the proper type to avoid early overflow : you use original type in checkval, and use the best of long long and unsigned long long for the comparisons, with care for signed/unsigned comparisons (no compilation warning) !

In fact, lt could be written more simply if you accept a possible (harmless) signed/unsigned mismatch warning :

    template<typename T> static bool lt(valtype v, T other) {
        if (v <= 0) && (other >= 0) return true;
        else if (v >= 0) && (other <= 0) return false;
        else  return v <= other;
        }
    }

The warning could arise if one of valtype and T is signed while the other is unsigned. It is harmless because the cases where v and other are of opposite signs is explicetely processed, and if both are negative, they must be signed. So it can only happen when one is signed and the other unsigned but both are positive. In that case, clause 5 (5 Expressions from Standard for Programming Language C++, § 10) guarantees that the biggest type will be used, with unsigned precedence, meaning that it will be correct for positive values. And it avoids to force a possibly useless conversion to unsigned long long.

But there is still a case that I cannot handle properly : the injector. Until you decode it, you cannot be sure whether the input value should be cast to a long long or to an unsigned long long (assuming they are the biggest possible integral types). The cleanest way I can imagine would be to get the value as a string and decode it by hand. As there are many corner cases, I would advise you to :

first get it as a string
if first char is a minus - decode it to a long long
else decode it to an unsigned long long

It will still give weird results for really big numbers, but it is the best I can :

friend std::istream &operator >> (std::istream& in, ConstrainedValue& v)
{
    std::string hlp;
    in >> hlp;
    std::stringstream str(hlp);
    if (hlp[0] == '-') {
        long long hlp2;
        str >> hlp2;
        assert(checkval(hlp2));
        v.val = static_cast<valtype>(hlp2);
    }
    else {
        unsigned long long hlp2;
        str >> hlp2;
        assert(checkval(hlp2));
        v.val = static_cast<valtype>(hlp2);
    }
    return in;
}

In `operator>>` you know the underlying range. If `minval>=0`, which is always the case for unsigned `valtype`, you can give an out-of-range error immediately when the input starts with `-`. After that check, you can extract directly to `valtype hlp2` and don't need the branch for `unsigned long long` versus `long long`. Obviously you should also set ` failbit` on the stream if you extracted an out-of-range value. — MSalters, Jul 16 '15 at 13:11
@MSalters It is a little more tricky : `-0` is still a valid unsigned value since it is ... `0` ! But there are certainly tons of possible optimisations left :-) — Serge Ballesta, Jul 16 '15 at 13:22
That's fun. I have an impression the problem exists only when reading to a char. I was playing with `unsigned int i; cin >> i; cout << i;` and I found out that for any input outside the range of [0; 4294967295] (including negative values) `i` just ends up with the value of `4294967295`. And it even is specified in C++11, see http://www.cplusplus.com/reference/locale/num_get/get/ Whew... I believe the problem just deserves another question: how to read into a char properly? Because seemingly my horrible hack is not working. — , Jul 16 '15 at 15:12
I have a question though. You use here `static_cast(minval)` and `static_cast(maxval)`. But what if, for example, `T` is an unsigned integer and `minval` equals `-10`? — , Jul 16 '15 at 15:23
I mean, perhaps I lack knowledge in conversions, but wouldn't it be better just to write `minval <= val && val <= maxval` without any `static_cast`s, so that no unnecessary possibly unsafe conversions happen? — , Jul 16 '15 at 15:30
@gaazkam : I've updated comparison function. It should now work in all use cases whatever of T and valtype is unsigned or not and whatever one is greater than the other. — Serge Ballesta, Jul 16 '15 at 18:09
Could you very kindly explain to me why `static_cast` is needed at all? And what's wrong with the simple `minval <= val && val <= maxval`? I'd appreciate to understand more :) — , Jul 16 '15 at 19:14
@gaazkam : I wanted to avoid a possible signed/unsigned mismatch warning in `lt`, and it took me some time to reach the conclusion that the warning is harmless here. I've updated my answer with detailed explaination and reference to standard — Serge Ballesta, Jul 17 '15 at 06:47
@gaazkam:`minval` is a template argument of the specified type. Thus, you **cannot** have a `minval` be both `unsigned long` and `-10`. Trying to do so would give you a `minval=ULONG_MAX-9` — MSalters, Jul 17 '15 at 07:11
@SergeBallesta Very sorry for being pesky. But again... Why is the simple `static bool checkval (T val) {return minval <= val && val <= maxval;}` incorrect? I understand that I might have oversimplified the solution and overlooked something important, but I'd like to know exactly what :) — , Jul 19 '15 at 01:42
@gaazkam : because signed/unsigned mismatch warning means that promotions can lead to unexpected results. Simple example : `char c = -4; unsigned l =2; cout << (c < l);` display ...`0` ! Because `-4` has been promoted to an unsigned long and as such is much bigger than `2` (4294967292 or 0xfffffffc on my 32 bit system) — Serge Ballesta, Jul 19 '15 at 09:23
Oh my, I didn't realise that even plain comparison can be an unsafe operation... I have feeling that writing a safe replacement for int's is going to be much tougher than I thought... Anyway, many thanks. — , Jul 20 '15 at 21:07

score 0 · Answer 2 · answered Jul 16 '15 at 11:20

0

I don't think it's possible to detect overflow on the way in to your constructor, because it is carried out in order to fulfil your constructor argument, so once it reaches your constructor body it has already overflowed.

A possible workaround would be to accept large types in your interface, then carry out the check. For example, you could take long ints in your interface, then store them internally as valtype. Since you'll be carrying out bounds checking anyway, it should be fairly safe.

answered Jul 16 '15 at 11:20

TartanLlama

63,752
13
157
193

1

I was thinking about this workaround, but it has several problems: (a) `uint_least64_t` and `int_least64_t` have non-overlapping ranges, so it would fail for 64bit ints; not to mention problems with custom integer classes, like big int; (b) it will be non-portable, since implementations may define other, larger int types; (c) it might create unnecessary overhaul in case performance is important – Jul 16 '15 at 11:34

iammilind · Answer 3 · 2015-07-16T12:53:21.043

0

Use the biggest version of integer possible for comparison. Let's assume it's intmax_t. Change your code as below:

template<typename valtype, intmax_t minval, intmax_t maxval>
class ConstrainedValue
{
  valtype val;

  // In special case of `uintmax_t` the comparison should happen with `uintmax_t` only, in other cases it will be `intmax_t`
  using Compare = typename std::conditional<std::is_same<valtype, uintmax_t>::value, uintmax_t, intmax_t>::type;
  static bool checkval (valtype val)
  { // this will cover all the scenarios of smaller integer values 
    return Compare(minval) <= Compare(val) && Compare(val) <= Compare(maxval);
  }
  ...

This should resolve the int size problem. I also see other problems with other part of the code, which deserves a new question.

edited Jul 16 '15 at 12:53

answered Jul 16 '15 at 11:45

iammilind

68,093
33
169
336

1

It is better to take **intmax_t** as this is the maximum width integer type by definition. And there is another unsolved issue with **signed/unsignde** values. That could make it a litte bit tricky. – Tunichtgut Jul 16 '15 at 12:31
@Tunichtgut, nice suggestion, edited. Though typically it is typecasted to `int64_t`, for further version of compilers it would be the way to go. However, I don't see an issue with `signed/unsigned` values. That will come only with `uintmax_t` and that is resolved with `Compare` type selector. Have you checked it? – iammilind Jul 16 '15 at 12:52
Why not `template`? – Jul 16 '15 at 14:56
@gaazkam, even though with your approach it will work due to `Compare` type, it will send a wrong notion to reader. e.g. `-10,-1`. – iammilind Jul 16 '15 at 16:24
With `intmax_t maxval` I might prevent myself from setting `maxval` for example to `9 223 372 036 854 775 808` – Jul 16 '15 at 16:27
@gazkaam, you can set any value. Ultimately it will be converted to unsigned if require. Just check yourself. – iammilind Jul 17 '15 at 04:42

score 0 · Answer 4 · 2015-07-20T21:09:09.880

I have composed a solution from Serge Ballesta's answer and from my research in how operator >> works. It looks like this:

#include <cassert>
#include <cstdint>
#include <iostream>
using namespace std;

int main();

template<typename valtype, valtype minval, valtype maxval>
class ConstrainedValue
{
  valtype val;

  template<typename T> static bool lt(valtype v, T other) {
    if (v <= 0) {
        if (other >= 0) return true;
        else return static_cast<long>(v) <= static_cast<long>(other);
    }
    else {
        if (other <= 0) return false;
        else  return static_cast<unsigned long>(v)
            <= static_cast<unsigned long>(other);
    }
}


template <typename T> static bool checkval (T val)
{
    return lt(minval, val) && (! lt(maxval, val));
}

public:
  ConstrainedValue() : val{minval}
  {
    assert(checkval(val));
  }

  template <typename T>
  ConstrainedValue(T val) : val{val}
  {
    assert(checkval(val));
  }

  template <typename T>
  ConstrainedValue &operator = (T val)
  {
    assert(checkval(val));
    this->val = val;
    return *this;
  }

  operator const valtype&()
  {
    return val;
  }

  friend ostream &operator << (ostream& out, const ConstrainedValue& v)
  {
    out << +v.val;
    return out;
  }

  friend istream &operator >> (istream& in, ConstrainedValue& v)
  {
    auto hlp = +v.val; // I think it's safe to assume that hlp will have at least as much precision as v.val?
    in >> hlp;
    assert(in.good()); // In case of input overflow this fails.
    assert(checkval(hlp));
    v.val = hlp;
    return in;
  }
};

int main()
{
  typedef ConstrainedValue<uint_least8_t, 0, 100> ConstrainedInt;
  ConstrainedInt i;
  cin >> i;
  cout << i;
  return 0;
}

I think it covers both problems: passing overflowing values to the constructor or operator = and overflowing input. In the latter case, according to http://www.cplusplus.com/reference/locale/num_get/get/ , input will write numeric_limits::max() or numeric_limits::lowest() to v.val, which should cover most scenarios; but in case maxval equals numeric_limits::max() or minval equals numeric_limits::lowest() we can check in.good() which should necessarily yield false in such situations. Of course, there's always the problem that v.val might be a char type, in which case operator >> will actually write to auto hlp = +v.val which will be of a larger type, which may prevent in.good() from detecting overflows. However, such cases will be handled by Serge's improved checkval() function.

This should hopefully work, assuming that auto hlp = +v.val will necessarily be of an at least as large type as v.val. If the standard says otherwise, or if I have overlooked some possible scenarios, please correct me.

Creating a custom int that would enforce always being in a specified range; how to get past integer overflows?

4 Answers4