Convert string to integer type T, checking for overflow

Question

The goal is a function which, given a string containing a number, and an integral type T, returns success + converted value if the value fits in the type without overflow, or failure otherwise.

Using std::istringstream to read a number from a string works for some cases:

template<typename T>
std::pair<bool, T> decode(std::string s)
{
    T value;
    std::istringstream iss(s);
    iss >> std::dec >> value;
    return std::pair<bool, T>(!iss.fail(), value);
}

template<typename T>
void testDecode(std::string s)
{
    std::pair<bool, T> result = decode<T>(s);
    if (result.first)
        std::cout << +result.second;
    else
        std::cout << "ERROR";
    std::cout << std::endl;
}

int main()
{
    testDecode<int32_t>("12"); // 12
    testDecode<int16_t>("1000000"); // ERROR
    testDecode<int16_t>("65535"); // ERROR
    return 0;
}

However, it fails for 8-bit types (as those are treated as chars):

    testDecode<uint8_t>("12"); // 49 !

Negative numbers are also incorrectly accepted and parsed into unsigned types:

    testDecode<uint16_t>("-42"); // 65494 !

Such functionality is provided by e.g. std.conv.to in D and str::parse in Rust. What would be the C++ equivalent?

If you can use C++17 you may want to consider using std::optional instead of std::pair :) (https://en.cppreference.com/w/cpp/utility/optional) — Juarrow, Jul 29 '18 at 12:46
Thanks for the suggestion. This project is actually stuck in C++03 — Vladimir Panteleev, Jul 29 '18 at 12:48
I would try to avoid doing this sort of thing yourself, consider [boost::conversion::try_lexical_convert](https://www.boost.org/doc/libs/1_65_1/doc/html/boost_lexical_cast/synopsis.html#boost_lexical_cast.synopsis.try_lexical_convert) — Thomas, Jul 29 '18 at 18:33

jan.sende · Accepted Answer · 2018-07-29T11:25:01.940

Your solution seems fine. However, if you want better compile time optimization, you should consider doing template specializations. In the presented code, you do branching dependent on types. This however might be compiled into the code, instead of being resolved at compile time (where it is already possible). Furthermore, if you want to add additional checks depending on the type, the function quickly gets confusing.

I wrote my own version of your conversion code, which in addition to yours, checks if scientific notation was given for integer types:

#include <type_traits>
#include <utility>
#include <string>
#include <limits>
#include <algorithm>

template <typename T>
auto to_T(const std::string &s) -> std::enable_if_t<std::is_floating_point<T>::value, std::pair<bool, T>>
{
    return std::pair<bool, T>{true, T(std::stold(s))}; //read the string into the biggest floating point possible, and do a narrowing conversion
}
template <typename T>
auto to_T(const std::string &s) -> std::enable_if_t<!std::is_floating_point<T>::value && std::is_signed<T>::value, std::pair<bool, T>>
{
    return ((long long)(std::numeric_limits<T>::min()) <= std::stoll(s) && //does the integer in the string fit into the types data range?
            std::stoll(s) <= (long long)(std::numeric_limits<T>::max()))
               ? std::pair<bool, T>{true, T(std::stoll(s))}
               : std::pair<bool, T>{false, 0}; //if yes, read the string into the biggest possible integer, and do a narrowing conversion
}
template <typename T>
auto to_T(const std::string &s) -> std::enable_if_t<!std::is_floating_point<T>::value && std::is_unsigned<T>::value, std::pair<bool, T>>
{
    return ((unsigned long long)(std::numeric_limits<T>::min()) <= std::stoull(s) && //does the integer in the string fit into the types data range?
            std::stoull(s) <= (unsigned long long)(std::numeric_limits<T>::max()))
               ? std::pair<bool, T>{true, T(std::stoull(s))}
               : std::pair<bool, T>{false, 0}; //if yes, read the string into the biggest possible integer, and do a narrowing conversion
}
template <typename T>
auto decode(const std::string &s) -> std::enable_if_t<std::is_floating_point<T>::value, std::pair<bool, T>>
{
    return s.empty() ? //is the string empty?
               std::pair<bool, T>{false, 0}
                     : to_T<T>(s); //if not, convert the string to a floating point number
}
template <typename T>
auto decode(const std::string &s) -> std::enable_if_t<!std::is_floating_point<T>::value && std::is_signed<T>::value, std::pair<bool, T>>
{
    return (s.empty() ||                                                 //is the string empty?
            std::find(std::begin(s), std::end(s), '.') != std::end(s) || //or does it not fit the integer format?
            std::find(std::begin(s), std::end(s), ',') != std::end(s) ||
            std::find(std::begin(s), std::end(s), 'e') != std::end(s) ||
            std::find(std::begin(s), std::end(s), 'E') != std::end(s))
               ? std::pair<bool, T>{false, 0}
               : to_T<T>(s); //if not, convert the string to a signed integer value
}
template <typename T>
auto decode(const std::string &s) -> std::enable_if_t<!std::is_floating_point<T>::value && std::is_unsigned<T>::value, std::pair<bool, T>>
{
    return (s.empty() ||                                                 //is the string empty?
            std::find(std::begin(s), std::end(s), '.') != std::end(s) || //or does it not fit the integer format?
            std::find(std::begin(s), std::end(s), ',') != std::end(s) ||
            std::find(std::begin(s), std::end(s), 'e') != std::end(s) ||
            std::find(std::begin(s), std::end(s), 'E') != std::end(s) ||
            std::find(std::begin(s), std::end(s), '-') != std::end(s))
               ? //or does it have a sign?
               std::pair<bool, T>{false, 0}
               : to_T<T>(s); //if not, convert the string to an unsigned integer value
}

This still needs to be somewhat ported between platforms, because std::stold, std::stoll or std::stoull might not be available. But besides that, it should be independent of the platforms type implementation.

Edit:

I forgot a case where decode should not read the numbers, but returned 0 instead. This is now fixed.

score 1 · Answer 2 · answered Jul 29 '18 at 09:24

1

Here is the straight-forward approach of using istringstream plus working around its caveats:

template<typename T>
std::pair<bool, T> decode(std::string s)
{
    typedef std::pair<bool, T> Result;

    if (s.empty())
        return Result(false, 0);

    if (!std::numeric_limits<T>::is_signed && s[0] == '-')
        return Result(false, 0);

    if (sizeof(T) == 1)
    {
        // Special case for char
        std::pair<bool, short> result = decode<short>(s);
        if (!result.first)
            return Result(false, 0);
        if (!inrange(result.second, std::numeric_limits<T>::min(), std::numeric_limits<T>::max()))
            return Result(false, 0);
        return Result(true, (T)result.second);
    }
    else
    {
        T value;
        std::istringstream iss(s);
        iss >> std::dec >> value;
        return Result(!iss.fail(), value);
    }
}

answered Jul 29 '18 at 09:24

Vladimir Panteleev

24,651
6
70
114

`sizeof(T) == 1` might not work in all possible cases. Use `type_traits` instead! `if (std::is_same::value)` Plus, do you really want to fail on negative input to unsigned types? Just taking the absolute value might be more useful... Furthermore, you are trying to use the concept of ``. I would recommend using this instead of the `pair`. (If c++17 is a possibility for you.) – jan.sende Jul 29 '18 at 09:34
`std::is_same` is not enough because `istringstream` seems to parse characters for both `char` and `unsigned char`. Maybe `sizeof(T) == sizeof(char)`? And, yes, failing on negative input is the expected behavior. If a user inputs a negative value in a configuration file, input, etc... where it wasn't expected, it should not be silently interpreted as a large positive number. – Vladimir Panteleev Jul 29 '18 at 09:38
Actually, `std::is_same` seems to be an overfit and evaluate to `false` even for `int8_t` (i.e. `std::is_same::value` is false even though `istringstream` does parse `int8_t` as a character). – Vladimir Panteleev Jul 29 '18 at 09:42
1

Your code is subtly buggy; Recall C++ only guarantees **`sizeof(char) <= sizeof(short)`**; You will have infinite recursion should control-flow reach your third `if(...` and you are on a platform with the rare case of **`sizeof(char) == sizeof(short)`** – WhiZTiM Jul 29 '18 at 09:57
`char` doesn't figure at all in this implementation. The infinite recursion will happen if `sizeof(short) == 1`, is that what you meant? – Vladimir Panteleev Jul 29 '18 at 10:06
@VladimirPanteleev, yes; – WhiZTiM Jul 29 '18 at 10:28
`std::is_same` test if your types are the same. Therefore `std::is_same::value` being `false` is completely right. It seems like you want a function `std::is_similar`, which unfortunately does not exist. However, you could just do `if (std::is_same::value || std::is_same::value)` instead. – jan.sende Jul 29 '18 at 11:07

Convert string to integer type T, checking for overflow

2 Answers2