0

I'm working on a project for my university studies. My goal is to read double numbers from a large file (2.6 GB) into a double vector.

I am working with the Boost Spirit X3 library together with mmap. I found some code on the net, https://github.com/sehe/bench_float_parsing, which I am using.

Before pushing these double values into the vector, I would like to do some arithmetic operations on them, and this is where I'm stuck. How can I apply arithmetic operations to the double values before pushing them?

    template <typename source_it>
    size_t x3_phrase_parse<data::float_vector, source_it>::parse(source_it f, source_it l, data::float_vector& data) const {
        using namespace x3;
        bool ok = phrase_parse(f, l, *double_ % eol, space, data);
        if (ok)
            std::cout << "parse success\n";
        else
            std::cerr << "parse failed: '" << std::string(f, l) << "'\n";

        if (f != l) std::cerr << "trailing unparsed: '" << std::string(f, l) << "'\n";
        std::cout << "data.size(): " << data.size() << "\n";
        return data.size();
    }
squbr_
  • Why not use [semantic actions](https://www.boost.org/doc/libs/1_70_0/libs/spirit/doc/html/spirit/qi/reference/action.html) to perform the arithmetic operations? – user1681377 Aug 12 '19 at 02:26

3 Answers

1

I am sorry that this does not exactly answer your question, but Boost Spirit is not the appropriate tool here. Spirit is a parser generator (which, as a subset of its work, of course also does lexical analysis), so it sits one level too high in the Chomsky hierarchy of languages. You do not need a parser but regular expressions: std::regex

A double can easily be found with a regular expression. In the code below, I created a simple pattern for a double, and a regex is used to look for it.

So, we will read from an istream (which can be a file, a stringstream, console input or whatever). We will read line by line, until the whole input is consumed.

For each line, we will check whether the input matches the expected pattern, which is one double.

Then we read this double, do some calculations and push the result into the vector.

Please see the following very simple code.

    #include <iostream>
    #include <regex>
    #include <sstream>
    #include <string>
    #include <vector>

    std::istringstream input{R"(0.0
    1.5
    2.0
    3.0
    4.0
    -5.0
    )"};

    using VectorDouble = std::vector<double>;

    // One double per line: optional sign, at least one digit, optional
    // fraction and exponent. The original pattern also matched the empty
    // string, which made std::stod throw on lines without a number.
    const std::regex reDouble{R"(\s*([-+]?(?:[0-9]+\.?[0-9]*|\.[0-9]+)(?:[eE][-+]?[0-9]+)?)\s*)"};

    std::istream& get(std::istream& is, VectorDouble& dd)
    {
        // Reset vector to empty before reading
        dd.clear();

        // Read all data from istream
        std::string line{};
        while (std::getline(is, line)) {
            // Check that the whole line consists of exactly one double
            std::smatch sm;
            if (std::regex_match(line, sm, reDouble)) {
                // Convert found string to double
                double d1{std::stod(sm[1].str())};
                // Do some calculations
                d1 = d1 + 10.0;
                // Push back into vector
                dd.emplace_back(d1);
            }
            else
                std::cerr << "Error found in line: " << line << "\n";
        }
        return is;
    }

    int main()
    {
        // Define vector and fill it
        VectorDouble dd{};
        (void)get(input, dd);

        // Some debug output
        for (const double d : dd) {
            std::cout << d << "\n";
        }
        return 0;
    }
A M
0

Why not use semantic actions to perform the arithmetic operations?

user1681377
0

Consider the following code:

    #include <iostream>
    #include <sstream>
    #include <stdexcept>
    #include <string>
    #include <vector>

    #include <boost/spirit/home/x3.hpp>

    using VectorDouble = std::vector<double>;
    void show( VectorDouble const& dd)
    {
        std::cout<<"vector result=\n";
        for (double const& d : dd) {
            std::cout << d << "\n";
        }
    }

    // The arithmetic applied to each value before it is stored.
    auto arith_ops=[](double&x){ x+=10.0;};

    std::string input_err_yes{R"(0.0
    1.5
    2.0xxx
    not double
    4.0
    -5.0
    )"};

    std::string input_err_not{R"(0.0
    1.5
    2.0
    3.0
    4.0
    -5.0
    )"};

    void stod_error_recov(std::string const&input)
    //Use this for graceful error recovery in case input has syntax errors.
    {
        std::cout<<__func__<<":\n";
        VectorDouble dd;

        std::istringstream is(input);
        std::string line{};
        while (getline(is, line) ) {
            try {
                std::size_t eod;  // index of first char after the double
                double d1(std::stod(line,&eod));
                arith_ops(d1);
                dd.emplace_back(d1);
                auto const eol=line.size();
                if(eod!=eol) {
                   std::cerr << "Warning: trailing chars after double in line: "<< line << "\n";
                }
            }
            catch (const std::invalid_argument&) {
                if(!is.eof())
                  std::cerr << "Error: found in line: " << line << "\n";
            }
        }
        show(dd);
    }

    void stod_error_break(std::string const&input)
    //Use this if input is sure to have correct syntax.
    {
        std::cout<<__func__<<":\n";
        VectorDouble dd;

        char const*d=input.data();
        while(true) {
            try {
                std::size_t eod;
                //Note: std::stod takes a std::string, so each call copies
                //the remaining input into a temporary string.
                double d1(std::stod(d,&eod));
                d+=eod;
                arith_ops(d1);
                dd.emplace_back(d1);
            }
            catch (const std::invalid_argument&) {
                //Either syntax error
                //Or end of input.
                break;
            }
        }
        show(dd);
    }

    void x3_error_break(std::string const&input)
    //boost::spirit::x3 method.
    {
        std::cout<<__func__<<":\n";
        VectorDouble dd;

        auto f=input.begin();
        auto l=input.end();
        using namespace boost::spirit::x3;
        //Semantic action: modify the parsed double in place before it is stored.
        auto arith_action=[](auto&ctx)
          { arith_ops(_attr(ctx));
          };
        phrase_parse(f, l, double_[arith_action] % eol, blank, dd);
        show(dd);
    }

    int main()
    {
        //stod_error_recov(input_err_yes);
        //stod_error_break(input_err_not);
        x3_error_break(input_err_not);
        return 0;
    }

The stod_* functions, unlike the code in Armin's answer, don't need regex because std::stod does the parsing; and because they don't use regex, they probably run a bit faster.

Two stod_* functions are shown; the in-source comments indicate when each should be used.

For completeness, a third function using boost::spirit::x3 is shown. IMHO, its readability is better than the others; however, it would probably take more time to compile.
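
As a variation on the stod_* idea, here is a rough sketch that uses `std::strtod` directly on the character buffer, which avoids building a `std::string` for every number. The helper name `parse_transformed` is hypothetical. It assumes the buffer is null-terminated, which holds for a `std::string` but not necessarily for a raw mmap'd region, and it stops at the first token that is not a number.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: reads whitespace-separated doubles from `input`,
// applies `op` to each one, and collects the results.
template <typename Op>
std::vector<double> parse_transformed(std::string const& input, Op op)
{
    std::vector<double> out;

    char const* cur = input.c_str();           // null-terminated buffer
    char const* const end = cur + input.size();

    while (cur < end) {
        char* next = nullptr;
        // strtod skips leading whitespace, including newlines.
        double const value = std::strtod(cur, &next);
        if (next == cur)
            break;  // nothing converted: end of input or a syntax error
        out.push_back(op(value));
        cur = next;
    }
    return out;
}
```

For example, `parse_transformed(input_err_not, [](double x) { return x + 10.0; })` would apply the same arithmetic as arith_ops. Whether this is actually faster on a 2.6 GB file would have to be measured.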

user1681377