3

I have a file such as:

 1.0000000e+01   8.0123000e+01   1.0000000e+01   1.0000000e+01   1.0000000e+01
-1.0000000e+01   1.0000000e+01   1.0001110e+01   1.0000000e+01   1.0000000e+01
 1.0000000e+01   1.0000000e+01  -5.0000000e+01   1.0000000e+01   1.0000000e+01
 //... (repeated scientific numbers)
 1 2 3 4
 2 4 5 60
 100 3 5 63
 //... (repeated integer numbers)

I would like to read these numbers from a file in C++, but only the ones in scientific format, so the code needs to stop when the number format changes. One thing that helps: the floating-point numbers come in 5 columns, whereas the integers come in 4 columns.

So, what's the best way to do that in C++?

Xeo
Biga

5 Answers

2

Ignoring EOL (continues reading integers):

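// Requires <array>, <vector> and <istream> (C++11); istr is the input stream.
typedef std::array<double, 5> Datum;  // a raw double[5] cannot be stored in a vector
std::vector<Datum> data;
while (true) {
  Datum t;
  istr >> t[0] >> t[1] >> t[2] >> t[3] >> t[4];
  if (!istr) break;  // stop at EOF or at a token that is not a number
  data.push_back(t);
}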

Using column count and EOL:

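// Requires <sstream> and <string>; Datum and data are the same as above.
std::string line;
while (std::getline(istr, line)) {
  Datum t;
  std::istringstream temp(line);  // parse this one line only
  temp >> t[0] >> t[1] >> t[2] >> t[3] >> t[4];
  if (temp.fail()) break;  // fewer than 5 numbers: the 4-column integer block has started
  data.push_back(t);
}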
Biga
Basilevs
  • Lines `typedef double d[5] Datum; Datum d; ` gives me compilation error, so I can't test that. Anyway, I get the idea and it seems perfect. – Biga Mar 31 '11 at 14:37
  • The following command `temp >> t[0] >> t[1] >> t[2] >> t[3] >> t[4];` does not read the floats correctly. Should it be formatted? – Biga Mar 31 '11 at 15:00
  • Not the floats, the number in scientific format! I guess it's not working with the scientific format. – Biga Mar 31 '11 at 15:19
  • Well that's pretty strange. Have you tried parsing your input with "e" being replaced with "E"? – Basilevs Apr 01 '11 at 07:19
  • In the second option, I didn't get how `temp` and `line` are related to each other. – Biga Apr 01 '11 at 11:39
  • The error was apparently that the line `istringstream temp;` should be `istringstream temp(line);`. So there was actually no problem with the number format. – Biga Apr 01 '11 at 12:06
0

You could use strstr to search for "e+" in each line.

http://www.cplusplus.com/reference/clibrary/cstring/strstr/

If you want something fancier, you could use a regular expression library (such as boost::regex), which would also help you extract the numbers from each line.
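The strstr idea can be sketched roughly as follows. This is only an illustration: `looks_scientific` and `read_scientific_block` are hypothetical names, and checking for "e-" in addition to "e+" is an assumption added here so that negative exponents are not missed.

```cpp
#include <cstring>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: true if the line contains an exponent marker.
// The answer only mentions "e+"; "e-" is checked too as an assumption.
bool looks_scientific(const std::string& line) {
    return std::strstr(line.c_str(), "e+") != nullptr ||
           std::strstr(line.c_str(), "e-") != nullptr;
}

// Collect every value from the leading block of scientific-format lines
// and stop at the first line that has no exponent marker.
std::vector<double> read_scientific_block(std::istream& in) {
    std::vector<double> values;
    std::string line;
    while (std::getline(in, line) && looks_scientific(line)) {
        std::istringstream fields(line);
        double x;
        while (fields >> x) values.push_back(x);
    }
    return values;
}
```

Calling `read_scientific_block` on an opened `std::ifstream` would return the floats as one flat vector; grouping them into rows of 5 is left out to keep the sketch short.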

persiflage
0

I am afraid there is no direct way to do this. That is, you can't stream in (>>) a floating-point number in one specific format only. If you need that functionality, you must read the lines as strings and parse them manually. Of course, this doesn't mean you have to build each number digit by digit: once you've established which part of the input file holds the floats, use stringstreams to read them.
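One possible reading of this advice, as a minimal sketch: read each line as a string, decide per token whether it is in scientific format, and only then convert it with a stringstream. The names `is_scientific_token` and `read_float_rows` are hypothetical, and treating "contains e or E" as the format test is an assumption made for illustration.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical check: a token counts as "scientific" if it has an exponent marker.
bool is_scientific_token(const std::string& tok) {
    return tok.find_first_of("eE") != std::string::npos;
}

// Read rows until a line contains a token that is not in scientific format.
std::vector<std::vector<double>> read_float_rows(std::istream& in) {
    std::vector<std::vector<double>> rows;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream tokens(line);
        std::vector<double> row;
        std::string tok;
        bool ok = true;
        while (tokens >> tok) {
            if (!is_scientific_token(tok)) { ok = false; break; }
            std::istringstream conv(tok);   // convert the token we just vetted
            double x;
            if (!(conv >> x)) { ok = false; break; }
            row.push_back(x);
        }
        if (!ok || row.empty()) break;      // boundary reached (or blank line)
        rows.push_back(row);
    }
    return rows;
}
```

The format test is deliberately loose; a stricter check could be substituted without changing the surrounding loop.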

Armen Tsirunyan
0

You can use a regex to match only the ones you care about: -?\d+\.\d+e[+-]\d+

I'm sure this is not the best way, but if performance is not a big issue it's an easy way out.

Warning: auto-generated code from RegexBuddy

// Requires <pcre.h> and <string.h>; subject is the NUL-terminated input text.
pcre *myregexp;
const char *error;
const char *result;
int erroroffset;
int offsetcount;
int offsets[(0+1)*3]; // (max_capturing_groups+1)*3
myregexp = pcre_compile("-?\\d+\\.\\d+e[+-]\\d+", 0, &error, &erroroffset, NULL);
if (myregexp != NULL) {
    offsetcount = pcre_exec(myregexp, NULL, subject, (int)strlen(subject), 0, 0, offsets, (0+1)*3);
    while (offsetcount > 0) {
        // match offset = offsets[0];
        // match length = offsets[1] - offsets[0];
        if (pcre_get_substring(subject, offsets, offsetcount, 0, &result) >= 0) {
            // Do something with match we just stored into result
            pcre_free_substring(result); // release the copy made by pcre_get_substring
        }
        // Continue searching from the end of the previous match
        offsetcount = pcre_exec(myregexp, NULL, subject, (int)strlen(subject), offsets[1], 0, offsets, (0+1)*3);
    }
    pcre_free(myregexp);
} else {
    // Syntax error in the regular expression at erroroffset
}
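
For comparison, here is a rough sketch of the same pattern using the C++11 standard library `std::regex` instead of PCRE. This is a swapped-in library, not what the generated code above uses, and `extract_scientific` is a hypothetical name. Note that it scans the whole text and simply skips anything that does not match, rather than stopping at the first integer line.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: return every number in `text` that matches the
// scientific-notation pattern from the answer (-?\d+\.\d+e[+-]\d+).
std::vector<double> extract_scientific(const std::string& text) {
    static const std::regex pattern(R"(-?\d+\.\d+e[+-]\d+)");
    std::vector<double> values;
    for (std::sregex_iterator it(text.begin(), text.end(), pattern), end;
         it != end; ++it) {
        values.push_back(std::stod(it->str()));  // convert the matched text
    }
    return values;
}
```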
Diadistis
  • Splitting lines in fields is by far more effective and simple. Moreover testing for digits with regexp is an overkill, there are just 10 of them after all (and you can find any of them with string::find_first_of()) – Basilevs Mar 31 '11 at 12:22
  • How would you separate in fields? – Biga Mar 31 '11 at 19:57
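
One way to split a line into fields, sketched under the assumption that fields are separated by whitespace as in the sample file: extract strings from an `istringstream` and count them. `split_fields` and `is_float_row` are hypothetical names, and the 5-column rule comes from the question.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Split a line on whitespace into its fields.
std::vector<std::string> split_fields(const std::string& line) {
    std::istringstream in(line);
    std::vector<std::string> fields;
    std::string f;
    while (in >> f) fields.push_back(f);
    return fields;
}

// Hypothetical rule taken from the question: float rows have exactly 5 columns.
bool is_float_row(const std::string& line) {
    return split_fields(line).size() == 5;
}
```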
-1

Regex is the best way to do that here. Alternatively, you can try fscanf().
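
A hedged sketch of the scanf-family route: because the whitespace skipping in `fscanf` runs across newlines, this variant applies `sscanf` to one line at a time (for example a line obtained with `fgets` or `std::getline`) rather than calling `fscanf` directly. `parse_five` is a hypothetical name, and the check relies on the 5-column layout from the question, so a line of five plain integers would also be accepted.

```cpp
#include <cstdio>

// Returns true only if the line holds exactly five numbers and nothing else.
// The trailing %c detects a sixth token: if it converts, the count becomes 6.
bool parse_five(const char* line, double out[5]) {
    char extra;
    int n = std::sscanf(line, "%lf %lf %lf %lf %lf %c",
                        &out[0], &out[1], &out[2], &out[3], &out[4], &extra);
    return n == 5;
}
```

Reading would then stop at the first line for which `parse_five` returns false.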

Vijay