Ignore 'E' when reading double with sscanf

Question

I have input such as "(50.1003781N, 14.3925125E)". These are a latitude and a longitude.

I want to parse this with

sscanf(string,"(%lf%c, %lf%c)",&a,&b,&c,&d);

but when %lf sees the E after the number, it consumes it, treating the number as being in exponential form. Is there a way to disable this?

One option would be to break your input string down into substrings and scan from there. — dwcanillas, Apr 10 '15 at 14:26
You could replace the `E` with a place holder value and then change `b` or `d` to `E` if it has the placeholder. — NathanOliver, Apr 10 '15 at 14:26
Can you move so that all your longitudes are W instead of E? — Jonathan Leffler, Apr 10 '15 at 14:32
Not sure if I understand your question. I will be storing these coordinates after I read them: if it's W, I store it as a negative double; if it's E, I'll store it as a positive number. — lllook, Apr 10 '15 at 14:37
If you're not sure about my question, rest easy. It is mostly a joke. You probably can't afford to pretend that longitudes east do not exist. — Jonathan Leffler, Apr 10 '15 at 14:52
Maybe duplicate of http://stackoverflow.com/questions/29381290/c-extracting-double-from-stream — borisbn, Apr 10 '15 at 15:23
@borisbn: related, but that is a problem in C++ using the C++ I/O operators. This is primarily in C, though I see it does have the C++ tag. — Jonathan Leffler, Apr 10 '15 at 15:51
@JonathanLeffler I agree with you, but one of the solutions in that question is purely `C` and the OP can use it — borisbn, Apr 10 '15 at 16:28
Suggest storing lat/lon values the way a GPS would output them, i.e. separate each field with a comma, so this input: (50.1003781N, 14.3925125E) would actually be stored as: (50.1003781,N, 14.3925125,E) — user3629249, Apr 11 '15 at 11:29
In April, I commented here on [SO 29381290](http://stackoverflow.com/questions/29381290/) noting that there were problems with the C++ code. Since making that comment (which I've now deleted), the code has been comprehensively fixed. But it remains C++ code. My answer here is C code; it can readily be upgraded to C++ code. (The primary problem is the `goto error;` statements skipping over `int c = toupper((unsigned char)*end++);` — and the secondary problems are the use of C-style casts such as `(unsigned char)*end++`.) — Jonathan Leffler, Aug 12 '15 at 05:50

Jonathan Leffler · Accepted Answer · 2015-04-11T23:04:35.203

I think you'll need to do manual parsing, probably using strtod(). This shows that strtod() behaves sanely when it comes up against the trailing E (at least on Mac OS X 10.10.3 with GCC 4.9.1 — but likely everywhere).

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    const char latlong[] = "(50.1003781N, 14.3925125E)";
    char *eptr;
    double d;
    errno = 0;      // Necessary in general, but probably not necessary at this point
    d = strtod(&latlong[14], &eptr);
    if (eptr != &latlong[14])
        printf("PASS: %10.7f (%s)\n", d, eptr);
    else
        printf("FAIL: %10.7f (%s) - %d: %s\n", d, eptr, errno, strerror(errno));

    return 0;
}

Compilation and run:

$ gcc -O3 -g -std=c11 -Wall -Wextra -Werror latlong.c -o latlong
$ ./latlong
PASS: 14.3925125 (E))
$

Basically, you'll skip white space, check for an (, strtod() a number, check for N or S or lower case versions, comma, strtod() a number, check for W or E, check for ) maybe allowing white space before it.

Upgraded code, with moderately general strtolatlon() function based on strtod() et al. The 'const cast' is necessary in the functions such as strtod() which take a const char * input and return a pointer into that string via a char **eptr variable.

#include <ctype.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CONST_CAST(type, value) ((type)(value))

extern int strtolatlon(const char *str, double *lat, double *lon, char **eptr);

int strtolatlon(const char *str, double *lat, double *lon, char **eptr)
{
    const char *s = str;
    char *end;
    while (isspace((unsigned char)*s))   // cast avoids undefined behaviour for negative char values
        s++;
    if (*s != '(')
        goto error;
    *lat = strtod(++s, &end);
    if (s == end || *lat > 90.0 || *lat < 0.0)
        goto error;
    int c = toupper((unsigned char)*end++);
    if (c != 'N' && c != 'S')  // I18N
        goto error;
    if (c == 'S')
        *lat = -*lat;
    if (*end != ',')
        goto error;
    s = end + 1;
    *lon = strtod(s, &end);
    if (s == end || *lon > 180.0 || *lon < 0.0)
        goto error;
    c = toupper((unsigned char)*end++);
    if (c != 'W' && c != 'E')  // I18N
        goto error;
    if (c == 'E')  // This code stores east as negative; test for 'W' here instead if east should be positive
        *lon = -*lon;
    if (*end != ')')
        goto error;
    if (eptr != 0)
        *eptr = end + 1;
    return 0;

error:
    if (eptr != 0)
        *eptr = CONST_CAST(char *, str);
    errno = EINVAL;
    return -1;
}

int main(void)
{
    const char latlon1[] = "(50.1003781N, 14.3925125E)";
    const char latlon2[] = "   (50.1003781N, 14.3925125E) is the position!";
    char *eptr;
    double d;
    errno = 0;      // Necessary in general, but probably not necessary at this point
    d = strtod(&latlon1[14], &eptr);
    if (eptr != &latlon1[14])
        printf("PASS: %10.7f (%s)\n", d, eptr);
    else
        printf("FAIL: %10.7f (%s) - %d: %s\n", d, eptr, errno, strerror(errno));

    printf("Converting <<%s>>\n", latlon2);
    double lat;
    double lon;
    int rc = strtolatlon(latlon2, &lat, &lon, &eptr);
    if (rc == 0)
        printf("Lat: %11.7f, Lon: %11.7f; trailing material: <<%s>>\n", lat, lon, eptr);
    else
        printf("Conversion failed\n");

    return 0;
}

Sample output:

PASS: 14.3925125 (E))
Converting <<   (50.1003781N, 14.3925125E) is the position!>>
Lat:  50.1003781, Lon: -14.3925125; trailing material: << is the position!>>

That is not comprehensive testing, but it is illustrative and close to production quality. You might need to worry about infinities, for example, in true production code. I don't often use goto, but this is a case where the use of goto simplified the error handling. You could write the code without it; if I had more time, maybe I would upgrade it. However, with seven places where errors are diagnosed and 4 lines required for reporting the error, the goto provides reasonable clarity without great repetition.

Note that the strtolatlon() function explicitly identifies errors via its return value; there is no need to guess whether it succeeded or not. You can enhance the error reporting if you wish to identify where the error is. But doing that depends on your error reporting infrastructure in a way this does not.

Also, the strtolatlon() function will accept some odd-ball formats such as (+0.501003781E2N, 143925125E-7E). If that's a problem, you'll need to write your own fussier variant of strtod() that only accepts fixed-point notation. On the other hand, there's a meme/guideline "Be generous in what you accept; be strict in what you produce". That implies that what's here is more or less OK (it might be good to allow optional white space before the N, S, E, W letters, the comma and the close parenthesis). The converse code, latlontostr() or fmt_latlon() (with strtolatlon() renamed to scn_latlon(), perhaps) or whatever, would be careful about what it produces, only generating upper-case letters, and always using the fixed format, etc.

#include <assert.h>   // assert()
#include <stdio.h>    // snprintf()

int fmt_latlon(char *buffer, size_t buflen, double lat, double lon, int dp)
{
    assert(dp >= 0 && dp < 15);
    assert(lat >=  -90.0 && lat <=  90.0);
    assert(lon >= -180.0 && lon <= 180.0);
    assert(buffer != 0 && buflen != 0);
    char ns = 'N';
    if (lat < 0.0)
    {
        ns = 'S';
        lat = -lat;
    }
    char ew = 'W';
    if (lon < 0.0)
    {
        ew = 'E';
        lon = -lon;
    }
    int nbytes = snprintf(buffer, buflen, "(%.*f%c, %.*f%c)", dp, lat, ns, dp, lon, ew);
    if (nbytes < 0 || (size_t)nbytes >= buflen)
        return -1;
    return 0;
}

Note that 1 unit at 7 decimal places of a degree (10^-7 ˚) corresponds to about a centimetre on the ground (oriented along a meridian; the distance represented by a degree along a parallel of latitude varies with the latitude, of course).

score 4 · Answer 2 · answered Apr 10 '15 at 14:36

Process the string first using

char *p;
while((p = strchr(string, 'E')) != NULL) *p = 'W';
while((p = strchr(string, 'e')) != NULL) *p = 'W';

// scan it using your approach

sscanf(string,"(%lf%c, %lf%c)",&a,&b,&c,&d);

// get back the original characters (converted to uppercase).

if (b == 'W') b = 'E';    
if (d == 'W') d = 'E';

strchr() is declared in the C header <string.h>.

Note: This is really a C approach, not a C++ approach. But, by using sscanf() you are really using a C approach.

Peter

Thanks, and what would be the C++ approach? This is part of a C++ project; I just couldn't think of anything else than trying to use sscanf for this – lllook Apr 10 '15 at 14:39
Using `'W'` is probably not the best idea. What if I'm going west? – Quentin Apr 10 '15 at 14:43
At the least, you need to record whether you changed `E` or `e` to `W` so that you can undo the change after using `sscanf()`. You probably want to record the original character so you can put the string back to how it was. Of course, this approach also means you can't scan string literals containing a latitude/longitude. – Jonathan Leffler Apr 10 '15 at 14:49
Yeah, okay. The point is changing the character in order to scan, and recording enough information so the values can be retrieved (e.g. multiply by -1 if changing E-W). A valid coordinate will not have distinct East and West coordinates. A whole swag of error checking would be needed both before and after the point of trying to scan for coordinates (checking the string only contains a pair of coordinates, checking return from sscanf(), etc etc). – Peter Apr 10 '15 at 20:32
A C++ approach would make use of the std::string type (and probably not name a variable as "string"). There are various operations supported by std::string to replace substrings, etc. For the scanning/parsing (instead of sscanf()), one way is to feed the string to a std::istringstream and use that to extract values. – Peter Apr 10 '15 at 20:36

score 0 · Answer 3 · answered Apr 10 '15 at 14:29

You can try reading the whole string and then replacing E with another character.
