Is there a way to get strptime() to handle fixed format time strings?

I need to parse a time string that is always in the fixed-width format "yymmdd HHMMSS", with the complication that leading zeros are sometimes present and sometimes replaced by spaces.

Reading the man(3p) page for strptime, I note that for each of the conversion specifiers %y, %m, %d, %H, %M, %S it says that "leading zeros shall be permitted but shall not be required". Hence I tried the format specifier %y%m%d %H%M%S, naïvely hoping that strptime would recognize that spaces in the two substrings %y%m%d and %H%M%S are equivalent to (missing) leading zeroes.

This seems to work for the specifier %m, but not for %M (well, unless the second part is less than 10), as demonstrated by the following piece of code:

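#define _XOPEN_SOURCE 700   /* exposes strptime() in <time.h> on glibc */
#include <stdio.h>
#include <time.h>

int main(void) {
   struct tm buff = {0};            /* start from a fully initialized struct */
   const char ts[]="17 310 22 312";
   char st[14];                     /* 13 characters plus the terminating NUL */

   strptime(ts,"%y%m%d %H%M%S", &buff);
   strftime(st,sizeof(st),"%y%m%d %H%M%S",&buff);

   printf("%s\n",ts);
   printf("%s\n",st);
   return 0;
}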

When compiled and run on my machine, this outputs:

17 310 22 312
170310 223102

Any insight on how to overcome this would be appreciated. Or do I need to resort to manually chopping the string two characters at a time, using atoi to convert them to integers with which to populate my struct tm instance?

Jonathan Leffler
cpaitor
  • Would it be simpler to replace blanks where zeros are needed with zeros? Preprocess but make the input conform to the canonical requirements. – Jonathan Leffler Jun 12 '17 at 15:03
  • And even better, get the code that generates the wonky format fixed. – Jonathan Leffler Jun 12 '17 at 15:09
  • @JonathanLeffler, first of all thanks for cleaning up. I'm working on updating the code generating the input files (there's more than one); however, these are old Fortran IV codes, and the last time we managed to compile them they had to be mangled through f2c first. As the codes are packed with gotos at least every tenth line, as well as such lovely things as jumps into do-loops and Hollerith constants, and further come in 10-20 versions of each source file (about 10-15 source files per executable), cleaning them up may take a while... (programming archaeology) – cpaitor Jun 12 '17 at 19:35

1 Answer

It would be best to get the code that generates the data with the wonky format fixed.

Assuming that can't be done this morning, then maybe you should canonicalize (a copy of) the wonky data, like this:

#define _XOPEN_SOURCE 700   /* exposes strptime() and strdup() on glibc */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Replace blanks with zeros in str[begin..end], both ends inclusive. */
static inline void canonicalize(char *str, int begin, int end)
{
    for (int i = begin; i <= end; i++)
    {
        if (str[i] == ' ')
            str[i] = '0';
    }
}

int main(void)
{
    struct tm buff = {0};
    const char ts[] = "17 310 22 312";
    char st[32];

    char *raw = strdup(ts);     /* writable copy; ts itself stays const */
    if (raw == NULL)
        return 1;

    printf("[%s] => ", raw);
    canonicalize(raw, 0, 5);    /* date part: yymmdd */
    canonicalize(raw, 7, 12);   /* time part: HHMMSS */
    printf("[%s] => ", raw);
    strptime(raw, "%y%m%d %H%M%S", &buff);
    strftime(st, sizeof(st), "%y%m%d %H%M%S", &buff);
    printf("[%s] => ", st);
    strftime(st, sizeof(st), "%Y-%m-%d %H:%M:%S", &buff);
    printf("[%s]\n", st);
    free(raw);
    return 0;
}

The canonicalize() function replaces blanks with zeros over a given range of a string. Clearly, if you specify start and end points that are out of bounds, it will go trampling out of bounds. I preserved the const on ts and made a copy with strdup(); if you can treat the string as variable data, you don't need to make (or free) the copy.

The output from that code is:

[17 310 22 312] => [170310 220312] => [170310 220312] => [2017-03-10 22:03:12]
Jonathan Leffler
  • I guess I could accept this solution, as it solves my problem. However, it does not answer the question asked in the first line of my post. Do so (yes or no will do) and I'll accept your solution. – cpaitor Jun 12 '17 at 19:38
  • Your question contains the answer to your question — there isn't a way to do it because there isn't anything else you can usefully use to control `strptime()` and the version on your system doesn't do what you want (and I don't think any other version is likely to behave significantly differently) so it won't work on unaltered data. This answer does agree with your suggestion that you need to do things differently, but suggests what is likely to be a simpler method of doing things differently. – Jonathan Leffler Jun 12 '17 at 19:40
  • Fine. Still amazed, though, that the logic seems to be different between `%m` and `%M`. – cpaitor Jun 12 '17 at 19:43
  • @cpaitor With `" 3x"` where `x` is some digit character, `x` cannot be part of a month index via `"%m"`. With `"%M"`, `x` is part of the minute. Try `" 12"` instead of `" 31"`. Not different logic so much as different acceptable ranges. – chux - Reinstate Monica Jun 12 '17 at 19:59