I want to use boost::regex to convert a batch of dates from the format 19991231235959 to the format 1999-12-31_23:59:59, like this:

YYYYMMDDhhmmss --> YYYY-MM-DD_hh:mm:ss
19991231235959 --> 1999-12-31_23:59:59

I currently use this:

std::string input = "19991231235959";
boost::regex regex("^([0-9]{4})([0-9]{2})([0-9]{2})([0-9]{2})([0-9]{2})([0-9]{2})");
std::string format = "\\1-\\2-\\3_\\4:\\5:\\6";
std::string output = boost::regex_replace(input, regex, format);

This works, but is there a way to get rid of the repeated ([0-9]{2}) groups in the regex construction while keeping the match groups?

Svaberg

2 Answers

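It depends on your input data. If every input line starts with a 14-digit timestamp, you could use the pattern below. Note that . matches any character, not only digits, so this relies on the input being well formed: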

boost::regex regex("^(....)(..)(..)(..)(..)(..)");
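If the digit check from the question should be kept, the repetition can also be avoided by assembling the pattern from a list of field widths. The following is only a sketch under a few assumptions: it uses std::regex instead of boost::regex so that it builds without Boost, which means the format string uses $N references rather than \\N, and the helper names build_timestamp_pattern and reformat_timestamp are made up for illustration.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: builds "^([0-9]{4})([0-9]{2})..." from a list of
// field widths, so the group syntax is written only once.
std::string build_timestamp_pattern(const std::vector<int>& widths) {
    assert(!widths.empty());
    std::string pattern = "^";
    for (int w : widths) {
        pattern += "([0-9]{" + std::to_string(w) + "})";
    }
    return pattern;
}

// Hypothetical helper: rewrites YYYYMMDDhhmmss as YYYY-MM-DD_hh:mm:ss.
// Input that does not match the pattern is returned unchanged.
std::string reformat_timestamp(const std::string& input) {
    static const std::regex re(build_timestamp_pattern({4, 2, 2, 2, 2, 2}));
    // std::regex_replace uses ECMAScript-style "$N" group references by default.
    return std::regex_replace(input, re, "$1-$2-$3_$4:$5:$6");
}
```

Calling reformat_timestamp("19991231235959") should return 1999-12-31_23:59:59. The generated pattern string is the same one used in the question, so it can presumably be passed to the boost::regex constructor as well.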
Alex H

I am doing something similar. One thing I like to do, though, is name my capture groups.

boost::regex regex("\\D*(?<YYYY>\\d{4})\\D*(?<MM>\\d{2})\\D*(?<DD>\\d{2})\\D*(?<hh>\\d{2})\\D*(?<mm>\\d{2})\\D*(?<ss>\\d{2})");

That gives each group the same name as the matching field in your format string, so you can do things like:

// 'what' is the boost::smatch filled by boost::regex_search(input, what, regex)
int year = boost::lexical_cast<int>(what["YYYY"]);
int month = boost::lexical_cast<int>(what["MM"]);

Using the built-in character classes also helps. The above regex will match every one of these inputs:

20160922015227
2016 09 22 01 52 27
2016_09_22-01:52:27
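
To check that claim, here is a small sketch that runs a pattern of the same shape over inputs like these and writes them back in a single format. It is an illustration rather than the code I actually use: it relies on std::regex so it compiles without Boost, and since std::regex has no named groups the fields are read by number. The helper name normalize_timestamp is made up.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: normalizes a loosely separated timestamp to
// YYYY-MM-DD_hh:mm:ss. Returns an empty string when the input does not
// contain the six date/time fields.
std::string normalize_timestamp(const std::string& input) {
    // Same shape as the pattern above, with numbered instead of named groups.
    static const std::regex re(
        R"(\D*(\d{4})\D*(\d{2})\D*(\d{2})\D*(\d{2})\D*(\d{2})\D*(\d{2}))");
    std::smatch what;
    if (!std::regex_search(input, what, re)) {
        return "";
    }
    assert(what.size() == 7);  // whole match plus six capture groups
    return what[1].str() + "-" + what[2].str() + "-" + what[3].str() + "_" +
           what[4].str() + ":" + what[5].str() + ":" + what[6].str();
}
```

With this sketch, all three inputs listed above should come out as 2016-09-22_01:52:27.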