
I am using ExamDiff to compare two *.csv files that have no spaces after the commas. Numbers in the files have between 2 and 8 decimal places, but I only want to compare the first 3 digits after the decimal point; anything beyond the thousandths place is insignificant.
ExamDiff lets you use a regex to ignore certain parts of lines, so I'm using (\d{1,4}\.) to find the numbers. This also ignores the integer part and the decimal point, which is OK in these cases.
Here's a sample line from the CSV:

VQ000009,B2,B3,VV,12.0000,0.23,1.0000,1.0000000000,1357.421

And here's the comparable line in the new CSV:

VQ000009,B2,B3,VV,12.0000,0.27,1.0009,1.0000000000,1357.431

So, in this example, 0.23 and 0.27 would flag, 1.0000 and 1.0009 would not flag, and 1357.421 and 1357.431 would flag.

Sam
  • Can you cite more examples? Not getting clearly what you want to achieve. What 2-3 digits you want to ignore? – Rajeev Ranjan Sep 18 '17 at 17:04
  • @RajeevRanjan edited with another example. – Sam Sep 18 '17 at 17:10
  • `\d{1,4}\.\d{3}` - this pattern should help. It would capture only 3 places after decimal wherein no mismatch is allowed. – Rajeev Ranjan Sep 18 '17 at 17:16
  • Compare the 1st 3 digits after decimal from file1 with the 1st three digits after decimal from file 2. If they are equal, ignore; otherwise flag. Use the regex `^(\d{1,4}\.\d{3})\d*$` and test it against the corresponding values fetched from both files. Fetch the data from Group1 of the matches and compare. If they are equal, ignore them else flag them. Link: https://regex101.com/r/D3LXon/2 – Gurmanjot Singh Sep 18 '17 at 17:16
  • @RajeevRanjan your example is almost there...problem is that in ExamDiff the regex is the pattern to ignore. Maybe it can't be done in this tool. – Sam Sep 18 '17 at 17:50
  • I have no idea what you're trying to do in terms of comparison and regex, but you can concatenate your strings using a delimiter (such as `||`) and then check for matches on either side of the delimiter. For example you can use this regex `^(\d{1,4}\.\d{3})\d*\|{2}\1` and compare your values `12.3456789||12.34678`, `1.0000||1.0009`, `1.0000||1.0090`. In these examples, only the second one will match since the capture group is found in the second value (and first 3 decimals match). If you want the opposite results use this regex: `^(\d{1,4}\.\d{3})\d*\|{2}(?!\1)` – ctwheels Sep 18 '17 at 18:07
  • Thanks for help so far guys; I have edited again to explain but this may just be a limitation on ExamDiff. – Sam Sep 18 '17 at 19:14

1 Answer

The website isn't clear about how much of the Boost library is supported, but if full PCRE syntax is supported, you can use this as the pattern to ignore:

(?<=\.\d{3})\d+

This matches any run of digits that is immediately preceded by a . and 3 digits. Note that if you have something like VQ.123456, the 456 will match and be ignored, so a stray . will cause issues.
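
To see what the pattern does outside ExamDiff, here is a minimal Python sketch that applies it to the two sample lines from the question. It assumes that "ignoring" a match is equivalent to deleting it before comparing, which may not be exactly how ExamDiff treats ignored text. The helper names `strip_extra_digits` and `differing_fields` are made up for this illustration.

```python
import re

# The pattern from the answer: digits immediately preceded by "." and 3 digits.
EXTRA_DIGITS = re.compile(r"(?<=\.\d{3})\d+")


def strip_extra_digits(line):
    """Delete every digit past the third decimal place (hypothetical helper)."""
    return EXTRA_DIGITS.sub("", line)


def differing_fields(old_line, new_line):
    """Return 0-based indexes of comma-separated fields that still differ
    after the extra digits are stripped (hypothetical helper)."""
    old_fields = strip_extra_digits(old_line).split(",")
    new_fields = strip_extra_digits(new_line).split(",")
    return [i for i, (a, b) in enumerate(zip(old_fields, new_fields)) if a != b]


old = "VQ000009,B2,B3,VV,12.0000,0.23,1.0000,1.0000000000,1357.421"
new = "VQ000009,B2,B3,VV,12.0000,0.27,1.0009,1.0000000000,1357.431"

print(strip_extra_digits(new))     # VQ000009,B2,B3,VV,12.000,0.27,1.000,1.000,1357.431
print(differing_fields(old, new))  # [5, 8]

# The caveat from the answer: a stray "." in a non-numeric field is also affected.
print(strip_extra_digits("VQ.123456"))  # VQ.123
```

Fields 5 (0.23 vs 0.27) and 8 (1357.421 vs 1357.431) still differ, while 1.0000 vs 1.0009 compares equal, which matches the flagging described in the question.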

NetMage