
Note: For want of a better word I call the fluff at the start of source files --

/* @(#) $Id: file.c,v 1.9 2011/01/05 11:55:00 user Exp $
   **************************************************************************
   * COPYRIGHT, 2005-2011                                                   *
   ...
 */

-- Keyword Substitution comments, although I do not know whether this is just a Subversion term.

Anyway, now to the question: We have a 3rd party supplier that we get source code from. These C sources all have these keyword substitution comments, and every time we get a new version from the supplier, all (1000+) files are changed, because the supplier updates these comments for every release they send us. This happens even if no source code changes whatsoever are made in a file, so the only change is the comment. Now, before we compile and use these sources, we would like to do a cursory code review to see which areas have been changed. (Never trust the release history.) However, this is rather difficult, as a simple folder diff will obviously list all files.

What I'm looking for is whether any simple tools already exist to strip these special multi-line comments from the source files. Does anyone have a link to a grep or sed script that will strip that stuff from the files?

Martin Ba
  • Strip them from the diff output rather than from the source files. – Jim Balter Feb 15 '11 at 09:36
  • Like Jim Balter says: just find a way to ignore them during diffing. Most diffing tools should have some way to ignore certain patterns. – Otherside Feb 15 '11 at 10:20
  • @Jim, Otherside : Feel free to supply an answer describing a decent diff tool for windows that is capable of doing this. – Martin Ba Feb 15 '11 at 12:18
  • @Martin Any of a dozen scripting languages (including sed, but not including grep, which only searches) can be used to strip text specified by regular expressions from a file. I feel free to offer general advice, and I feel free not to write your scripts for you. – Jim Balter Feb 15 '11 at 12:27
  • @Jim: Seems we're talking at cross-purposes. To do a code review, we generally use an interactive tool. I'm not sure whether any such tools allow filtering the file content, which is what I understood you and Otherside to be proposing. What advantage do you see in stripping the comments from the diff output instead of the source files? (Which diff output anyway -- which diff tool?) – Martin Ba Feb 15 '11 at 13:04
  • @Martin You said nothing in your "question" about using interactive tools. What you said was "a cursory code review to see the areas that have been changed" -- that could be done via diff --recur | striprcsids (diff and pipelines are available via Cygwin on Windows but I'm sure there are native Windows equivalents) and then a manual inspection, which is likely to be a lot faster than making a clone of the source tree with the rcsids stripped and then examining each file individually with an interactive diff tool. However, if you want to do that, consider Jens Gustedt's answer. – Jim Balter Feb 15 '11 at 13:13
  • @Jim: +1 - Good comment about how you would use diff. (This was not clear to me from your previous comments.) – Martin Ba Feb 15 '11 at 15:06

2 Answers


Something like:

perl -ne '$c = 1 if m+/\*.*\$Id: +; print unless $c; $c = 0 if $c && m+\*/+;'

Note that this will work only if

  • such comments are delimited with /*...*/
  • on the first line there is $Id:
  • there is nothing after the */
  • there is no */ before the /*

Also note that it strips every line from the start of the comment through the end of the comment, inclusive.

I have not tested it!
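
For anyone without Perl at hand, the same line-by-line logic can be sketched in Python. This is only an illustration under the assumptions listed above (comment delimited with /*...*/, $Id: on its first line, nothing useful on the comment's lines). The function names strip_keyword_comments and differs_beyond_keywords are made up for this example.

```python
import re

# Start of a keyword comment: "/*" followed by "$Id:" on the same line.
START = re.compile(r"/\*.*\$Id:")
# End of any block comment.
END = re.compile(r"\*/")


def strip_keyword_comments(source: str) -> str:
    """Drop every line from a '/* ... $Id:' line through the line with '*/'."""
    kept = []
    inside = False
    for line in source.splitlines(keepends=True):
        if START.search(line):
            inside = True
        if not inside:
            kept.append(line)
        # Checked after the print decision so the closing line is dropped too.
        if inside and END.search(line):
            inside = False
    return "".join(kept)


def differs_beyond_keywords(old: str, new: str) -> bool:
    """True if two versions of a file differ outside the keyword comment."""
    return strip_keyword_comments(old) != strip_keyword_comments(new)
```

Calling differs_beyond_keywords on the old and new text of each file would narrow a folder comparison down to the files that actually need a look in the interactive diff tool.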

Benoit

First, I would try to convince them to reconsider their version control system (it looks as if they still use RCS?) or, if that is not possible, to have them submit their changes to an svn or git server. But perhaps you already did?

If nothing along those lines is possible, I would try to set up a git repository to hold the versions that they supply to you. Git lets you apply filters when you are importing or exporting files, and it also has support for ignoring such tags when computing deltas between versions.
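
A rough sketch of what such an import filter could do, written in Python: collapse expanded RCS keywords such as $Id: ... $ back to their bare form $Id$, so that two releases differing only in the expanded values produce identical content. The function name collapse_keywords is hypothetical, and the keyword list is an assumption about what the supplier's files contain. $Log$ is left out on purpose because it expands over several lines, and the copyright year range in the banner is not touched by this at all.

```python
import re

# Single-line RCS keywords assumed to appear in the supplied sources.
RCS_KEYWORDS = (
    "Author", "Date", "Header", "Id", "Locker",
    "Name", "RCSfile", "Revision", "Source", "State",
)

# An expanded keyword looks like "$Keyword: some value $" on one line.
EXPANDED = re.compile(r"\$(" + "|".join(RCS_KEYWORDS) + r"):[^$\n]*\$")


def collapse_keywords(text: str) -> str:
    """Replace each expanded keyword with its bare '$Keyword$' form."""
    return EXPANDED.sub(r"$\1$", text)
```

To use this with Git, a small wrapper script would read the file content from standard input, pass it through collapse_keywords, and write the result to standard output. Such a script can be registered as a clean filter through the filter.<name>.clean configuration setting and attached to *.c files with a filter=<name> entry in .gitattributes; the filter name itself is up to you.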

Jens Gustedt