
We once used Borland StarTeam (a revision/source control system, like Mercurial) for our code management. Whenever we committed code, the tool itself put a description of the commit at the top of the file. So now we have many classes with such a commit log at the top of each file. For example:

/*This is some developer comment at the top of the file*/

/*
 * $Log:
 *  1   Client Name 1.0   07/11/2012 16:28:54  Umair Khalid did something
 *  2   Client Name 1.0   07/11/2012 16:28:54  Umair Khalid again did 
 *                                             something
 * $
 */

public class ABC
{
  /*This is just a variable*/
  int a = 0;
  public int method1()
  {
  }
}

Now I am planning to remove all of these StarTeam log blocks from the top of each file. But I don't want to remove any other comment from any file, or any copyright comment at the top. I only want to remove the chunk that starts with $Log and ends with $. I have looked at other questions related to this problem, but they do not cover a multiline comment like this one. Would a regular expression be a good option for this?

Is there any utility I can use rather than writing my own code to remove this?

If a regular expression is the only quick solution, I am stuck on how to write it.

Any help would be appreciated.

Umair Khalid
  • Use a Java parser instead of a Regex – Thomas Weller Nov 09 '18 at 23:26
  • Multiline comments are a pain to try to parse. Consider `/* stuff * more stuff /* surprise, there's a comment start inside a comment */`. If the pattern is always the same (as shown above), it's only mildly difficult, but in the general case, you really need a language parser. – Flydog57 Nov 10 '18 at 00:20
  • This Maven plugin https://www.mojohaus.org/license-maven-plugin/remove-file-header-mojo.html seems to be doing something very close to what you need. – yegodm Nov 10 '18 at 02:42
  • @umairkhalid: Was my answer useful? – Flydog57 Nov 11 '18 at 00:51

1 Answer

If the format is exactly as you show, you could build a fragile little state machine that looks like this.

Start with an enum to track the state:

enum ParseState
{
    Normal,
    MayBeInMultiLineComment,    //occurs after initial /*
    InMultilineComment,
}

and then add this code (it needs using directives for System, System.Text and System.Diagnostics):

public static void CommentStripper()
{
    var text = @"/*This is some developer comment at the top of the file*/
/*
 * $Log:
 *  1   Client Name 1.0   07/11/2012 16:28:54  Umair Khalid did something
 *  2   Client Name 1.0   07/11/2012 16:28:54  Umair Khalid again did 
 *                                             something
 * $
 */

/*
    This is not a log entry
*/

public class ABC
{
  /*This is just a variable*/
  int a = 0;
  public int method1()
  {
  }
}";

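    // This next line could be File.ReadAllLines to get the text from a file,
    // or you could read from a stream, line by line.
    // Split on both "\r\n" and "\n" so the sample works whichever line ending
    // the verbatim string above was saved with.

    var lines = text.Split(new[] {"\r\n", "\n"}, StringSplitOptions.None);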

    var buffer = new StringBuilder();
    ParseState parseState = ParseState.Normal;
    string lastLine = string.Empty;

    foreach (var line in lines)
    {
        if (parseState == ParseState.Normal)
        {
            if (line == "/*")
            {
                lastLine = line;
                parseState = ParseState.MayBeInMultiLineComment;
            }
            else
            {
                buffer.AppendLine(line);
            }
        }
        else if (parseState == ParseState.MayBeInMultiLineComment)
        {
            if (line == " * $Log:")
            {
                parseState = ParseState.InMultilineComment;
            }
            else
            {
                parseState = ParseState.Normal;
                buffer.AppendLine(lastLine);
                buffer.AppendLine(line);
            }
            lastLine = string.Empty;
        }
        else if (parseState == ParseState.InMultilineComment)
        {
            if (line == " */")
            {
                parseState = ParseState.Normal;
            }
        }

    }
    //you could do what you want with the string, I'm just going to write it out to the debugger console.
    Debug.Write(buffer.ToString());
}

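Note that lastLine is needed because you have to read ahead one line to tell whether a comment is a log entry or not (which is what the MayBeInMultiLineComment state tracks). The held-back /* line is only written out once the next line turns out not to be the $Log: line. The comparisons are exact string matches, so any difference in whitespace will make a block slip through.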

The output from that looks like:

/*This is some developer comment at the top of the file*/


/*
    This is not a log entry
*/

public class ABC
{
  /*This is just a variable*/
  int a = 0;
  public int method1()
  {
  }
}
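
Since the question asks about a regular expression and the files are Java, here is a rough regex-based sketch in Java for comparison. It assumes every log block has exactly the shape shown in the question: `/*` alone on a line, then a `* $Log:` line, then a line holding only `* $`, then `*/`. The class name StarTeamLogStripper, the constant LOG_BLOCK and the method strip are made-up names for illustration, not part of any tool. Treat it as a starting point and run it on a copy of the code first.

```java
import java.util.regex.Pattern;

// Hypothetical helper: removes StarTeam "$Log: ... $" block comments only.
final class StarTeamLogStripper {

    // Assumes the block layout from the question; other comments do not match.
    private static final Pattern LOG_BLOCK = Pattern.compile(
        "/\\*[ \\t]*\\R"                      // opening "/*" alone on its line
        + "[ \\t]*\\*[ \\t]*\\$Log:"          // next line: " * $Log:"
        + "(?:(?!\\*/).)*?"                   // log entries, never crossing a "*/"
        + "\\R[ \\t]*\\*[ \\t]*\\$[ \\t]*"    // terminator line: " * $"
        + "\\R[ \\t]*\\*/[ \\t]*\\R?",        // closing "*/" and its line break
        Pattern.DOTALL);

    // Returns the source text with every matching log block removed.
    static String strip(String source) {
        return LOG_BLOCK.matcher(source).replaceAll("");
    }

    public static void main(String[] args) {
        String source = String.join("\n",
            "/*This is some developer comment at the top of the file*/",
            "",
            "/*",
            " * $Log:",
            " *  1   Client Name 1.0   07/11/2012 16:28:54  Umair Khalid did something",
            " *  2   Client Name 1.0   07/11/2012 16:28:54  Umair Khalid again did ",
            " *                                             something",
            " * $",
            " */",
            "",
            "/*",
            "    This is not a log entry",
            "*/",
            "",
            "public class ABC",
            "{",
            "  /*This is just a variable*/",
            "  int a = 0;",
            "}");
        System.out.println(strip(source));
    }
}
```

The `(?!\*/)` lookahead is what keeps a match from running past the end of one comment into the next, so a `$Log:` comment that lacks the closing `* $` line is left alone rather than swallowing later code. To apply it to real files you would read each file into a string, call strip, and write the result back; that part is left out here.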
Flydog57