3

For example, given this source:

void func1() {
    func3();
    if(qqq) {
         func2();
    }
    func4(
    );
}

It should be transformed to:

void func1() {
MYMACRO
    func3();
MYMACRO
    if(qqq) {
MYMACRO
         func2();
MYMACRO
    }
MYMACRO
    func4(
    );
MYMACRO
}

That is, insert "MYMACRO\n" after each line where a statement can begin or end, but only inside functions.

How can I do this easily? Should I use regular expressions? What tools should I use?

For example, can gcc output the line numbers where statements begin (or end) inside functions?

@related How to tell gcc to instrument the code with calls to my own function each _line_ of code?

@related What profiler should I use to measure _real_ time (including waiting for syscalls) spend in this function, not _CPU_ one

Vi.
  • 6
    What kind of abhorrent monster are you creating and why? – JoshD Oct 21 '10 at 17:03
  • 1
    At the end of each line in the original source file or at the end of each logical line (after trigraphs have been replaced and lines have been spliced)? What about inside of comments? What are you trying to accomplish? – James McNellis Oct 21 '10 at 17:03
  • What tools do you have available, or, to put this another way, what environment are you using? – David Thornley Oct 21 '10 at 17:19
  • @JoshD, I just want to profile it at the line level, but not with gprof. The macro will probably print clock time. – Vi. Oct 21 '10 at 18:11
  • @Vi: yaknow.... they do make profilers... Very good ones. You don't have to reinvent the wheel... badly... with a huge amount of pain. As bta stated, you'll have to more or less make a parser for c++ to do this even close to right. Oh, and if you _print_ clock time, that'll ruin your run time and your results will be meaningless. – JoshD Oct 21 '10 at 18:16
  • @JoshD, OK, trying more profilers. – Vi. Oct 21 '10 at 20:33
  • 1
    @JoshD, Printing timestamps in a relatively high-level function (only in one source file) should barely affect performance. I sometimes do such things (temporarily insert macros at many places) manually. I wonder if there are automated tools for this. – Vi. Oct 21 '10 at 20:36
  • 2
    @Vi: print to screen or to file? To screen will be pretty slow. Try this: make a loop that sums the values from 1 to 10000000 in a long long. Then do it again printing each value as you go. Compare run times... – JoshD Oct 21 '10 at 20:38
  • 1
    @JoshD, Values are printed in relatively high-level functions (that calls a lot of lower level things). Like "--gst-debug-level 2". Even if 100 messages are displayed on screen during our 6-second delay it can be useful. – Vi. Oct 21 '10 at 23:37
  • @Vi: perhaps there's a misunderstanding. Suppose you have a line that takes 1 nanosecond to complete. If you put a print line right after it, that print will take several hundred nanoseconds. If you then do this in a loop, the time you measure will not be the first line, it will be the print line because the time to execute the print line will be orders of magnitude greater than the line before it. Does that make sense? **The times you print won't reflect the running time of the present code, it will reflect the running time of the print functions.** – JoshD Oct 21 '10 at 23:43
  • 1
    @JoshD, the line can take seconds to complete (because it calls heavy functions). I see a lot of function calls in the code, I don't know what they mean, and I want to know which are heavy and which are lightweight. – Vi. Oct 22 '10 at 11:41
  • @JoshD, Of course, I'll not instrument _all_ functions in the program, only _some_ of them (for example, all functions in one source file. Or just one function). – Vi. Oct 22 '10 at 11:42
  • @Vi: I see. I got a different impression from your question. Never mind, then. I still suggest a profiler, though :) – JoshD Oct 22 '10 at 17:08
  • @JoshD, And I do use a profiler. Actually, this question is a little obsolete because I have gotten better with profilers. This thing was intended to be a "poor man's profiler". – Vi. Oct 22 '10 at 17:37

3 Answers

2

What are you trying to accomplish by doing this? Based on the description of the task, there is probably a much easier way to approach the problem. If you're sure that this is the best way to accomplish your task, read on.


You would have to implement some sort of rudimentary C language parser to do this. Since you are processing text, I would recommend using a scripting language like Perl, Python, or Ruby to modify your text instead of writing a C program to do it.

Your parser will walk through the file one line at a time and decide, for each line, whether it needs to insert your macro. To do that, the parser will need to keep track of a number of things. First, it needs to keep track of whether or not it is currently inside a comment. When you encounter a /* sequence, set an "in comment" flag, and clear it the next time you encounter a */ sequence. Whenever that flag is set, you will not add a macro invocation.

Second, you will need to keep track of whether or not you are inside a function. Assuming your code is fairly simple and straightforward, you can have a "brace counter" that starts at zero, increments whenever you encounter a {, and decrements whenever you encounter a }. If your brace counter is zero, then you are not inside a function and you shouldn't add a macro call. You will also want to add special code to detect and ignore braces that are part of a structure definition, array initializer, etc. Note that simple brace counting won't work if your code does more complicated things like:

void some_function (int arg) {
#ifdef CHECK_LIMIT_ONLY
    if (arg == 0) {
#else
    if (arg < 10) {
#endif
        // some code here
        ...
    }
}

While you could argue that this snippet is simply a case of poorly written code, it's just an example of the type of problem that you can run into. If your code has something in it that breaks simple brace counting, then this problem just got significantly more difficult. Two signs that your code will break brace counting: you reach the end of the file with a non-zero brace count, or the brace count goes negative at any point.
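The sanity check described above can be sketched in a few lines of Python. This is only an illustration: the function name `brace_counting_is_safe` is made up here, and the check deliberately stays naive, so braces inside comments, string literals, or character literals are counted too.

```python
def brace_counting_is_safe(source: str) -> bool:
    """Return True if naive brace counting never goes negative and ends at zero.

    Assumption: every '{' and '}' is counted, including any that appear
    inside comments or string literals.
    """
    depth = 0
    for ch in source:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:
                # A closing brace appeared before its opening brace.
                return False
    # A non-zero count at end of file means the braces did not pair up.
    return depth == 0
```

Running it over each file before instrumenting tells you which files need manual attention. The #ifdef snippet above fails the check because both `if` lines open a brace but only one of them is closed.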

Once you can determine when you are in a function and not in a comment, you need to determine whether the line needs a macro inserted after it. You can start with a few simple rules, test the script, and see if there are any cases that it missed. For starters, any line ending in a semicolon is the end of a statement and you will want to insert a macro after it. Similar to counting braces, when you are inside a function you will want to count parentheses so that you can determine if you are inside a function call, loop conditional, or other compound statement. If you are inside one of these, you will not add the macro. The other code locations to track are the start and end lines of a { ... } block. If a line ends in { or }, you will add a macro after it.
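Put together, these rules might look like the following Python sketch. It is a starting point under stated assumptions, not a finished tool: the names `strip_comments` and `instrument` are invented for this example, string and character literals are not handled, preprocessor conditionals are ignored, and braces belonging to struct definitions or initializers are treated like function braces.

```python
def strip_comments(line, in_block_comment):
    """Return (code, in_block_comment) with /* */ and // comments removed.

    Assumption: comment markers inside string literals are not recognised.
    """
    code = []
    i = 0
    while i < len(line):
        if in_block_comment:
            end = line.find("*/", i)
            if end == -1:
                return "".join(code), True  # comment continues on next line
            i = end + 2
            in_block_comment = False
        elif line.startswith("/*", i):
            in_block_comment = True
            i += 2
        elif line.startswith("//", i):
            break  # rest of the line is a comment
        else:
            code.append(line[i])
            i += 1
    return "".join(code), in_block_comment


def instrument(source, macro="MYMACRO"):
    """Insert `macro` after each line that ends a statement or block edge."""
    out = []
    brace_depth = 0   # > 0 is taken to mean "inside a function"
    paren_depth = 0   # > 0 means an unfinished (...) group
    in_block_comment = False
    for line in source.splitlines():
        out.append(line)
        code, in_block_comment = strip_comments(line, in_block_comment)
        for ch in code:
            if ch == "{":
                brace_depth += 1
            elif ch == "}":
                brace_depth -= 1
            elif ch == "(":
                paren_depth += 1
            elif ch == ")":
                paren_depth -= 1
        ends_unit = code.rstrip().endswith((";", "{", "}"))
        if (brace_depth > 0 and paren_depth == 0
                and not in_block_comment and ends_unit):
            out.append(macro)
    return "\n".join(out) + "\n"
```

Calling `instrument(open("file.c").read())` returns the instrumented text, where `file.c` stands for whatever file you want to process. Because the depth counters are updated before the check, the closing brace of a function drops the depth to zero and gets no macro after it, which matches the output requested in the question.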

For a complicated task like this, you will definitely want to script something up, try it out on a relatively simple piece of code, and see what it gets wrong. Make adjustments to cover the cases you missed the first time and re-test. When it can parse the simple code correctly, give it something more complicated and see how well it does.

**Update:** To address the concerns that some people have expressed regarding the additional latency of adding print commands, remember that you don't have to print a timestamp at every macro call. Instead, have the macro call grab a timestamp and stick it onto a list. Once your program is done, print all the timestamps off of the list. That way, you save all the print-related delay until after your test is over.

bta
  • 1
    For full generality, of course, it needs to include trigraphs. They can have significance here, such as a string like `"ab??/"cd}"` or the simple use of `??<` and `??>`, or `??=` to start preprocessor commands. Most programs don't use trigraphs, but I don't know what the OP is working with. – David Thornley Oct 21 '10 at 17:56
  • It turns out, OP is trying to profile the code line by line. Excellent answer btw, +1 – JoshD Oct 21 '10 at 18:32
  • 1
    I want to abstract out of syntax details. For example, compiler knows what lines are statements. Can it just tell it? – Vi. Oct 21 '10 at 20:34
  • @Vi: There may be multiple statements on one line and one statement may span multiple lines. – James McNellis Oct 21 '10 at 21:16
  • @James McNellis, Multiple statements in one line should be considered as one unit. It should handle multi-line statement and do not break it. – Vi. Oct 22 '10 at 11:43
  • @Vi- If you want to treat multiple statements on a line as one unit, then my algorithm should handle it that way. If you try to get the compiler to do the instrumentation for you, it will see them as separate statements. To combat the problem David brought up, I would recommend doing a global search-and-replace to replace all trigraph sequences with their associated characters before parsing the file. – bta Oct 23 '10 at 00:58
0

Rewrite your sources so the following works :-)

Instead of gcc ... file1.c file2.c ... do

gcc ... `sed -e's/;/;\nMYMACRO/' file1.c` file1extra.c \
        `sed -e's/;/;\nMYMACRO/' file2.c` file2extra.c \
    ...
pmg
  • 1
    How would that work with the code `struct boo { int trick; float treat; };` or `for (i = 9; i < 99 ; i++) spank(i);` – nategoose Oct 21 '10 at 18:08
  • It wouldn't work with that code ... that's the reason for rewriting the sources in the first place: remove `struct` definitions, `for`, initializations, ... and everything else that stops the "`sed` trick" from working. – pmg Oct 21 '10 at 18:12
  • I want it to work in large existing source code that I don't yet understand fully, but need to fix/hack. – Vi. Oct 22 '10 at 11:44
0

Here's some quick and dirty C# code. Basically just primitive file IO stuff. It's not great, but I did whip it up in around 3 minutes. This code assumes that function blocks are demarcated with a comment line of "//FunctionStart" at the beginning and "//FunctionEnd" at the end. There are more elegant ways of doing this; this is the fast/dirty/hacky approach.

Using a managed app to do this task is probably overkill, but you can do a lot of custom stuff by simply adding on to this function.

        // Requires "using System.IO;" at the top of the file.
        private void InsertMacro(string filePath)
        {
            string tempPath = filePath + ".tmp";
            bool validBlock = false;

            // The using blocks close both streams before the files are replaced below.
            using (StreamReader sr = new StreamReader(filePath))
            using (StreamWriter sw = new StreamWriter(tempPath))
            {
                string line;

                // Go through the source file line by line:
                while ((line = sr.ReadLine()) != null)
                {
                    if (line == "//FunctionStart")
                        validBlock = true;
                    else if (line == "//FunctionEnd")
                        validBlock = false;

                    sw.WriteLine(line);

                    // Inside a marked block, follow every line with the macro:
                    if (validBlock)
                        sw.WriteLine("MYMACRO");
                }
            }

            // Replace the original source with the instrumented source:
            File.Delete(filePath);
            File.Move(tempPath, filePath);
        }
kmarks2