6

I'm extending and improving a Java application that runs long searches expressed in a small DSL (specifically, it is used for model finding, which is NP-complete in general).

During this search I want to show a small progress bar on the console. Because of the generic structure of the DSL I cannot calculate the overall search space size. Therefore I can only output the progress of the first "backtracking" statement.

Now the question: I can use a flag for each backtracking statement to indicate that this statement should report the progress. When evaluating the statement I can check the flag with an if-statement:

public class EvalStatement {
  boolean reportProgress;

  public EvalStatement(boolean report) {
     reportProgress = report;
  }

  public void evaluate() {
    int progress = 0;

    while(someCondition) {
      // do something

      // maybe call other statement (tree structure)

      if (reportProgress) {
         // This is only executed by the root node, i. e.,
         // the condition is only true for about 30 times whereas
         // it is false millions or billions of times
         ++progress;
         reportProgress(progress);
      }
    }
  }
}

I can also use two different classes:

  • A class which does nothing
  • A subclass that is doing the output

This would look like this:

public class EvalStatement {
  private ProgressWriter out;
  public EvalStatement(boolean report) {
      if (report)
        out = new ProgressWriterOut();
      else
        out = ProgressWriter.instance;     
  }
  public void evaluate() {
    while(someCondition) {
      // do something
      // maybe call other statement (tree structure)
      out.reportProgress(progress);
    }
  }
}

public class ProgressWriter {
  public static ProgressWriter instance = new ProgressWriter();
  public void reportProgress(int progress) {}
}

public class ProgressWriterOut extends ProgressWriter {
  int progress = 0;
  @Override
  public void reportProgress(int progress) {
    // This is only executed by the root node, i.e. about 30 times;
    // the millions or billions of other calls hit the empty
    // ProgressWriter.reportProgress instead.
    // The parameter shadows the field, so increment the field explicitly.
    ++this.progress;
    // Put progress anywhere, e.g.
    System.out.print('#');
  }
}

And now the actual question(s):

  • Is the Java lookup of the method to call faster than the if statement?
  • In addition, would an interface and two independent classes be faster?
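For reference, the interface variant asked about in the second bullet might look roughly like the sketch below. All names here (ProgressReporter, NoOpReporter, ConsoleReporter, Node) are hypothetical and not taken from the application, and the loop bound stands in for someCondition. It only shows the structure and says nothing about which variant is faster.

```java
// Hypothetical names throughout; this is a structural sketch only.
interface ProgressReporter {
    void reportProgress(int progress);
}

// Used by every node except the root; the body is empty on purpose.
final class NoOpReporter implements ProgressReporter {
    @Override
    public void reportProgress(int progress) { }
}

// Used by the root node only; prints one '#' per reported step.
final class ConsoleReporter implements ProgressReporter {
    int calls = 0;

    @Override
    public void reportProgress(int progress) {
        ++calls;
        System.out.print('#');
    }
}

final class Node {
    private final ProgressReporter out;

    Node(ProgressReporter out) {
        this.out = out;
    }

    // Stand-in for the real search loop: 'iterations' replaces someCondition.
    void evaluate(int iterations) {
        for (int i = 1; i <= iterations; i++) {
            out.reportProgress(i);
        }
    }
}

class InterfaceVariantDemo {
    public static void main(String[] args) {
        ConsoleReporter console = new ConsoleReporter();
        new Node(console).evaluate(30);               // root node: reports every step
        new Node(new NoOpReporter()).evaluate(1000);  // inner node: reports nothing
        System.out.println();
        System.out.println("reported steps: " + console.calls);
    }
}
```

Passing the reporter in through the constructor also means no shared static instance is needed, so two searches could each get their own reporter.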

I know Log4J recommends putting an if-statement around log calls, but I think the main reason is the construction of the parameters, especially strings. I have only primitive types.

EDIT: I clarified the code a little bit (what is called often... the usage of the singleton is irrelevant here).

Further, I made two long-term runs of the search, in which the if-statement or the method call, respectively, was hit 1,840,306,311 times, on a machine doing nothing else:

  • The if version took 10 h 6 min 13 s (50,595 "hits" per second)
  • The overriding version took 10 h 9 min 15 s (50,343 "hits" per second)

I would say this does not give a real answer, because the 0.5% difference is within the measurement tolerance.
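The rates and the relative difference can be recomputed from the hit count and the two run times quoted above. The following is a small arithmetic sketch; the class and method names are made up for illustration.

```java
class ThroughputCheck {
    // Converts an "h min s" duration to seconds.
    static long toSeconds(long h, long min, long s) {
        return h * 3600 + min * 60 + s;
    }

    static double hitsPerSecond(long hits, long seconds) {
        return (double) hits / seconds;
    }

    public static void main(String[] args) {
        long hits = 1_840_306_311L;
        long shorterRun = toSeconds(10, 6, 13);  // 36373 s
        long longerRun = toSeconds(10, 9, 15);   // 36555 s

        System.out.println("shorter run: " + Math.round(hitsPerSecond(hits, shorterRun)) + " hits/s");
        System.out.println("longer run:  " + Math.round(hitsPerSecond(hits, longerRun)) + " hits/s");

        // Relative difference between the two run times, in percent.
        double diffPercent = 100.0 * (longerRun - shorterRun) / shorterRun;
        System.out.println("difference:  " + diffPercent + " %");
    }
}
```

The two runs differ by 182 seconds out of roughly 36,000, which is where the figure of about 0.5% comes from.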

My conclusion: they behave more or less the same, but the overriding approach could be faster in the long term, as guessed by Kane in the answers.

H-Man2
  • Method reflection is slow, but I'm not sure about method lookup compared to the if redirect. Have you tried profiling the two configurations? – Noah Oct 31 '11 at 18:47
  • You would probably want to rename one of your two reportProgress members (the method and the boolean flag) in the first version to something other than "reportProgress". – Frank Oct 31 '11 at 18:54
  • If you need some insight into how virtual calls work, check: http://www.azulsystems.com/blog/cliff/2011-04-04-fixing-the-inlining-problem – bestsss Oct 31 '11 at 19:17
  • @bestsss: THX for the link, seems to be very interesting. – H-Man2 Oct 31 '11 at 19:25
  • @sqrfv: I'm currently collecting some runtime data and will post the results here. I think a real profiler is useless here. – H-Man2 Oct 31 '11 at 19:26
  • @H-Man2, indeed, profilers are an unpleasant nut to crack, for 2 reasons: if they insert code (to profile precisely), they mess up the code layout and the inlining budget = bad. If they just periodically take stack traces, they are entirely dependent on the safepoints the JVM inserts and may miss important loops (adding safepoints in very hot loops is bad too), also uncool – bestsss Oct 31 '11 at 19:47
  • @sqrfv: I added some measured results. – H-Man2 Nov 03 '11 at 09:00
  • It's certainly a very interesting question. – Noah Nov 03 '11 at 14:05

5 Answers

5

I think this is the textbook definition of over-optimization. You're not even sure you have a performance problem. Unless you're making MILLIONS of calls across that section, it won't even show up in your hotspot reports if you profile it. If statements and method calls take on the order of nanoseconds to execute, so the difference between them amounts to saving 1-10 ns at most. For a delay to even be perceived by a human as slow it needs to be on the order of 100 milliseconds, and that's if the user is paying attention, e.g. actively clicking. If they're watching a progress bar they aren't going to notice it.

Say we wanted to see if that added even 1s extra time, and you found one of those could save 10 ns (it's probably like a savings of 1-4ns). So that would mean you'd need that section to be called 100,000,000 times in order to save 1s. And I can guarantee you if you have 100 Million calls being made you'll find 10 other areas that are more expensive than the choice of if or polymorphism there. Seems sorta silly to debate the merits of 10ns on the off chance you might save 1s doesn't it?
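The break-even arithmetic above can be written out as a short sketch. The class and method names are made up, and the per-call savings of 10 ns and 1 ns are the assumed figures from the paragraph, not measurements.

```java
class BreakEven {
    static final long NANOS_PER_SECOND = 1_000_000_000L;

    // Number of calls needed before a per-call saving adds up to the target time.
    static long callsNeeded(long targetNanos, long savedNanosPerCall) {
        return targetNanos / savedNanosPerCall;
    }

    // Total time saved, in seconds, for a given number of calls.
    static double secondsSaved(long calls, long savedNanosPerCall) {
        return (double) calls * savedNanosPerCall / NANOS_PER_SECOND;
    }

    public static void main(String[] args) {
        // Assumed saving of 10 ns per call: calls needed to save 1 s.
        System.out.println(callsNeeded(NANOS_PER_SECOND, 10));  // prints 100000000
        // Assumed saving of 1 ns per call: calls needed to save 1 s.
        System.out.println(callsNeeded(NANOS_PER_SECOND, 1));   // prints 1000000000
        // Time saved at an assumed 10 ns per call for a given call count.
        System.out.println(secondsSaved(100_000_000L, 10));     // prints 1.0
    }
}
```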

I'd be more concerned about your usage of a singleton than performance.

chubbsondubs
  • Global variables are bad no matter what you call them. Plenty of blog ink has been spilled over the problems of singletons. Say you want to run two of these operations at once and track the progress of each. Using a singleton ensures you can't do this by simply instantiating another object. When you create the object that's going to execute this, pass it a reference to the object controlling progress, and you can elegantly change it when your requirements change. – chubbsondubs Oct 31 '11 at 19:17
  • Still, even if this section is called 1 million times it only adds 10 ms; it takes 100 million calls to add 1 s. You can squeeze more performance out of it by looking at your algorithm instead of things like this. As others have said, IO will dominate the performance over the choice you're asking about. – chubbsondubs Oct 31 '11 at 19:20
  • And I was being generous with the 10ns of savings. It's probably like 1ns which means you'd need 1 Billion calls to save 1s. Stop wasting your time and look at the areas on your profiling report before you look here is my point. – chubbsondubs Oct 31 '11 at 19:28
  • I know that there are other ways to reduce the overall runtime, like cut-and-branch. I also program by the idiom "You can make a well-designed system fast, but not a fast system good." But I found this a more interesting question than "How do I search in an array". I'm really interested in the answer to my question. – H-Man2 Oct 31 '11 at 19:29
2

I wouldn't worry about this - the cost is very small, output to the screen or computation would be much slower.

Roman Goyenko
  • I know that the concrete output is much slower. That's exactly the point: the search algorithm hits the output part millions of times, but only a small number of calls (normally 30 to 40) produce any output. – H-Man2 Oct 31 '11 at 19:06
1

I would assume that method lookup is faster than evaluating if(). In fact, the version with the if also needs a method lookup. And if you really want to squeeze out every bit of performance, use private final methods in your ProgressWriter classes, as this can allow the JVM to inline the method, so there would be no method lookup and not even a method call in the machine code that is eventually compiled from the bytecode.

But, probably, they are both rather close in performance. I would suggest to test/profile, and then concentrate on the real performance issues.

Frank
1

The only way to really answer this question is to try both and profile the code under normal circumstances. There are lots of variables.

That said, if I had to guess, I would say the following:

In general, an if statement compiles down to less bytecode than a method call, but with a JIT compiler optimizing, your method call may get inlined, leaving no call overhead at all. Also, with branch prediction, the cost of the if statement is minimal.

Again, in general, using the interfaces will be faster than testing if you should report every time the loop is run. Over the long run, the cost of loading two classes, testing once, and instantiating one, is going to be less than running a particular test eleventy bajillion times. Over the long term.

Again, the better way to do this would be to profile the code on real world examples both ways, maybe even report back your results. However, I have a hard time seeing this being the performance bottleneck for your application... your time is probably better spent optimizing elsewhere if speed is a concern.
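A very rough way to try both variants side by side is sketched below. Everything in it is hypothetical (the NaiveTiming class, the Reporter interface, the stand-in workload), and a hand-rolled System.nanoTime loop like this is easily distorted by JIT warm-up and dead-code elimination, so a dedicated benchmark harness would be the more trustworthy tool. Treat it as a starting point, not as a measurement method.

```java
class NaiveTiming {
    interface Reporter {
        void report(int progress);
    }

    // Counts calls so the reporter variant can be checked for correctness.
    static final class Counting implements Reporter {
        long count;

        @Override
        public void report(int progress) {
            count++;
        }
    }

    // Written to in both loops so the work is not trivially removable.
    static long sink;

    // Variant 1: test a flag on every iteration.
    static long runWithFlag(boolean report, int iterations) {
        long reported = 0;
        for (int i = 0; i < iterations; i++) {
            sink += i;  // stand-in for the real search work
            if (report) {
                reported++;
            }
        }
        return reported;
    }

    // Variant 2: always call through the interface.
    static void runWithReporter(Reporter reporter, int iterations) {
        for (int i = 0; i < iterations; i++) {
            sink += i;  // stand-in for the real search work
            reporter.report(i);
        }
    }

    public static void main(String[] args) {
        int n = 50_000_000;
        Reporter noOp = progress -> { };
        // Several rounds, so later rounds run after the JIT has warmed up.
        for (int round = 0; round < 5; round++) {
            long t0 = System.nanoTime();
            runWithFlag(false, n);
            long t1 = System.nanoTime();
            runWithReporter(noOp, n);
            long t2 = System.nanoTime();
            System.out.println("flag: " + (t1 - t0) / 1_000_000 + " ms, reporter: "
                    + (t2 - t1) / 1_000_000 + " ms");
        }
    }
}
```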

Kane
  • Profiling polymorphism is truly hard. It gets optimized depending on the number of implementations available: 1 is super fast and well inlined; 2 is the bimorphic special case, also fast; 3..6 go with inline caches, if possible; beyond that it's a true virtual call that can slow things down, prevent inlining, etc. – bestsss Oct 31 '11 at 18:59
  • I never said it was easy. And you're right, there are a lot of variables. That being said, the only way to really know is to check. – Kane Oct 31 '11 at 19:08
1

Putting anything on the monitor is orders of magnitude slower than either choice. If you really have a performance problem there (which I doubt), you'd need to reduce the number of calls to print.

Patrick
  • I know that the concrete output is much slower. That's exactly the point: the search algorithm hits the output part millions of times, but only a small number of calls (normally 30 to 40) produce any output. – H-Man2 Oct 31 '11 at 19:06