1

If I were to compile the following code with javac, would the compiler be clever enough to only evaluate codeInput.length() once, or would I potentially get a faster program by putting the evaluations affected by the iterator variable first?

// edit: The first character must be an upper case letter, the next four must be 
// edit: numeric and there must be exactly five characters in total in a product
// edit: code.

for (int i = 1; i < 5; i++)
    if (codeInput.length() != 5 || codeInput.charAt(i) < '0' 
        || codeInput.charAt(i) > '9' || codeInput.charAt(0) < 'A'
        || codeInput.charAt(0) > 'Z')
        {if (verbose) System.out.println("Sorry, I don't understand! 
            Use product codes only."); return true;}
Eamon Moloney
  • 1,853
  • 4
  • 19
  • 22
  • 1
    I am new to Java and programming in general and more curious as to the general mechanics than actually optimising babby's first shopping cart program. Your comment is non-constructive and I have flagged it as such: please do us all a favour and refrain from commenting on questions which you have no interest in or otherwise have little to contribute to. Thank you. – Eamon Moloney Jan 02 '13 at 09:37
  • @user1774214 Could you link to his answer? I don't see it. Also, how do you mark an answer as "unconstructive"? All I have is options for moderator attention, spam, and unwelcome-ness. – Waleed Khan Jan 02 '13 at 09:41
  • @user1774214, Hi comment has point. You do not express the background in your question. Therefore you have been given answer (comment) that this type of minor optimization is irrelevant in Java. And if you wold like to know why then write a proper question. And please do not be rude, simple thank you would be enaught. – Damian Leszczyński - Vash Jan 02 '13 at 09:43
  • @Gimby I'm sorry for getting angry, I just want to get into good habits, whether it be the habit of structuring my code properly (big picture) or small choices that I can worry about once then just do from then on. Again, I take back my harsh sentiments – I am honestly sorry. – Eamon Moloney Jan 02 '13 at 09:54
  • @Gimby - I don't think this is such a poor question. I appreciate the comments re. machine power and JIT capabilities, but it's useful/instructive to understand what's going on – Brian Agnew Jan 02 '13 at 09:54

6 Answers6

3

As a rule; you should write the simplest and clearest code you can and this will perform well. Often question of performance come up with making the code clearer is far more important. In your example I would write it this way.

static final Pattern CODE = Pattern.compile("[A-Z][0-9]{4}");

if (!CODE.matcher(codeInput).matches()) {
    if (verbose) 
        System.out.println("Sorry, I don't understand! Use product codes only.");
    return true;
}

Even if this is slightly slower, it is clearer and easier to maintain IMHO


The javac makes next to no optimisation, they are almost all performed by the JIT at runtime.

If I were to compile the following code with javac, would the compiler be clever enough to only evaluate codeInput.length() once,

No, but,

or would I potentially get a faster program by putting the evaluations affected by the iterator variable first?

It's unlike you will notice the difference as the JIT will kick in if the code is run long enough and the method will be inlined, removing the need for a local variable.

Peter Lawrey
  • 525,659
  • 79
  • 751
  • 1,130
  • +1. What I like about this is that it readily allows moving the pattern itself out of the source and into a configurable location (such as a resource bundle, property file, or database). I would even go as far as making a function to return the value of the regex; isolating knowledge of where the pattern is obtained. – Dave Jarvis Jan 03 '13 at 18:43
2

Your first target should be simplicity and readability of code. For a problem as simple as the one you present, you'd really have to go out of your way to turn it into a performance bottleneck. The cleanest, and most flexible, approach to string validation is regular expressions:

boolean valid = codeInput.matches("[A-Z][0-9]{4}");
if (!valid && verbose) 
   System.out.println("Sorry, I don't understand! Use product codes only.");
return !valid;

If you really want the last bit of performance from this, you can pre-compile the regex:

static final Pattern productCodeRegex = Pattern.compile("[A-Z][0-9]{4}");
boolean validate(String codeInput) {
  boolean valid = productCodeRegex.matcher(codeInput).matches();
  ...
}
Marko Topolnik
  • 195,646
  • 29
  • 319
  • 436
  • That logic is not the same as what I posted originally. – Eamon Moloney Jan 02 '13 at 10:04
  • Please point out the difference; I don't see it. Is it a fundamental difference, or just a touch-up that's missing? – Marko Topolnik Jan 02 '13 at 10:05
  • The first character must be an uppercase letter and the following four must be numbers. If I am reading that correctly, it describes five non-specific, uppercase alphanumeric characters. . It needs some way to split the string up; I have used charAt(). – Eamon Moloney Jan 02 '13 at 10:07
  • Yes, I see now. I've corrected the regex. This BTW serves as an excellent example of what I mean by readability and simplicity. – Marko Topolnik Jan 02 '13 at 10:12
1

Yes it will evaluate codeInput.length() in each iteration. Compiler may be able to make optimization if codeInput is final or accessible only to one thread at a time. Otherwise codeInput can be changed by another thread which current thread wont notice.

So it is the programmer decision to shift codeInput.length() outside the loop and take advantage of the the increased performance.

Subin Sebastian
  • 10,870
  • 3
  • 37
  • 42
1

The compiler has to create code that evaluates codeInput.length() on every iteration because it cannot know if it will return the same value on every iteration. You know it because you how the class works. But the compiler doesn't know it and has to assume that codeInput changes (e.g. due to the call of charAt()).

So yes: You can gain some performance by evaluating codeInput.length() once and assigning the result to a local variable.

Codo
  • 75,595
  • 17
  • 168
  • 206
1

It is not the compiler which will be clever here but the JVM.

Modern JVMs all have JITs (Just In Time), which means the code will be optimized on the go.

Do not bother trying to optimize your code in Java, just write correct code. The JVM will do the rest for you.

As for your particular piece of code, if you are using Guava, you can use a CharMatcher to good effect here:

private static final CharMatcher PRODUCT_CHARS
    = CharMatcher.inRange('A', 'Z')
        .or(CharMatcher.inRange('a', 'z'))
        .or(CharMatcher.inRange('0', '9'))
        .precompute();

// In the verification method:
if (!(codeInput.length() == 5 && PRODUCT_CHARS.matchesAllOf(codeInput)))
    badProductCode();
fge
  • 119,121
  • 33
  • 254
  • 329
0

Marko's answer is clearly among the best for the programming endeavor in this question.

However, using this as a general example from which one can extrapolate to contexts where there is no such better solution (as regexp or CharMatcher), as a matter of style and practice, I use and recommend simpler expressions, capturing what would otherwise be repeated method call results using local variables.

I assert that this improves clarity as we can name the local variables, and allows us to locate each simpler piece of code in the most logical place for execution (while also allowing the JVM to optimize one thing it is really good at, namely local variable usage).

You'll notice here in this transformed version, that we test firstChar outside the loop, instead of repeatedly inside the loop; same with .length() . I would first argue that this is logically more correct programming (it is confusing to other readers as to why this might be done over and over in the loop), and only second that it will have better performance.

While such code motion of the loop will not materially effect performance of this simple example (with a mere count of 5 iterations, and where there are better solutions using libraries that loop for us), in general and in other contexts, I recommend this as best practice, more readable, and cleaner (while also being performance oriented).

Also note, I test length first, so that .charAt(0) is known to exist, ie the length is > 0, meaning that we will reportError() on zero length string, instead of throwing some IndexOutOfBounds. This is another example of how using simpler expressions allows superior ordering of the programming logic.

This KIS (keep-it-simple) style also makes debugging easier as these local variables can trivially be watched. I think it is more readable for maintainers as well.

Also, confining the constant 5 to one place eases maintenance.

public static final int expectedLength = 5;

...

if ( codeInput.length() != expectedLength )
    return reportError ();

char firstChar = codeInput.charAt(0);
if ( firstChar < 'A' || firstChar > 'Z' )
    return reportError (); 

for (int i = 1; i < expectedLength; i++) {
    char nextChar = codeInput.charAt(i);
    if ( nextChar < '0' || nextChar > '9' )
        return reportError ();
}

...

private static boolean reportError () {
    if (verbose) 
        System.out.println("Sorry, I don't understand!\nUse product codes only."); 
    return true;
}
Erik Eidt
  • 23,049
  • 2
  • 29
  • 53