0

New to programming, I've held the (self imposed) notion that the less lines (of code), the better. Thus, when programming something, rather than having separate variables for pieces, I've been nesting all that I can into a line. For example:

preg_match('~><a href=~', substr(file_get_contents($match[1])), strpos($match[1], "help")), $match_rating)

Would it be more professional/generally-preferable to break the line above into separate variable chunks, so that it looks more like:

 preg_match($regExp, $bigString, $matches)

..With each variable/piece being defined above with its own line (and variable)?

I've wondered if this isn't really better or more efficient, since it seems like it would make it difficult for someone else to read and decipher it. I realize it's probably a personal preference type deal, but is there a generally held (professional) standard of which side to lean toward?


Attila
  • 28,265
  • 3
  • 46
  • 55
Coldblackice
  • 3,450
  • 4
  • 21
  • 22

3 Answers3

3

Your first and foremost goal should be producing readable code that statest your intent clearly (using common idioms).

Your second goal should be to ensure your program works as intended (keeping goal 1 in mind).

If you fulfilled the first to goals and you find that there is a performance/memory problem, then you need to think how you can make the code more efficient. If you get to this part the first step is to measure where the problem is; then fix that one part (possibly deviating from goal 1 above) -- rinse-and-repeat if there are still performance/memory issues.

In your particular example it is often beneficial to store partial expressions in variables (e.g. you can inspect them readily in a debugger or print them out when debugging). If you have a decent compiler, this should not impact efficiency much (the compiler can optimize your code if needed).

Also, it is preferable to keep lines short, so they are easy to read. Your first code sample should -- at least -- be broken into multiple lines (even if keeping the one expression)

If you use variables for storing partial expression results, choose names that indicate the role of that variable -- if there are multiple reg.ex's you could be using, chose a name that indicates which one that is. This way you will help your readers understand your intent better

Note: ordering of 1) and 2) is taste dependent, but if you go for "correctness" disregarding "readability", you will find yourself having a hard-to-understand, thus hard-to-debug and hard-to-maintain code

Attila
  • 28,265
  • 3
  • 46
  • 55
  • Well said. I started out in an era when code size, generally meaning source code size, was a primary concern. These days compilers are much, much better at optimizing code than we would probably be as programmers, so I now preach that source code should be readable and maintainable first and foremost, and one should let the compiler do the optimization. As a side note, with the compiler technologies available today, it should not be necessary for the programmer to know a great deal about how the compiler works (although it certainly helps one's understanding). – strings42 Jun 20 '12 at 19:25
1

I think the most important (in most cases) is that your code is readable and mantainable. So if you think that everybody is going to understand well what you're doing, without any problems, then let it the way it is.

I personally use variables only when I need to reuse them somewhere else, or when the expressions are very large and it becomes difficult to read without them. They also can act as "labels", to help to understand what the block of code is, or does (so comments are not necessary).

But creating a variable for every little thing is not good, it adds unnecessarily lines to your code. In your certain case I would not use them.

About efficiency... creating variables itself is not very CPU consuming, in most cases you are just creating references to an already existing objects in memory, and a reference doesn't occupy a lot of space (correct me somebody if I'm wrong...).

User
  • 31,811
  • 40
  • 131
  • 232
0

I disagree that less lines of code is better. If someone unfamiliar with the your problem domain was to look at your code, would they be able to understand it without having to dive on a bunch of different shortcut functions first?

If later on you need to do optimizations, it will be a lot clearer where 'the work' is being done as well. Loop dependency is not easily optimized by the compiler, but it is easy to see if you have written functions that work on arrays versus arrays of structures/classes that hold the values, though obviously working directly with arrays you're going to have more code than if you have obscured the computation via pointer misdirects or something like that.

As for your specific example, I think having variable declarations to hold values is perfectly acceptable, so long as I could go into preg_match and tell exactly what was being done with the values passed to the function. You would especially want to do the 2nd way if you were making repeated calls to preg_match with changing values, so you do not have to have an initialization happening for every function call.

Derek
  • 11,715
  • 32
  • 127
  • 228