How bad should the code be?

Original author: Eric Lippert
  • Translation
Eric Lippert is a Microsoft veteran who has worked at the company for 16 years and is behind the development of VBScript, JScript, and C#.

Last week, in the comments on one of the articles, a debate broke out about the role of low-level optimization in programming, and I remembered Eric's article on this. It was written at the end of 2003, and although the realities have changed somewhat since then, the principles have remained the same. You can mentally replace ASP and VBScript with PHP, JavaScript, or another scripting language of your choice.

I already tried to translate this article in 2005, but the Russian text turned out awkward, so this translation is new and has not been published before, in accordance with UFO requirements. The existing translation of Eric Lippert’s blog does not include this text either - it is probably too old for them.

I have already written a lot about script performance, but so far I have not said that I consider much of the advice on optimizing scripts to be silly at best and downright harmful at worst.

For example, in seven years at Microsoft I have received dozens of questions that are essentially the same as this one, asked in the late 1990s:
We have VBScript code, and in one frequently called function we declare several variables with the Dim statement that are not used anywhere in the function. Does declaring these variables slow down every call of the function?
What an interesting question! In a compiled language such as C, declaring local variables with a total size of n bytes only subtracts n from the stack pointer on function entry. If n is a little larger or a little smaller, the time spent on the subtraction does not change. Surely it is the same in VBScript? It turns out it is not! Here is what I wrote to the author of the question:

Useless analysis No. 1

You declared a variable, so you get a variable. How would VBScript know whether the function is going to execute something like this:
Function foo()
    Dim bar
    ' bar is named only inside a string that is evaluated at run time
    Execute("bar = 123")
End Function

For such code to execute correctly, the VBScript engine has to keep a list of the names of all declared variables at run time. As a result, declaring each extra variable takes time on every function call.
Okay, but how much time does each variable cost? It so happened that my computer was set up for profiling that day, so I could measure the time accurately:
On my machine, every extra variable slows every function call by 50 nanoseconds. The total slowdown grows linearly with the number of unnecessary variables, although I did not check cases with thousands of unused variables, considering them unrealistic. I also did not check cases with very long variable names: although VBScript restricts variable names to 256 characters, declaring long names probably takes more time than declaring short ones.

My machine is a 927 MHz Pentium III, so the delay is about 50 clock cycles (I cannot measure the number of cycles more precisely right now).

Thus, if you have, for example, five unused variables in a frequently called function, then every four million calls will slow the program down by a full second - and that is if the program runs on a machine as powerful as mine. On a weaker machine, the slowdown will be even greater.
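The arithmetic behind that estimate can be checked with a short sketch. Python is used here only as a stand-in scripting language, the function name is made up for illustration, and the 50 ns default is simply the figure measured above.

```python
NS_PER_SECOND = 1_000_000_000

def calls_until_one_second_lost(unused_vars, cost_ns_per_var=50):
    """Number of calls after which the unused declarations add up to one second."""
    overhead_ns_per_call = unused_vars * cost_ns_per_var
    return NS_PER_SECOND // overhead_ns_per_call

# Five unused variables at 50 ns each cost 250 ns per call.
print(calls_until_one_second_lost(5))  # 4000000
```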

You do not mention whether this script is executed on the client or server. This is crucial for performance analysis!

The delay in declaring a variable comes from allocating memory on the heap, so on a server it can grow non-linearly, depending on how heavily other threads are loading the heap. I measured only the direct processor time spent on declaring a variable; on a heavily loaded 8-processor server, alongside other threads that use the heap intensively, the total time spent on declaring a variable will most likely be very different from the figures given here.

And now I can tell you that the whole performance analysis above is not worth a penny, because it obscures a completely different problem. We are not noticing the elephant in the middle of the room. There are two reasons why a user might ask about the effect of specific VBScript language constructs on program performance:
  1. This user is interested in the development and design of programming languages, and wants to exchange experience;
    or, much more likely,
  2. This user is trying to optimize his program so that it runs faster. The speed of his program is extremely important to him.

That's it! Now it is clear why my research makes no sense. If speed is so important to the user, why is he writing in a late-bound, weakly typed language with interpreted, unoptimized bytecode - a scripting language specifically designed to favor speed of development over the speed of the finished code?

Useless analysis No. 2

If you want the script to run faster, you should definitely not start by removing 50-nanosecond delays. The main thing in optimization is to find the most time-consuming operation and optimize that. For example, one call to a function that uses an undeclared variable will be hundreds of times more expensive than declaring an unused variable. One call to an external object will be thousands of times more expensive. Doing otherwise is like cutting the lawn with nail scissors: you will spend a lot of time and achieve no visible result. This is precisely the difference between “activity” and “productivity”. Work productively!
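Finding the most expensive operation is a matter of measuring, not guessing. A minimal sketch follows, with Python standing in for the scripting language; the step names are hypothetical, and `time.sleep` only imitates a call to an external object.

```python
import time

def time_steps(steps):
    """Run each named step once and return {name: elapsed_seconds}."""
    timings = {}
    for name, step in steps.items():
        start = time.perf_counter()
        step()
        timings[name] = time.perf_counter() - start
    return timings

def slowest_step(timings):
    """Return the name of the step that took the most time."""
    return max(timings, key=timings.get)

# Hypothetical workload: sleep stands in for a call to an external object.
steps = {
    "declare_variables": lambda: [None] * 5,
    "call_external_object": lambda: time.sleep(0.05),
}
timings = time_steps(steps)
print(slowest_step(timings))
```

Whatever step such a measurement names is the one worth attention; the rest is nail-scissors work.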

But it would be even better to throw away the entire script and write the program again in C, if its speed is so important.

Let me add a caveat here: yes, script speed is important, and we spent a hell of a lot of time and effort making our engine as fast as it can be for a scripting language with late binding, weak typing, and dynamic code generation. It is hardly possible to make VBScript faster without turning it into a completely different language, short of rewriting the entire engine from scratch. So the conclusion is not that “VBScript is a bad, slow language” but that the tool must be chosen to fit the task.

But wait, there is another elephant in this room, so my second analysis is as meaningless as the first. For a meaningful performance analysis we are missing one parameter - the most important one:

How bad should the code be?

Hand on heart, I think the previous two “analyses” are not just useless - they spoil programmers.

You, like me, have probably seen plenty of advice like “to check whether a number is even, it is better to use And 1 than Mod 2, because the processor executes the And instruction faster than division” - as if VBScript operations were compiled into machine instructions. People who choose operators on the basis of such nonsense end up writing unmaintainable, incorrect code. Their programs will not work correctly, and nothing is worse than an incorrect program, whatever its speed.
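The point is easy to see in any interpreted language. In the sketch below (Python as a stand-in, with illustrative function names), both spellings give the same answer, so the choice between them is a question of readability; any speed difference belongs to the interpreter and can only be measured, not deduced from processor instruction timings.

```python
import timeit

def is_even_mod(n):
    """Parity check written the obvious way."""
    return n % 2 == 0

def is_even_and(n):
    """Parity check written with a bitwise AND."""
    return (n & 1) == 0

# Both versions agree, including for negative integers.
assert all(is_even_mod(n) == is_even_and(n) for n in range(-1000, 1000))

# Timings depend on the interpreter and machine, so no result is claimed here.
mod_seconds = timeit.timeit(lambda: is_even_mod(12345), number=100_000)
and_seconds = timeit.timeit(lambda: is_even_and(12345), number=100_000)
print(f"mod: {mod_seconds:.4f} s, and: {and_seconds:.4f} s")
```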

If you want your code to run fast - in a scripting language or in any other - then ignore all such tips about the speed of operators, along with all the “analyses” of the cost of declaring a variable. Fast code does not come from “little tricks”: you need to analyze the users' tasks, establish requirements for the program, then carefully measure the program's performance and make systematic changes until the requirements are met.
  1. Focus on user tasks and set clear performance requirements.
  2. Formulate these requirements formally. What exactly is critical? The number of tasks completed per second? The delay before output starts? The time until output completes? Scalability?
  3. Measure the performance of the entire system, not the individual parts.
  4. Measure performance after any changes.
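The steps above can be turned into a check that is rerun after every change. This is a minimal sketch in Python as a stand-in scripting language: `handle_request` and the 200 ms budget are hypothetical placeholders for a real end-to-end operation and a real requirement.

```python
import statistics
import time

def measure_latency(operation, runs=20):
    """Return the median wall-clock time of `operation` in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)

def meets_requirement(operation, budget_seconds, runs=20):
    """True if the median latency fits within the stated budget."""
    return measure_latency(operation, runs) <= budget_seconds

def handle_request():
    # Hypothetical end-to-end operation; sleep stands in for real work.
    time.sleep(0.01)

print(meets_requirement(handle_request, budget_seconds=0.2))
```

The useful property is that the budget is written down: once the check passes, optimization can stop.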

I know that people are used to optimizing scripts in a completely different way. The common idea of optimization, formed probably back in the PDP-11 era, is that you need to polish individual lines of code, squeezing the optimal sequence of machine instructions out of each one. No, web scripts cannot be optimized this way - this is not C, where you can predict the number of cycles each statement consumes. Look at the program as a whole and optimize the specific blocks that take the most time - otherwise your “optimization” will lead nowhere.

But for this you need to know the performance requirements. Find out what exactly is important to your users. Client applications must be responsive: processing the data inside the application can take five minutes or an hour, but button presses must be handled within 200 ms, otherwise the application will appear to have frozen. Web applications have to be much faster: the difference between 25 ms per request and 50 ms per request is 20 requests per second. But will the user notice the difference if a 10-kilobyte page opens 25 ms sooner? A user with a 14 Kbps modem certainly will not.
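Both figures in that paragraph can be worked out explicitly. The sketch below (Python as a stand-in; function names are illustrative) assumes 1 kilobyte = 1000 bytes and 14 Kbps = 14,000 bits per second, and ignores protocol overhead and latency.

```python
def requests_per_second(ms_per_request):
    """Requests one worker can serve per second at a given cost per request."""
    return 1000 / ms_per_request

def transfer_seconds(page_bytes, link_bits_per_second):
    """Time to push a page through a link, ignoring overhead and latency."""
    return page_bytes * 8 / link_bits_per_second

throughput_gain = requests_per_second(25) - requests_per_second(50)
download = transfer_seconds(10_000, 14_000)
saving_share = 0.025 / download  # the 25 ms saved, as a share of the download

print(throughput_gain)         # 20.0
print(round(download, 1))      # 5.7
print(f"{saving_share:.1%}")   # 0.4%
```

Under these assumptions the server gains 20 requests per second, while the modem user waits almost six seconds either way.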

When you have performance requirements, measure current performance constantly to understand how close you are to the goal. You would not believe how many people have asked me for advice on speeding up their code but could not answer the question of how much it needed to be sped up. If you do not know how fast your code should be, you can go on optimizing it until the end of the world.

It may well turn out that a scripting language suits your requirements; then remember that a script is glue, and almost all of the time a typical script spends goes not to executing its own code but to calling methods of external objects. If you have to optimize a script, and you are sure it should remain a script, look for the major delays. Processing data in a script is bad, and overly clever code is even worse. Do not pay attention to variable declarations; pay attention to external calls: each call you save is worth tens of thousands of micro-optimizations.

And finally: correct code is better than fast code. Write code that is as simple as possible. Meaningful code is easier to understand and easier to maintain. Let us return to our first example with the “superfluous Dim”: it does not matter that every superfluous Dim costs 50 ns. But an extra variable is garbage in the code. It will puzzle and confuse the programmer who has to maintain this code. That is why extra variables should be removed.

Your attitude towards low-level optimization:

  • 3.6% (91 votes): I write in a low-level language and count every byte and every clock cycle
  • 9.4% (233 votes): I write all my code in a high-level language with an eye to the machine code each construct compiles to
  • 22.7% (559 votes): I check the resulting machine code only occasionally, and only in selected problem areas
  • 61.3% (1510 votes): I do not care about machine code at all; the executable is a “black box” to me
  • 2.7% (67 votes): I do not write code in a programming language; I write specifications, flowcharts, UML diagrams, etc.