
I was reading up on branch prediction and cache misses, and decided to read *What Every Programmer Should Know About Memory*. But I'm not sure how much low-level knowledge I need to have to write performant code.

One classic example is accessing memory in the wrong order: traversing a row-major matrix column by column causes a cache miss on almost every lookup.

I ran a benchmark on a 10,000 × 10,000 matrix; traversing it in the correct order takes 97 ms, while the wrong order takes 1535 ms.

It's evident that accessing memory in the right order is much faster, but by how much? In this example it's only faster by about 1.44 s, so a program written by a programmer without such memory knowledge is probably only penalized by up to a few seconds.
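
For reference, here is a minimal sketch of the kind of benchmark I mean (not my exact code; the element type, timing method, and compiler flags are just placeholders for illustration):

```c
/* Row-major vs. column-major traversal of a 10,000 x 10,000 matrix.
 * Build with something like: gcc -O2 traversal.c -o traversal
 * (the matrix needs roughly 400 MB of RAM with int elements). */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N 10000

int main(void)
{
    /* One flat row-major allocation: element (i, j) lives at a[i * N + j]. */
    int *a = malloc((size_t)N * N * sizeof *a);
    if (!a) return 1;
    for (size_t k = 0; k < (size_t)N * N; k++)
        a[k] = 1;

    long long sum = 0;
    clock_t t0 = clock();
    /* Correct order: consecutive accesses are adjacent in memory,
     * so each cache line that is fetched gets fully used. */
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            sum += a[i * N + j];
    clock_t t1 = clock();

    /* Wrong order: consecutive accesses are N * sizeof(int) bytes apart,
     * so nearly every access pulls in a new cache line. */
    for (size_t j = 0; j < N; j++)
        for (size_t i = 0; i < N; i++)
            sum += a[i * N + j];
    clock_t t2 = clock();

    printf("row-major:    %.0f ms\n", 1000.0 * (t1 - t0) / CLOCKS_PER_SEC);
    printf("column-major: %.0f ms\n", 1000.0 * (t2 - t1) / CLOCKS_PER_SEC);
    printf("(sum = %lld)\n", sum); /* keeps the loops from being optimised away */
    free(a);
    return 0;
}
```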

I have 2 questions.

  • How much low level knowledge does a competent programmer need to have?
  • How much time saved through optimization is considered worthwhile? At least a minute shaved? Half a minute? Or any amount of time saved?

Some people think that it's totally not worth it to optimize just to save a few seconds or even milliseconds. But what do you all think?

  • Does your writing the optimized code cost your employer more money than the speed increase saves? That's really the only valid measure of what is worthwhile or not. True, this estimate is often hard to make, but it's the only correct one. Every hard-and-fast rule that involves quantities such as "half a minute" cannot possibly be correct. – Kilian Foth Nov 23 '15 at 11:00
  • In addition to Kilian's point, it also depends on what type of system you are developing. Lately I have been working with backend developers whose task is to build a backend service for multiple partners, and this service has to be fast, so saving milliseconds is all we care about. If your aim is to make the application as fast as possible, saving time is everything; but if you are more focused on the result and just need to get it done, you are likely to accept some drawbacks. – Andy Nov 23 '15 at 11:06
  • It depends. Really can't answer this. – Pieter B Nov 23 '15 at 11:07
  • Possible duplicate of [Is it always wrong to optimize before profiling?](http://programmers.stackexchange.com/questions/63986/is-it-always-wrong-to-optimize-before-profiling) – gnat Nov 23 '15 at 11:32
  • @gnat It's actually quite a different question. That question asks whether it's always wrong to optimize first, while I want to know how much low-level knowledge I should have. People these days are always saying that modern hardware is very fast and that we have compilers to optimize for us, so we don't ever need such low-level knowledge. – Lee Yik Jiun Nov 23 '15 at 12:52
  • @DavidPacker I'm also a backend person and am very much interested in finding out how fast the code can be. There are days when I wonder whether a half-second improvement is really worth all the effort (because not many people care about performance these days). Seeing that all your team cares about is saving milliseconds makes me feel that this craft is still appreciated. I feel much better now. – Lee Yik Jiun Nov 23 '15 at 13:04
  • @LeeYikJiun The problem is that you're asking a very general question. What is a "competent programmer"? I know guys who make millions while programming but will give you a glazed stare when you start on about cache misses. Also, there is no general answer to when it's worth it. In a Windows environment, a button press that takes 1 second is waaaaaaaay too long, while when I'm building a report people don't care about 10 or 20 seconds more. I've even had instances where I had to "slow down" the process because my customers didn't believe the computer could be that fast. – Pieter B Nov 23 '15 at 13:22
  • 1. When it is for your private entertainment, every microsecond counts. 2. If it is for business, then it's worth it if the money saved by running the code faster, or the money earned by selling more units of the faster application, outweighs the cost of the optimisation. – gnasher729 Nov 23 '15 at 14:17
  • @LeeYikJiun: _Everybody_ cares if it isn't fast enough. People only don't care if you change it from "fast enough" to "much faster than fast enough", but they care if you change it from "much too slow" to "too slow" or from "too slow" to "fast enough". – gnasher729 Nov 23 '15 at 14:19
  • @LeeYikJiun: "Hardware is fast and compilers optimise". That gives you a constant factor which may or may not be enough. If it's not enough, well, in my personal experience I very rarely had to optimise, but turning awfully bad code into reasonable code often helped a lot (like when you encounter a loop that opens n^3 files as has happened to me, and between n = 20 to n = 30 it turned from slow to unacceptable). – gnasher729 Nov 23 '15 at 14:22
  • It Depends On What You Are Doing. If you are doing the typical random web page garbage, no, it probably doesn't matter if it takes one second or two. If you are doing real-time image processing, where you have megapixel frames of video coming at you 30 or 60 times every second, you better believe accessing the DRAM in the correct order makes a difference. – John R. Strohm Nov 23 '15 at 14:32
  • A programmer's time costs money, and the customer's computer/hardware costs money. If a piece of software is run an average of 1000 times per user and has 1 million users, and 1 second of CPU time costs users an average of $0.000001; then making the software run 1 second faster would save users $1000. If a programmer costs $100 per hour then the "1 second faster" optimisation is worthwhile if it can be done in less than 10 hours. – Brendan Nov 23 '15 at 18:19
  • First, does it bother anyone, like a user or you? If not, you don't have a problem, so don't fix it. Second, don't look at absolutes, like minutes or seconds. Look at fractions, like 1%, 10%, or 90%. Saving a minute is worthless if it is spread out over a day, because it is less than 0.1%, and there are certain to be bigger savings elsewhere. – Mike Dunlavey Nov 23 '15 at 19:27

1 Answer


It's not only a question of how long a piece of string is, but of what that string is used for.

Sure, spending a month to shave a second off your code's execution time is probably worthless; you could have found some slower code that was easier to optimise, or added more features instead. But sometimes you know something is slow, you need to make it faster, and you have to experiment a bit to achieve this. For example, a colleague years ago spent several months implementing a caching mechanism in our product (which was far too slow at the time). That gave a 10% increase in overall speed. Then we discovered that the MS_DTS setup we were using was incredibly inefficient, changed it to DCOM, and achieved a 100% speed increase. What was certain was that our product was far too slow, and we all needed to spend a lot of time making it faster in every way.

The other aspect is how much time is saved cumulatively: shaving a fraction of a second off a query might sound useless, but if that query is run repeatedly, and often, by every user, those fractions add up to a fairly worthwhile improvement in server performance.

So it's difficult to answer your question without knowing your specific circumstances; you just need to trade off the performance gain against the cost of development. It can be useful to assign a "cost to the user" to make this trade-off easier to calculate: saving a second on how long it takes the start menu to show might not sound like much, but if you have a billion users it might be worth the effort.
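
As a rough back-of-the-envelope version of that trade-off (echoing Brendan's comment under the question; every number below is made up purely for illustration):

```c
/* Break-even estimate: how many developer hours an optimisation is worth.
 * All figures are invented placeholders, not measurements. */
#include <stdio.h>

int main(void)
{
    double users            = 1e6;     /* people who run the software        */
    double runs_per_user    = 1000.0;  /* how often each of them runs it     */
    double seconds_saved    = 1.0;     /* speed-up per run                   */
    double value_per_second = 1e-6;    /* value of one saved second, in $    */
    double dev_rate         = 100.0;   /* programmer cost per hour, in $     */

    double total_value = users * runs_per_user * seconds_saved * value_per_second;
    /* With these numbers: 1e6 * 1000 * 1 * 1e-6 = $1000, i.e. 10 hours. */
    printf("worth spending up to %.1f developer hours\n", total_value / dev_rate);
    return 0;
}
```

If the optimisation takes less developer time than that, it pays for itself; if not, the users' seconds are cheaper than yours.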

gbjbaanb
  • Thank you for your answer. I don't really have a specific circumstance right now, and I understand that my question is quite vague and general, but I just want to gather opinions. I think a good way to think about it is not the improvement per instance but the cumulative improvement. – Lee Yik Jiun Nov 23 '15 at 12:57