29

This may seem like an odd question to some of you.

I'm a hobbyist Java programmer. I have developed several games, an AI program that creates music, another program for painting, and similar stuff. I mention this to show that I have experience in programming, but not in professional development of business applications.

I see a lot of talk on this site about performance. People often debate what would be the most efficient algorithm in C# to perform a task, or why Python is slow and Java is faster, etc.

What I'm trying to understand is: why does this matter?

There are specific areas of computing where I see why performance matters: games, where tens of thousands of computations happen every second in a constant update loop, or low-level systems that other programs rely on, such as OSes and VMs, etc.

But for the normal, typical high-level business app, why does performance matter?

I can understand why it used to matter, decades ago. Computers were much slower and had much less memory, so you had to think carefully about these things.

But today, we have so much memory to spare and computers are so fast: does it actually matter if a particular Java algorithm is O(n^2)? Will it actually make a difference for the end users of this typical business app?

When you press a GUI button in a typical business app, and behind the scenes it invokes an O(n^2) algorithm, in these days of modern computing - do you actually feel the inefficiency?

My question is split in two:

  1. In practice, today does performance matter in a typical normal business program?
  2. If it does, please give me real-world examples of places in such an application, where performance and optimizations are important.
Aviv Cohn
  • 21,190
  • 31
  • 118
  • 178
  • 6
    Related: [Why do so many developers believe performance, readability, and maintainability cannot coexist?](http://programmers.stackexchange.com/questions/111021/why-do-so-many-developers-believe-performance-readability-and-maintainability) – Péter Török Aug 18 '14 at 09:42
  • Performance matters if it's poor. – Mike Dunlavey Aug 19 '14 at 13:26

15 Answers

44

Yes. Yes, it does. Run-time speed isn't the only concern you should have, and it's not as pressing as it was in 1982, or as it still is on low-powered embedded systems, but it's always a concern, and it's important that you understand why this is so.

For one, the asymptotic complexity you mention describes a program's behaviour as its input size grows. A non-linear program that deals with 10 items can get away with doing superfluous work, but it will bite you when one day you have to deal with 1000, because it won't just seem slower, but much, much slower. And you don't know (without extensive analysis and benchmarking) whether that point will be at 100 items, at 1000 items, or not until you hit 100,000 items. It may be hard to believe, but choosing the best algorithm as a matter of course is in fact a lot easier than estimating this point for every routine and choosing your implementation depending on this estimate.
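To make this concrete, here is a minimal Java sketch (an illustration, not code from the answer): the accidentally quadratic duplicate check and the linear one take about the same effort to write, which is exactly why choosing the better complexity class as a matter of course costs nothing.

```java
import java.util.*;

public class DuplicateCheck {
    // O(n^2): seen.contains() scans the whole list for every element.
    static boolean hasDuplicatesQuadratic(List<String> items) {
        List<String> seen = new ArrayList<>();
        for (String item : items) {
            if (seen.contains(item)) return true; // linear scan, repeated n times
            seen.add(item);
        }
        return false;
    }

    // O(n): HashSet.add() is amortized constant time and reports duplicates.
    static boolean hasDuplicatesLinear(List<String> items) {
        Set<String> seen = new HashSet<>();
        for (String item : items) {
            if (!seen.add(item)) return true; // add() returns false on a duplicate
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("a", "b", "c", "b");
        System.out.println(hasDuplicatesQuadratic(data)); // true
        System.out.println(hasDuplicatesLinear(data));    // true
    }
}
```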

Also, please read up on user experience basics. There are well-researched thresholds that determine how the interaction with a program is perceived depending on its response times (10 ms, 100 ms, a few seconds, etc.). Crossing one of these thresholds will cause users to disengage from your application, and unless you are in the happy position of writing monopoly software that people have to use, disengaged users translate directly into negative business value, because disengagement leads to loss of customers.

These are only a few of the reasons why a professional programmer must know about algorithmic complexity and handle it responsibly. These days it's usually not necessary to go out of your way and write specially optimized, hard-to-read code for anything unless it has turned out to be a time-critical inner loop, but you should never, ever invoke a complexity class higher than is obviously necessary to do the job.

Kilian Foth
  • 107,706
  • 45
  • 295
  • 310
  • 2
    One other thing to keep in mind re algorithm choice: due to libraries and abstractions, a lot of algo choices have been made already for you or at least "under the hood". You should still know the implications of them on performance. *And that performance does matter*. – joshin4colours Aug 18 '14 at 15:12
40

You're right: performance in business apps is not really an important subject in the way it is discussed by most programmers. Usually, the performance-related discussions I hear from programmers have several issues:

  • They are mostly premature optimization. Usually, someone wants "the fastest way" to do an operation with no apparent reason, and ends up either making code changes which most compilers make anyway (such as replacing division by multiplication or inlining a method), or spending days making changes which gain a few microseconds at runtime.

  • They are often speculative. I'm glad to see that on Stack Overflow and Programmers.SE, profiling is mentioned frequently when the question is related to performance, but I'm also disappointed when I see two programmers who don't know what profiling is discussing performance-related changes they should make in their code. They believe the changes will make everything faster, but practically every time they will either have no visible effect or slow things down, while a profiler would have pointed them to another part of the code which can easily be optimized and which wastes 80% of the time.

  • They are focused on technical aspects only. The performance of user-oriented applications is about feeling: does the app feel fast and responsive, or does it feel slow and clunky? In this context, performance problems are usually solved much better by user experience designers: a simple animated transition may often be the difference between an app which feels terribly slow and the app which feels responsive, while both spend 600 ms doing the operation.

  • They are based on subjective elements even when they relate to technical constraints. If it's not a question of feeling fast and responsive, there should be a non-functional requirement which specifies how fast an operation should be performed on specific data, running on a specific system. In reality, it happens more like this: the manager says that he finds something slow, and then developers need to figure out what that means. Is it slow as in "it should take below 30 ms, while currently it wastes ten seconds", or slow as in "we could maybe lower the duration from ten to nine seconds"?

Early in my career as a programmer, I was working on a piece of software for a bunch of my customers. I was convinced this software was the next great thing which would bring happiness to the world, so I was obviously concerned about performance.

I had heard terms such as "profiling" and "benchmark", but I didn't know what they meant and couldn't have cared less. Moreover, I was too focused on reading a book about C, especially the chapter where optimization techniques were discussed. When I discovered that computers perform multiplication faster than division, I replaced division with multiplication anywhere I could. When I discovered that calling a method can be slow, I combined as many methods as I could, as if the previous 100-LOC methods weren't already an issue.

Sometimes, I spent nights making changes which, I was convinced, made the difference between a slow app nobody wants and a fast one everybody wants to download and use. The fact that the two actual customers who were interested in this app were requesting actual features didn't bother me: "Who would want a feature if the app is slow?", I thought.

Finally, the only two customers stopped using the app. It wasn't amazingly fast despite all my efforts, mostly because when you don't know what indexes are and your app is database-intensive, something is wrong. Anyway, whenever I made just another performance-related change, improving by a few microseconds the execution of code used once per month, customers didn't see the difference. What they did see was that the user experience was terrible, documentation was missing, crucial features they had been requesting for months were still not there, and the number of bugs to solve was constantly growing.

Result: I hoped this app would be used by thousands of companies around the world, but today you won't find any information about it on the internet. The only two customers abandoned it, and the project was abandoned as well. It was never marketed, never publicly advertised, and today I'm not even sure I could compile it on my PC (or find the original sources). This wouldn't have happened if I had focused on things that actually matter.

This being said, performance in general is important:

  • In non-business apps, it can become crucial. There is embedded software; software running on servers (when you have a few thousand requests per second, which is not that many, performance starts to be a concern); software running on smartphones; video games; software for professionals (try to handle a 50 GB file in Photoshop on a not-very-fast machine to be convinced); and even ordinary software products sold to lots of people (if Microsoft Word took twice as long to do every operation, the time lost multiplied by the number of users would become an issue).

  • In business apps, there are many cases where an application which feels (and is) slow will be hated by the users. You don't want that, so make performance one of your concerns.

Arseni Mourzenko
  • 134,780
  • 31
  • 343
  • 513
  • 4
    Great answer, especially because of putting the focus on pointless performance discussions and pointless optimizations. – Doc Brown Aug 18 '14 at 10:00
  • 1
    `a simple animated transition may often be the difference between an app which feels terribly slow and the app which feels responsive` -- although these should certainly be used sparingly, for the apps which litter animations and transitions everywhere can be frustrating if staring at those transitions on a daily basis! – Cosmic Ossifrage Aug 18 '14 at 14:42
  • what's the source of your quote? – Adam Johns Aug 18 '14 at 16:43
  • 1
    @AdamJohns: no source. It's quoted from the drafts of my own articles which are not yet published on my blog. – Arseni Mourzenko Aug 18 '14 at 17:39
  • @MainMa Oh awesome. I really enjoyed that little illustration of your point. – Adam Johns Aug 18 '14 at 18:14
14

Yes it does!

Since you asked for examples, several every-day situations come to mind:

  1. Handling big data: Many business applications are backed by databases, and in many instances these databases overflow with data. And since drive space is cheap, the amounts of recorded and stored data are insane. Just last week a customer complained that his application is sooo slow when just displaying some average numbers (queries over a few million rows...). Also, in everyday use we have batch data conversions and calculations with runtimes in the league of several hours. Last year an algorithmic optimization brought the process time of one batch run down from 8 to 4 hours; now it doesn't collide with the day shift anymore! (A sketch of this kind of change follows the list.)

  2. Responsiveness: There have been usability studies (if I have time I will add links to the relevant questions on ux.se...) showing that user satisfaction is heavily related to responsiveness. A difference in response time of 200 ms vs. 400 ms can easily cost you a big percentage of your customers, who leave for your competitors.

  3. Embedded systems: Computers are not getting faster, they are getting slower and smaller ^_^ Mobile development has a huge impact on application development. Sure, we can throw around memory and CPU cycles like jelly beans on modern desktop computers, but now your boss asks you to implement the sleep-analyzing algorithm on a freakin' watch or on a SIM card...
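To make point 1 concrete, here is a minimal Java sketch of the kind of algorithmic change that can halve such a batch run (the `Order` and `Customer` types and the matching task are hypothetical stand-ins, not the actual job): replacing a nested-loop match with a one-time hash index turns O(n*m) into O(n+m).

```java
import java.util.*;

// Hypothetical record types standing in for whatever the batch job processes.
record Customer(String id, String region) {}
record Order(String customerId, double amount) {}

public class BatchJoin {
    // O(n*m): for every order, scan all customers. Fine for thousands of rows,
    // hopeless for millions.
    static double totalForRegionSlow(List<Order> orders, List<Customer> customers,
                                     String region) {
        double total = 0;
        for (Order o : orders)
            for (Customer c : customers)
                if (c.id().equals(o.customerId()) && c.region().equals(region))
                    total += o.amount();
        return total;
    }

    // O(n+m): index the customers once, then look each order up in constant time.
    static double totalForRegionFast(List<Order> orders, List<Customer> customers,
                                     String region) {
        Map<String, Customer> byId = new HashMap<>();
        for (Customer c : customers)
            byId.put(c.id(), c);
        double total = 0;
        for (Order o : orders) {
            Customer c = byId.get(o.customerId());
            if (c != null && c.region().equals(region))
                total += o.amount();
        }
        return total;
    }
}
```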

Falco
  • 1,293
  • 8
  • 14
4

In practice, today does performance matter in a typical normal business program?

I don't know what a typical normal business program is. What I do know is that users always end up feeding our programs much more data than we planned for (often after asking them how big it would be and adding a safety margin), and that in those cases they expect a linear increase in run time, accept n log n behavior, and complain that the application freezes when anything else happens. And they tend to consider the size of the result more than the size of the input, except in cases where it is obvious from their POV that all input data has to be processed.

So yes, performance, at least at the complexity level, matters. Micro-optimization inside a complexity class does not really matter, except if you are visibly worse than the competition (either in benchmarks in some markets, or by raw perception: changing the class in the progression "instantaneous", "not instantaneous but the user doesn't switch to something else", "slow enough that the user switches to something else at the risk of interrupting the flow of actions", "slow enough that the user launches the task and then checks from time to time", "slow enough that the user plans to launch the task over lunch, overnight, or over the weekend").

AProgrammer
  • 10,404
  • 1
  • 30
  • 45
4

In modern business applications, the performance problems are not in the form of a lack of CPU or memory; they are in the form of network latencies, I/O performance, and the abstractions hiding all of those. It all depends on how good the design is and how experienced the developers are. Even a simple CRUD application can crawl to a halt if it pulls rows from the database one at a time instead of running a single query (also known as the N+1 problem).
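Here is a minimal JDBC sketch of the N+1 problem (the `orders` and `customers` tables are hypothetical):

```java
import java.sql.*;

public class NPlusOne {
    // N+1 queries: one for the orders, then one more round trip per order.
    static void printOrdersSlow(Connection conn) throws SQLException {
        try (Statement st = conn.createStatement();
             ResultSet orders = st.executeQuery("SELECT id, customer_id FROM orders");
             PreparedStatement byId = conn.prepareStatement(
                     "SELECT name FROM customers WHERE id = ?")) {
            while (orders.next()) {
                byId.setLong(1, orders.getLong("customer_id"));
                try (ResultSet c = byId.executeQuery()) { // extra round trip each time
                    if (c.next()) {
                        System.out.println(orders.getLong("id") + " " + c.getString("name"));
                    }
                }
            }
        }
    }

    // One round trip: let the database do the join (and use its indexes).
    static void printOrdersFast(Connection conn) throws SQLException {
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(
                     "SELECT o.id, c.name FROM orders o JOIN customers c ON c.id = o.customer_id")) {
            while (rs.next()) {
                System.out.println(rs.getLong("id") + " " + rs.getString("name"));
            }
        }
    }
}
```

The slow version issues one query per order; the fast version lets the database do the join in a single round trip, where its indexes and optimizer can help.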

The problem is that having a good design and experienced developers is expensive, and it is usually much cheaper to have irritated users than to invest in actual performance optimization. There are some cases where customers will require high performance (e.g. web browsers), but those rarely apply to common business applications.

Euphoric
  • 36,735
  • 6
  • 78
  • 110
3

Bear in mind that for server-based applications, you may have hundreds, thousands or even millions of users trying to do things at the same time. A small saving in efficiency in such a situation can have a major impact on the amount of hardware required to service requests.

Jaydee
  • 2,667
  • 1
  • 18
  • 17
  • 5
    Actually most constant factors are better solved by throwing more hardware at the problem, because more hardware is usually cheaper than more time optimizing the thing. The problem is bad asymptotic (scaling) behaviour, because throwing more hardware won't help much with that. – Jan Hudec Aug 18 '14 at 08:31
  • 3
    You only optimise once, but you pay the electricity bill every month. – Jaydee Aug 18 '14 at 08:32
  • 2
    @JanHudec: I don't quite see how you can really say that with a straight face when the very website you're currently on (our dear Stack Exchange) serves 560M page views a month across the world [runs on just 25 servers](http://highscalability.com/blog/2014/7/21/stackoverflow-update-560m-pageviews-a-month-25-servers-and-i.html). – user541686 Aug 18 '14 at 11:31
  • 2
    @Mehrdad: And they could have written it in C instead and perhaps ran it on 20 servers instead of 25. But they didn't because the saving would not outweigh the increased development time. Many web services are implemented in Python and PHP, some of the slowest languages in general use, yet nobody thinks of rewriting them in anything faster because the increased development time would not pay off. _Constant_ factors are most of the time solved by just throwing more hardware at it. Scaling (asymptotic) problems is another matter of course. – Jan Hudec Aug 18 '14 at 11:49
  • 1
    @JanHudec Writing the site in C is very different from making sure you have the correct indexes on your database. And yes some sites (eg Facebook) do rewrite sections in more efficient languages. – Jaydee Aug 18 '14 at 11:53
  • 3
    ...And to be fair, the database, which is what's doing most of the grunt work, was written and optimized to go fast. – Blrfl Aug 18 '14 at 12:55
  • @JanHudec not quite true - it's just that the up-front costs far outweigh the continued costs *in the minds of those getting paid to produce it*. So it's easier to ship something cheaply and quickly, even if it's rubbish. The user has to suffer with it, but the salesman who sold it no longer cares - he's off spending his commission. (I guess the same applies to the startup that needed to get to market first). In most cases, the crappy code that was knocked up quickly gets replaced with something well designed later and doesn't cost massively in maintenance spend... oh no, wait... – gbjbaanb Aug 18 '14 at 16:11
3

It certainly matters a lot.

The main issue is not even the annoyance for the user, such as needless delays when GUI elements are overdrawn twice or thrice (which does matter on integrated graphics!), or simply the program taking so long to do... whatever it does (mostly uninteresting stuff). Although of course, that is also an issue.

There are three important misconceptions:

  1. Most typical business computers are not "so much more powerful". The typical business computer is not an 8-core i7 with a kick-ass graphics card and 16 GB of RAM. It's a notebook with a mid-class mobile processor, integrated graphics, 2 GB of main memory (4 GB if you are lucky), a 5400 RPM disk, and an enterprise version of Windows with a variety of realtime antivirus and policy-enforcing software running in the background. Or, for most consultants, the "computer" is simply the iPhone...
  2. Most "typical business users" are not technicians, they do not understand the implications of creating a spreadsheet with 10-12 cross-referencing tabs, 150 columns, and 30,000 rows (these figures are not as unrealistic as you may assume!) and they do not want to know either. They will just do it.
  3. "A second lost doesn't cost anything" is a blatantly wrong assumption.

My wife works at the upper end of such a "typical business environment". The computer she is using costs about as much as 3.5 hours of her working time. Starting Microsoft Outlook takes, on a good day, about 3 minutes until it's ready (6-8 minutes at quarter-end when the servers are under heavy load). Some of those 30k-row spreadsheets take 2-3 seconds to update a value, during which the computer is "frozen" (not to mention how long it takes Excel to start up and open them in the first place!). It's even worse when sharing a desktop. Don't even get me started on SAP.

It certainly matters whether a hundred thousand people each lose 20-25 minutes per working day waiting for nothing. Those are millions lost which could instead be paid out as dividends (or as higher wages).

Sure, most of the employees are at the lower end of the payscale, but even at the lower end, time is money.

Damon
  • 253
  • 1
  • 3
3

I can understand why it used to matter, decades ago. Computers were much slower and had much less memory, so you had to think carefully about these things.

You seem to underestimate just how quickly N^2 grows. Let's say we have a computer and our N^2 algorithm takes 10 seconds when N = 10. Time passes, and we now have a new processor that is 6 times faster than our original, so our 10-second computation now takes less than two seconds. How much larger can N be and still fit in that original 10-second run time? We can now handle 24 items, a little over twice as many. How much faster would our system have to be to handle 10 times as many items? It would have to be 100 times faster. Data grows fast, and it more than wipes out computer hardware progress for N^2 algorithms.
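A quick back-of-the-envelope check of these numbers, assuming runtime is exactly proportional to N^2 (a simplification):

```java
// Back-of-the-envelope arithmetic for the scenario above: runtime = k * N^2.
public class QuadraticScaling {
    public static void main(String[] args) {
        double baseN = 10, baseSeconds = 10;      // N = 10 takes 10 s
        double k = baseSeconds / (baseN * baseN); // seconds per N^2 unit

        double speedup = 6;                       // the new CPU is 6x faster
        System.out.printf("N = 10 on the faster machine: %.2f s%n",
                baseSeconds / speedup);           // ~1.67 s

        // Largest N that still fits in the original 10-second budget:
        double maxN = Math.sqrt(baseSeconds * speedup / k);
        System.out.printf("Largest N in 10 s: %.0f%n", Math.floor(maxN)); // 24

        // Handling 10x the items (N = 100) needs (100/10)^2 = 100x the speed.
    }
}
```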

stonemetal
  • 3,371
  • 16
  • 17
  • Another example: If processing one element takes 30 CPU cycles or 10ns (which is quite cheap), the algorithm will already take a full second if you only have 10000 elements. 10000 elements isn't much in many contexts. – CodesInChaos Aug 18 '14 at 15:26
1

You wouldn't believe the number of 3rd-party business programs we use at work, and many of them are just ridiculously slow to use compared to my personal standards. If they were programs I used at home, I would have replaced them with alternatives a long time ago.

In some cases, the difference goes directly into costs, as some programs directly affect how many tasks I can accomplish during a day, and thus lower my productivity and the number of billable items. So I would say it is quite important for business programs, too, to be at least performant enough not to be the limiting factor for income.

An example is incident management, where the work is measured in 15-minute intervals (service desk). If the program is slow enough that handling one ticket takes more than 15 minutes (including the actual work), it will slow down the process quite a lot. One cause could be slow database access that simply "waits for a while" whenever the user does an action (filling in resolution details, updating work info, or similar). I can imagine there are cases where slow programs affect even more critical things, such as hospital patient details in urgent poisoning cases - maybe medicine allergies or such?

Juha Untinen
  • 983
  • 6
  • 14
1

Many of the other answers cover the topic quite thoroughly, so I defer to them on the reasons and rationales. Instead, I want to give a real-life example to show how an algorithmic choice can have real implications.

http://windowsitpro.com/windows-xp/svchost-and-windows-update-windows-xp-still-problem

The linked article describes a bug in the algorithm used to calculate Windows XP updates. For most of the life of Windows XP, the update algorithm worked fine. The algorithm calculates whether a patch has been superseded by a newer patch, and thus does not need to be installed. Towards the end, though, the list of superseded updates grew very long*.

The update algorithm was exponential: each new update took twice as long to evaluate as the previous one (O(2^n)). When the list got up into the 40-update range (*long), it took up to 15 minutes, running at full capacity, to check for updates. This effectively locked up many XP machines during the update. Worse yet, when one went to install the updates, the algorithm would run again, taking another 15 minutes. Since many machines had Automatic Updates enabled, this could lock up the machines for 15 minutes at every boot, and potentially again at a certain periodicity.
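The article does not publish the actual code, but a toy Java sketch (purely illustrative, not the Windows Update algorithm) shows how an unmemoized recursive supersedence check blows up exponentially as the chain grows:

```java
// Toy model: update i is superseded by updates i+1 and i+2. The naive check
// re-explores the same updates over and over, so its cost grows exponentially
// (Fibonacci-like) with the length of the supersedence chain.
public class SupersedenceDemo {
    static final int N = 40;  // roughly the list size the article mentions
    static long calls = 0;

    // "Is anything in the chain starting at u still relevant?" - naive version.
    static boolean naiveCheck(int u) {
        calls++;
        if (u >= N) return false;
        return naiveCheck(u + 1) || naiveCheck(u + 2); // no memoization
    }

    public static void main(String[] args) {
        naiveCheck(0);
        System.out.println("recursive calls: " + calls); // over 500 million
        // Memoizing the result per update cuts this to about N calls.
    }
}
```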

Microsoft used both short-term hacks (removing items from the update list) and long-term fixes to address this issue. This was important because the latest versions of Windows were also using the same algorithm, and might one day face the same problem.

Here we can see that the choice of an algorithm has real implications. The wrong algorithm, while fine for most of the life of a product, can still have a negative impact down the road.

cbojar
  • 4,211
  • 1
  • 17
  • 18
0

I think you're interpreting the number of questions asked about performance as an indication that performance requirements for business apps are important, instead of recognizing that improving performance is difficult. Just getting a program to work can be accomplished by brute-force techniques, trial and error, or copying and pasting example code.

Anyone can get lucky and keep making changes until something runs faster, but that rarely works. Due to a lack of experience, developers will seek outside help. In some environments, performance improvements are unique problems, so asking a specific question on a site like Stack Overflow may be the only option. Also, many consultants make their money by being able to step in and fix these kinds of problems.

JeffO
  • 36,816
  • 2
  • 57
  • 124
-1

It depends heavily on how you define "good performance". Your algorithms should always use the best possible complexity class. Abuse loopholes to speed up your average case. Buffer and preload/precompile wherever possible in an interactive program.

There is another definition of "good performance": optimizing the constants within your complexity class. This is where C++ gets its title, where people start to call Java slow, and where 5% less runtime seems to be the holy grail. Using this definition, you're right. Computer hardware gets more complicated over time, while compilers get better and better. At some point you can't really optimize low-level code better than the compiler, so just let it be and concentrate on your algorithms.

At that point, using Java / Haskell / C++ becomes just another design decision. Number crunching can be done via OpenCL on your GPU. User interfaces need constant-time algorithms, and then they're good. Output and modularity are more important than aligning your classes for 20% better cache utilization. Multithreading becomes important, because people don't want a fast app, they want a responsive one (a sketch of this follows below). What is not important is your app being a constant 10% slower than it could be. Even 50% is fine (but people start asking questions then). Concentrate on correctness, responsiveness and modularity.
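For the responsiveness point, here is a minimal Swing sketch (the slow `generateReport` call is a hypothetical stand-in for any long operation): the work runs off the Event Dispatch Thread, so the UI keeps painting while the user waits.

```java
import javax.swing.*;

public class ResponsiveUi {
    // Hypothetical stand-in for any slow operation (report, query, export...).
    static String generateReport() throws InterruptedException {
        Thread.sleep(3000);
        return "Report ready";
    }

    public static void main(String[] args) {
        SwingUtilities.invokeLater(() -> {
            JFrame frame = new JFrame("Demo");
            JButton button = new JButton("Generate report");
            button.addActionListener(e -> {
                button.setEnabled(false);
                // Run the slow call off the Event Dispatch Thread so the UI
                // keeps repainting; update the button when the work finishes.
                new SwingWorker<String, Void>() {
                    @Override protected String doInBackground() throws Exception {
                        return generateReport();
                    }
                    @Override protected void done() {
                        button.setEnabled(true);
                        button.setText("Report ready");
                    }
                }.execute();
            });
            frame.add(button);
            frame.setSize(300, 100);
            frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
            frame.setVisible(true);
        });
    }
}
```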

I love programming in Haskell, or at least in a functional style (even in C++). Being able to write tests easily for your entire program is just way more important than being a little faster in batch jobs.

5-to-9
  • 105
  • 2
-1

Quite simple: cost

My previous employer had a learning management system that was hosted on physical servers under a SaaS model. The JVM heap was configured to 2 GB for the older machines and 3 GB for the newer machines, and we ran several instances per machine. You'd think that would be enough.

Before I started, there was a performance team responsible for making the system responsive and able to scale. They found that there were certain pieces of data we queried from the database constantly. There was even one table we joined into most queries just to retrieve one column. That data rarely changed.

The trouble is, we had 2 GB to work with, so the obvious solution was to start caching all of the frequently read data. Then we had memory issues, starting right before I came on board.

There were 2 schools of thought on this:

  1. Memory and hardware in general is cheap these days. Just buy more RAM so you can cache more.
  2. Why does a learning management system need 3+ GB of RAM? All it does is issue SQL queries, send redirects to launch courses, and evaluate students' progress through courses.

The second argument won out and I spent over a year tuning our memory usage.
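One representative move in that kind of tuning (a generic sketch, not necessarily what was shipped; the `MAX_ENTRIES` bound is a made-up number) is to keep the cache of frequently read, rarely changed data, but bound it so it can never eat the heap:

```java
import java.util.*;
import java.util.function.Function;

// A bounded LRU cache: hot reference data stays in memory, but the cache can
// never grow past MAX_ENTRIES, so caching can't turn into a memory problem.
public class BoundedCache<K, V> {
    private static final int MAX_ENTRIES = 10_000; // hypothetical bound

    private final Map<K, V> cache = Collections.synchronizedMap(
            new LinkedHashMap<K, V>(16, 0.75f, true) { // access-order = LRU
                @Override
                protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                    return size() > MAX_ENTRIES;       // evict, don't grow
                }
            });

    // loadFromDatabase is only called on a cache miss.
    public V get(K key, Function<K, V> loadFromDatabase) {
        return cache.computeIfAbsent(key, loadFromDatabase);
    }
}
```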

My current employer also hosts a learning management system, but hosts it a bit differently. The scalability is so poor that a single installation (split across 4 load-balanced virtual servers) can only handle 80 customers. Some of our larger customers even get their own set of servers. Most of the issues leading to this are performance problems: SQL queries that hog all of the CPU cycles, memory leaks, redundant code that does the same thing multiple times. We even have an in-house app whose sole purpose is to restart servers when they are performing poorly. There is an employee who maintains that tool (along with other responsibilities).

They subscribed to the first school of thought I mentioned above: throw more hardware at it because hardware costs are cheaper than developer salaries.

This did not work out as economically as expected. Between hardware, software licensing, and support personnel to handle the servers, we spent millions every year to avoid having a developer spend time profiling code.

When I joined, I was made responsible for fixing our availability issues. Since most of those issues were due to poor performance, I have been performance-tuning our code, and the scalability has substantially improved, with much better uptime. We are ready to start increasing the density. Needless to say, my salary is nowhere near a million (I wish!), so spending money on having me performance-tune the code is going to end up saving us millions per year.

TL;DR

If you do a thorough cost/benefit analysis, you will see that it is cheaper to just fix the code. A known performance issue that you ignore turns into technical debt.

Brandon
  • 4,555
  • 19
  • 21
  • 1
    Not every cost/benefit analysis will result in "fix the code." Programmers are very expensive, and if you can add RAM for less than the cost of a programmer and still fix the problem... – Robert Harvey Aug 18 '14 at 16:54
  • It's nice that with so much moving into cloud-hosting situations, you can see how much you're actually paying for power. For instance we use Amazon RDS for database. The difference between the largest and second-largest instance is approx. $3500 per year. That's a number that you can look at and judge whether or not it's worth a lot of programmer time to optimize. – Carson63000 Aug 19 '14 at 07:53
  • @RobertHarvey True, I should not have made an absolute out of it. What I meant to say was that the absolute "hardware is cheaper than dev time" is not absolutely true, but you're right, it could sometimes be true. – Brandon Aug 20 '14 at 00:20
-2

I understood your question like this: in order to achieve good-enough performance (i.e. users are happy and my backend doesn't cringe), do I need to understand the theory of algorithmic complexity?

It depends on what you mean by "typical" business application. In many cases, especially simple CRUD-like information systems, the answer is no. For these you'll "simply" (sometimes it's actually hard) need to be able to trace the performance bottlenecks: did I miss an index in my database? Do I send too much data over the network? Do I have a thousand $watches in my angular.js front end? This is about building a sound architecture, knowing your technology stack well, and avoiding nonsense. You can achieve that if you are a good craftsman, not necessarily a computer scientist. Another way to say it: the guys who built the Oracle query optimizer dealt with the algorithmic complexity stuff; you just need to use what they provide properly.

Now there are exceptions. If we're talking about big data or machine learning, you need to know what you are doing and not just use the default algorithms available to you. Even on the front end, if you start building advanced data visualizations, some of them can imply a non-linear complexity cost (e.g. force-layout charts).

Nowadays these exceptions are becoming more and more common, and the market is pretty dry when you look for people who can handle them. So: yes, you can be successful with no computer science background, but you'll be even more successful with some.

-2

The other responders have covered most of the basic points, but for tasks which can be parallelized, inefficient software leads to increased hardware costs in the form of more servers, which use more power and require more space and maintenance.

Jordan Bentley
  • 579
  • 3
  • 9