11

I'm thinking of scientific applications that are mostly processor-bound and heavy on heap usage (at least several gigabytes). Any other time of the year I would happily go with C++, but in this case I wonder if the fragmentation natural to the C++ memory manager can be a serious issue versus the advantage of Java's compacting collectors.

Can anybody point to real-world examples related to this?

PersonalNexus
dsign
  • I don't think the language is as important as the programming in this case. Any mature language could probably work depending on the scale of your computation. There are Javas, C/C++, even Pythons and Rubies that can slide into this role. Some would be harder than others as it sounds like you really need to have a firm assurance you aren't leaking memory. – Rig Jan 14 '12 at 15:08
  • If you can get a GB for $7.99, is that a problem? [Kingston 1GB DDR3](http://www.newegg.com/Product/Product.aspx?Item=N82E16820134787&nm_mc=AFC-C8Junction&cm_mmc=AFC-C8Junction-_-Memory%20(Desktop%20Memory)-_-Kingston%20Technology%20Corp.-_-20134787&AID=10440897&PID=2806964&SID=) – Bo Persson Jan 14 '12 at 15:09
  • @BoPersson, in my experience, people who have that kind of problem start by fully populating their high-end motherboard and complain that they can't fit in as much memory as they would like; then they feed it a data set as big as is manageable and complain that it still isn't enough. – AProgrammer Jan 14 '12 at 16:14
  • @dsign, these days you can get a motherboard that accepts several hundred gigabytes of memory for less than 1000€, so a few gigabytes isn't heavy memory usage. – AProgrammer Jan 14 '12 at 16:21
  • Agree with the cheap-memory part. As for development, I have been using C++ for a while, and good coding practices make leaks rather infrequent; as a matter of fact, I really prefer C++ over Java in that regard. – dsign Jan 14 '12 at 16:52
  • What are your criteria for determining what is best? –  Jan 14 '12 at 17:07
  • Except that C++ and Java can equally well consume all the RAM you have. –  Jan 16 '12 at 18:32

3 Answers

11

If you are talking about an application that is bound to stress the limits of the machine, such that you expect to resort to programming tricks to avoid exceeding those limits, then C++ is the way to go. Not only does C++ give you room for optimization where Java does not (as Emilio pointed out), but garbage collectors are also very memory-hungry contraptions that need plenty of extra free memory in order to work efficiently.

The answers to the Stack Overflow question "How much extra memory does garbage collection require?" paint a rather grim picture. Even if garbage collectors need only about as much free memory as there is allocated memory (which is what I had heard), Java will still need a lot of free memory to run efficiently.

On the other hand, nowadays we generally prefer to buy more expensive hardware rather than resort to programming tricks to stay within the limits of the hardware. In your case, your RAM problems would typically be solved by using a 64-bit machine and throwing as many RAM modules at it as necessary. You see, the cost of hardware is nowhere near the cost of development time in the developed world nowadays.

I think that you should seriously consider this option, and if possible, go with this option and with Java instead of C++, because it is a lot easier to develop something in Java than in C++, and to keep maintaining it afterwards.

Mike Nakis
  • Thanks for your answer. I agree with your hardware illustration. – dsign Jan 14 '12 at 16:48
  • If you do not care about the program pausing while garbage collection runs, the memory needs are much smaller. –  Jan 14 '12 at 17:08
  • I don't agree with the last paragraph. Java is not necessarily easier to develop in than C++. It wouldn't be for me, since I've done lots of C++ and relatively little Java in the past five or six years. It's possible to write maintainable and unmaintainable code both in C++ and Java. – David Thornley Jan 16 '12 at 21:28
  • @DavidThornley I have done C/C++ for 10+ years, and Java for 6+ years. I find Java to be easier on all counts: prototyping, developing, extending, and maintaining. But in any case, that's what opinions are meant to do: _differ_. C-:= – Mike Nakis Jan 16 '12 at 22:04
  • Have any of you programmed for a big-data, processor-hungry project? Any comments there? – dsign Jan 17 '12 at 07:56
  • I have done a relatively big-data, very processor-hungry project both in C++ and in C#. (Which, for the purposes of this discussion, can be thought of as an equivalent of Java.) It was a crossword puzzle creator. (It would not qualify as big-data today, but it did back when I was working on it, when half a gigabyte of RAM was considered to be a lot of RAM.) I cannot say how the two versions compared with respect to RAM consumption, because I did not test for that. – Mike Nakis Jan 17 '12 at 11:21
  • But I can tell you that once I wrote it in C++, changing anything in it was such a major pain, that I refrained from ever touching it again. While once I wrote it in C#, modifications were so easy, that I was able to step by step add several levels of algorithmic optimizations to it which made it an estimated 1000 times faster. How about that. – Mike Nakis Jan 17 '12 at 11:22
  • @MikeNakis Thanks Mike. Though it is more likely that I will have rather simple code, just big data. And having been with C++ for eleven years, I sort of like it ;-) – dsign Jan 31 '12 at 14:16
  • I have programmed a very memory- and CPU-hungry project (expert system/massive DB with strict response time requirements). Using C++, some asm optimisations, and a redesign of the data model, I achieved a speedup of 5 orders of magnitude over previous attempts. As much as I like Java for some applications, that simply wouldn't have been possible with it or anything that has mandatory GC. The control over memory that constructors and destructors give you becomes *very* important in such "extreme" cases. – foo Mar 31 '14 at 18:07
7

The point is not to use C++ as if it were Java, nor Java as if it were C++. A C++ container is normally implemented to avoid excessive fragmentation, just as the Java free store is.

But if you allocate memory directly yourself, you can also do things that Java does not permit, and those can result in fragmentation.

The proper solution (in C++) is to use containers and smart pointers through allocator classes that manage allocation by means of fixed "plexes" (the key point here is writing a good allocator class). This is a programming style that has nothing to do with Java, so any comparison is meaningless.

[EDIT] This sample may be outdated: Fixed allocation
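
To make the fixed-block idea a bit more concrete, here is a minimal sketch of a pool that hands out blocks of a single size from larger chunks. The class name `FixedBlockPool`, its members, and the chunk size are hypothetical choices for illustration; this is not the linked sample or any library's API, and it assumes single-threaded use and C++14 or later.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <vector>

// Hypothetical sketch: every block has the same size, so any freed block can
// satisfy any later request and the pool itself cannot fragment.
class FixedBlockPool {
public:
    FixedBlockPool(std::size_t block_size, std::size_t blocks_per_chunk)
        : block_size_(round_up(block_size)),
          blocks_per_chunk_(blocks_per_chunk == 0 ? 1 : blocks_per_chunk) {}

    FixedBlockPool(const FixedBlockPool&) = delete;
    FixedBlockPool& operator=(const FixedBlockPool&) = delete;

    // Pop one block off the free list, carving a new chunk if the list is empty.
    void* allocate() {
        if (free_list_ == nullptr) grow();
        Node* node = free_list_;
        free_list_ = node->next;
        return node;
    }

    // Push a block back; the block's own storage holds the free-list link.
    void deallocate(void* block) noexcept {
        assert(block != nullptr);
        free_list_ = new (block) Node{free_list_};
    }

    std::size_t block_size() const noexcept { return block_size_; }
    std::size_t chunk_count() const noexcept { return chunks_.size(); }

private:
    struct Node { Node* next; };

    // Blocks must be able to hold a Node and stay suitably aligned.
    static std::size_t round_up(std::size_t n) {
        const std::size_t align = alignof(std::max_align_t);
        if (n < sizeof(Node)) n = sizeof(Node);
        return (n + align - 1) / align * align;
    }

    // Allocate one chunk and thread all of its blocks onto the free list.
    void grow() {
        chunks_.push_back(std::make_unique<char[]>(block_size_ * blocks_per_chunk_));
        char* base = chunks_.back().get();
        for (std::size_t i = 0; i < blocks_per_chunk_; ++i)
            deallocate(base + i * block_size_);
    }

    std::size_t block_size_;
    std::size_t blocks_per_chunk_;
    Node* free_list_ = nullptr;
    std::vector<std::unique_ptr<char[]>> chunks_;  // owns the raw memory
};
```

A caller would construct objects into the returned block with placement new and run the destructor before handing the block back. Wrapping such a pool in a standard-conforming allocator class for use with containers takes more boilerplate than is shown here.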

Emilio Garavaglia
  • Thanks for your answer, Emilio. I'm well aware of C++'s flexibility and feel inclined to agree with you. Then again, I would like to know some real-world usage examples where these techniques have been used successfully. – dsign Jan 14 '12 at 09:25
  • @dsign: Post edited, see the link – Emilio Garavaglia Jan 16 '12 at 18:24
  • I used these techniques previously. A poorly performing app had its allocator changed to use a set of fixed-block heaps. So when you wanted a 4-byte block, it came from the heap that only stored 4-byte blocks. If you wanted 5 bytes, it came from the 8-byte block heap, etc. The performance increase (for our heavily allocating app) was tremendous. You might not get as good a result, but it can be very effective. There was also zero fragmentation as all allocs came in fixed blocks. – gbjbaanb Jan 16 '12 at 23:01
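
The routing described in that last comment (4 bytes from the 4-byte heap, 5 bytes from the 8-byte heap) can be sketched as a size-class lookup. The function name `size_class`, the power-of-two progression, and the 4-byte minimum are assumptions made for illustration; the comment does not say how the real application chose its block sizes.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: map a request size to the block size of the heap that
// would serve it, assuming power-of-two classes starting at 4 bytes.
std::size_t size_class(std::size_t bytes) {
    // This sketch assumes small requests; large ones would go elsewhere.
    assert(bytes <= (std::size_t{1} << 30));
    std::size_t block = 4;
    while (block < bytes) block *= 2;  // 5..8 -> 8, 9..16 -> 16, and so on
    return block;
}
```

Rounding up like this trades fragmentation between blocks for waste inside them: a 5-byte request occupies a full 8-byte block.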
2

The advantage of garbage collection is that it simulates a machine with an infinite amount of memory. The mechanism or implementation of that abstraction is intended to be completely transparent to you as the programmer. We all know that the mechanism is reclaiming memory that is no longer used by the program, but that's not actually guaranteed. If you run the program on a machine with more RAM than the program ever actually uses, then garbage collection may never happen. Again, irrelevant, because you can just write the program without regard to how it uses memory. The memory manager will just allocate more RAM whenever the program requests it, and you're allowed to assume that such allocations will always succeed. Java is a garbage-collected language, and C++ isn't.¹

The disadvantage of garbage collection is that, like all abstractions, it tends to be leaky. It doesn't always work perfectly all of the time, particularly in edge cases, and you're likely to run into bugs. The people who wrote the garbage collection algorithm (the one that's supposed to be transparent to you as a programmer) optimized for the most common cases, and the trouble with common cases is that they're never all that common. In general, you can't do any better than the garbage collector can at managing memory. But in specific circumstances (and given a sufficient amount of time, energy, and understanding), it might be possible. C++ gives you this flexibility; Java doesn't.

All of that said, I think the standard advice for choosing a language applies here, perhaps even more so in this case given the constraints. Pick the language that is the most familiar to the primary developers for the project. In addition to the obvious reasons (like you'll be able to develop the app faster and more efficiently), this is particularly important in the case that you describe because programming C++ like you're programming Java is going to result in terribly ineffective memory management practices, and therefore leaks and crashes. Analogously, programming in Java like you're programming in C++ isn't going to do you much good, and may end up producing a less-than-optimized program, given that the garbage collection algorithms are tweaked and tuned for the most common cases.

Programmers that are used to working in garbage-collected languages learn to trust the garbage collector, rather than fighting against it. If you're working in a garbage-collected language, these are the programmers that you want on your project. Programmers who are not used to working in a garbage-collected language are inherently skeptical of such an "infinite memory" abstraction, and frequently with lots of good reasons. Good as these programmers may be, these are not the ones you want working in a garbage-collected language because they'll be fighting against the GC every step of the way, constantly second-guessing it and often producing slower, less memory-efficient code than the other type of programmer. At best, they'll just spend a lot of time reinventing the wheel, costing you a lot of money and even more in long-term maintenance costs.

And then you also need to ask yourself whether it really matters. There's more than a hint of truth to Bo's snide comment: memory is so cheap now, it's hardly worth too much hand-wringing. Even if you need massive amounts, those amounts aren't nearly as massive now as they were 10 years ago. Programmers and application development are far more expensive than just buying gobs of RAM and processing power. That doesn't mean that you should eschew economy where possible, but it does mean that you shouldn't waste too much time doing it, either.


¹ Of course, this assumption highlights a deeper flaw in the question. As it turns out, "Java or C++" is a bit of a red herring. The standard Java implementation provides garbage collection and C++ doesn't per the language standard, but there's absolutely no reason that you couldn't use a third-party garbage collector for C++. Lots of companies have made a living selling these things, and some have probably made a living giving them away for free.

Cody Gray - on strike