4

To get that out of the way - i am not asking about which platform is faster or try to start a flame war.

Question: Why is the .NET version slower 5x in the first test ? The results i got were:

.NET Test1: 48992696 Test2: 1210070

Java Test1: 9651995 Test2: 1029374

Is this strictly behind list implementations ? If yes, is there something to make list insertion faster on .NET side ?

EDIT:

Used preallocated lists and removed int appends, got this down to

.NET Test1: 11 838 363

Java Test1: 712 481

.NET

namespace TestApp
{
    class Program
    {
        static void Main(string[] args) {
            long samples = 100;
            long test1Average = 0;
            for (int i = 0; i < samples; i++) {
                test1Average += runTest1(); 
            }
            test1Average = test1Average / samples;
            System.Diagnostics.Debug.WriteLine("TEST 1 AVERAGE: " + test1Average);

            List<String> list = new List<String>();
            for (int i = 0; i < 100000; i++)
            {
                list.Add("String: " + i);
            }

            long test2Average = 0;
            for (int i = 0; i < samples; i++)
            {
                test2Average += runTest2(list);
            }
            test2Average = test2Average / samples;
            System.Diagnostics.Debug.WriteLine("TEST 2 AVERAGE: " + test2Average);
        }

        private static long runTest1()
        {
            List<String> list = new List<String>();
            long time = System.DateTime.Now.Ticks;
            for (int i = 0; i < 100000; i++)
            {
                list.Add("String: " + i);
            }
            return (System.DateTime.Now.Ticks - time) * 100;
        }

        private static long runTest2(List<String> list)
        {
            long time = System.DateTime.Now.Ticks;
            foreach (String s in list) {
                s.ToString();
            }
            return (System.DateTime.Now.Ticks - time) * 100;
        }

    }
}

Java

public class Main {

public static void main(String[] args) {
    long samples =  100;
    long test1Average = 0;
    for (int i = 0; i < samples; i++) {
        test1Average += runTest1();
    }
    test1Average = test1Average / samples;
    System.out.println("TEST 1 AVERAGE: "+test1Average);

    List<String> list = new ArrayList<String>();
    for (int i = 0; i < 100000; i++) {
        list.add("String: " + i);
    }
    long test2Average = 0;
    for (int i = 0; i < samples; i++) {
        test2Average += runTest2(list);
    }
    test2Average = test2Average / samples;
    System.out.println("TEST 2 AVERAGE: "+test2Average);
}

private static long runTest1() {
    long time = System.nanoTime();
    List<String> list = new ArrayList<String>();
    for (int i = 0; i < 100000; i++) {
        list.add("String: " + i);
    }
    return System.nanoTime()-time;
}

private static long runTest2(List<String> list) {
    long time = System.nanoTime();
    for (String s : list) {
        s.toString();
    }
    return System.nanoTime()-time;
}

}

Mike Nakis
  • 32,003
  • 7
  • 76
  • 111
Dante
  • 1,509
  • 2
  • 14
  • 19
  • 3
    When benchmarking .NET code, don't use `DateTime` but the [`Stopwatch`](http://msdn.microsoft.com/en-us/library/system.diagnostics.stopwatch.aspx) class. – Oded Dec 31 '11 at 17:38
  • I did this back in the .NET 3.5 and Java 6 days. I found that Java was typically faster, especially after the VM had time to warm up. Highly recursive code showed a big performance gap. Of course in the end the performance is close enough we shouldn't worry too much. Nothing bad about the .NET platform. It just needs catch up time. Java has been in work for a long time before this youngster showed up. – Rig Dec 31 '11 at 19:23
  • @Rig .NET has had better performance for several years until Java caught up. Microsoft has been neglecting the CLR performance team as of late. – Rei Miyasaka Dec 31 '11 at 21:06
  • I find that hard to believe for many reasons including anecdotal evidence. You are entitled to your opinion but I saw java being about 10% faster in most cases at that point, especially after the first iteration or two of a chunk of code. The hotspot optimizations are great. .NET speed has improved a lot over the years and they might even be equals now but from that time I can produce several scenarios where .NET was playing catch up at that time. Without a lot of optimizations that C++ allows Java was only 10% slower than it in most cases too. Really, they are all fast enough. – Rig Dec 31 '11 at 21:29
  • @Rig [1](http://drdobbs.com/cpp/184401976?pgno=1) [2](http://www.tommti-systems.de/go.html?http://www.tommti-systems.de/main-Dateien/reviews/languages/benchmarks.html) [3](http://www.cherrystonesoftware.com/doc/AlgorithmicPerformance.pdf) -- C# can be seen generally doing better in older benchmarks, not so much so recently. There's been little work on C#/CLR performance since .NET 2.0 in 2005. – Rei Miyasaka Dec 31 '11 at 22:28
  • @John Nevermore: You should also be discarding the first iteration of the loop from the results, to eliminate "warming up" costs. – quentin-starin Dec 31 '11 at 22:39
  • @Rig: "Highly recursive code showed a big performance gap." - That sounds like a stretch, considering the JVM lacks tail call optimization. Maybe one particular instance of your recursive code, but that's simply not going to prove true overall. – quentin-starin Dec 31 '11 at 22:39
  • 1
    @qes Actually only the x64 CLR does tail recursion optimization -- not x86 -- which means that in practice we can't rely on tail recursion whatsoever for consumer applications unless the language itself does the optimization. Inlining in .NET has also [been really shoddy](https://connect.microsoft.com/VisualStudio/feedback/details/93858/struct-methods-should-be-inlined), and those of us doing things like game development are extremely frustrated by it, as it forces us either to make these functions members of the objects that they work on, or to pass by reference. – Rei Miyasaka Dec 31 '11 at 22:51

3 Answers3

2

In short I can't give you the answer as I certainly don't know enough about .NET, but as you can see it's usually not just a simple matter of comparing high-level Java vs C# code...

Disclaimer: There's a whole bunch of stuff I know I'm missing in my ramling below. I'm not a performance / benchmarking expert, but a few of my colleagues play one in real life :-).

It's very hard to compare Apples with Apples between the two platforms. It you really wanted to do it the hard core scientific way, you'd have to do things like:

  • Look at the underlying implementations of the data structures involved
  • See what bytecode both sets of code generated
  • Look at how the respective VMs then in-lined and ran that bytecode

I'll assume you ran both tests on the same hardware and took into account things like:

  • Allowing the VMs to 'warm up'.
  • Having similar VM settings such as GC, memory allocation and other tuning

Oh and what versions of Java and .NET and what parameters are you running the VMs under?

Martijn Verburg
  • 22,006
  • 1
  • 49
  • 81
  • 1
    Compiled against Java 7 and .NET 4.0 client profile. Java parameters are whatever netbeans sets by default. – Dante Dec 31 '11 at 18:09
2

That's an excellent question. I ran the C# test on my computer and got very similar results:

TEST 1 AVERAGE: 47,812,500

TEST 2 AVERAGE: 1,562,500

This is what I get if I replace list.Add("String: " + i); in runTest1()...

...with String s = "String: " + i;:

TEST 1 AVERAGE: 19,843,750

TEST 2 AVERAGE: 1,406,250

...with list.Add( "moo" );:

TEST 1 AVERAGE: 2,343,750

TEST 2 AVERAGE: 1,406,250

EDIT:

...with list.Add( i.ToString() );

TEST 1 AVERAGE: 20,312,500

TEST 2 AVERAGE: 1,093,750 (<-- BTW, this shows that these benchmarks are not very accurate)

...with String s = i.ToString();

TEST 1 AVERAGE: 14,531,250

TEST 2 AVERAGE: 1,406,250

...with String s = i.ToString( System.Globalization.CultureInfo.InvariantCulture );

TEST 1 AVERAGE: 13,906,250

TEST 2 AVERAGE: 1,406,250

Mike Nakis
  • 32,003
  • 7
  • 76
  • 111
  • So, it appears that the slowness is for the most part due to `"String: " + i;`, that is, due to boxing an int, converting it to string, and concatenating it with an interned string. – Mike Nakis Dec 31 '11 at 19:41
  • Which brings to mind one interesting question: what about internationalization? Could it be that in DotNet the conversion of an int to string has to consult locale conventions, while in Java things happen a lot more straightforward? – Mike Nakis Dec 31 '11 at 19:42
  • 3
    I too tried this without appending an integer to the string and the results went from 5x to 2x slower (19 .. something compared to Java's) on Test1. Makes me wonder if the other bottleneck is Java's primitives against .NET structs which inherit from Object. – Dante Dec 31 '11 at 19:44
  • **update**: the new results show that the slowness is for the most part due to `i.ToString()`. I wonder what on earth DotNet does that makes this so slow. – Mike Nakis Dec 31 '11 at 20:27
  • @MikeNakis: the int never gets boxed in that code. And the slowness of `i.ToString()` could involve it being interned. – quentin-starin Dec 31 '11 at 22:38
  • @qes Are you sure it does not get boxed? How can ToString() be called then? Plus, it makes absolutely no sense on behalf of int.ToString() to intern its results, because that would mean that counting integers and printing them on the console would constitute a memory leak. (Unless the intern pool is a cache, which is something that I do not know, but I thought it is not.) – Mike Nakis Dec 31 '11 at 22:50
  • @MikeNakis: Interning was just a guess, I don't know if it is or not, just that maybe it could be. As for boxing: http://stackoverflow.com/questions/436363/does-calling-a-method-on-a-value-type-result-in-boxing-in-net – quentin-starin Jan 01 '12 at 00:33
1

Is this strictly behind list implementations?

There are several things which are affecting the performance of those tests.

  • One is the fact that you are using DateTime. It's not intended to measure performance, and I suppose it's really slow when it comes to profiling. Since you are using it inside the loop, it makes the whole loop slower.

  • The second one is that you are adding elements through a loop to an empty list. When you are adding an element and there is no enough place, the list must expand in size. It costs resources.

If yes, is there something to make list insertion faster on .NET side?

Here is the same code with some refactoring. The most important change is that instead of creating an empty list, then adding the elements one by one, the list is created directly with all the elements inside.

Sadly, I see only 10 to 15% improvement in performance, which is still way beyond the same code running in Java. I continue to search why there is such a huge difference, but don't have a solution yet.

namespace Delme__benchmark_
{
    using System;
    using System.Collections.Generic;
    using System.Diagnostics;
    using System.Linq;

    public class Program
    {
        public static void Main()
        {
            Trace.Listeners.Add(new TextWriterTraceListener(Console.Out));

            int countSamples = 100;

            var samples = Enumerable.Range(0, countSamples).ToList();
            var list = Enumerable
                .Range(0, 100000)
                .Select(c => string.Concat("String: ", c)).ToList();

            Trace.WriteLine("Running the benchmark...");
            Trace.WriteLine(string.Format(
                "Test 1 average: {0}.",
                samples.Average(c => RunTest1())));
            Trace.WriteLine(string.Format(
                "Test 2 average: {0}.",
                samples.Average(c => RunTest2(list))));
        }

        private static long RunTest1()
        {
            Stopwatch time = Stopwatch.StartNew();

            var list = Enumerable
                .Range(0, 100000)
                .Select(c => string.Concat("String: ", c)).ToList();

            time.Stop();
            return time.ElapsedTicks;
        }

        private static long RunTest2(IEnumerable<string> list)
        {
            Stopwatch time = Stopwatch.StartNew();

            foreach (var s in list)
            {
                s.ToString();
            }

            time.Stop();
            return time.ElapsedTicks;
        }
    }
}
Arseni Mourzenko
  • 134,780
  • 31
  • 343
  • 513