We haven't spent much time with those benchmarks. We looked at a couple of them and believe that there are better ways to write them in C# and get better results. That's not FUD but our findings.
This is a great community activity. Clearly, the community is more than capable of performance enhancements, based on the improvements they have made in the product.
If people start improving the C# benchmarks, please file an issue on dotnet/core to get feedback and some cred. We may do another blog post on that if there is some gravity around the activity.
> Please don't implement your own custom "arena" or "memory pool" or "free list" - they will not be accepted.
> ...
> We ask that contributed programs not only give the correct result, but also use the same algorithm to calculate that result.
So there might not be too much room to improve. There could be some room to improve for things like "custom ... memory pool" since .NET Core has ArrayPool [2] built-in. But I can't tell if the spirit of that rule is "don't implement pooling" vs. "you can only allocate memory in the standard ways provided by the runtime."
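For anyone who hasn't used it, here is a minimal sketch of what renting from the built-in pool looks like. `ArrayPool<T>.Shared`, `Rent`, and `Return` are the real `System.Buffers` API; the `PoolSketch` class and `SumOfSquares` method are hypothetical names made up for illustration, and nothing here settles whether the benchmarks game rules would accept this pattern.

```csharp
using System;
using System.Buffers;

public static class PoolSketch
{
    // Hypothetical helper: sums the squares of 0..count-1 using a scratch
    // buffer rented from the runtime's shared pool instead of `new int[count]`.
    public static long SumOfSquares(int count)
    {
        // Rent may hand back an array longer than requested,
        // so only the range [0, count) is used.
        int[] buffer = ArrayPool<int>.Shared.Rent(count);
        try
        {
            long sum = 0;
            for (int i = 0; i < count; i++)
            {
                buffer[i] = i * i;
                sum += buffer[i];
            }
            return sum;
        }
        finally
        {
            // Return the array so a later Rent can reuse it rather than allocate.
            ArrayPool<int>.Shared.Return(buffer);
        }
    }
}
```

Calling `PoolSketch.SumOfSquares(4)` returns 14 (0 + 1 + 4 + 9). The pool is provided by the runtime rather than hand-written, which is why the "custom" wording in the rule matters here.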
It may come down mostly to Java being able to use 32-bit references (compressed oops) on a 64-bit JVM in certain cases. Many of those programs are very heavy on references. C# can gain a bit in some cases due to value types, but it doesn't always help enough.
I played around with the binary-trees code. The memory usage is definitely because of Task (disabling the >= 17 heuristic doubles the memory usage). This should come way down with .NET Core 2.0 and ValueTask.
http://benchmarksgame.alioth.debian.org/u64q/csharp.html