Go 1.1beta1 now on the benchmarks game (debian.org)
30 points by jondot on April 17, 2013 | hide | past | favorite | 59 comments


I'd love to see how Go 1.1 compares against modern JITted scripting languages, because I think those will be a major part of Go's current and future competition.

Hopefully the benchmarks game will one day include those JIT variants. V8 is there, but sadly, for example, PyPy, HHVM (Facebook's PHP JIT), and LuaJIT are not.


I think we're in dire need of a new benchmarks game. The author:

removed PyPy

removed the clear overview table that showed all languages, keeping only the barely visible gray bar charts with vertical text (why not horizontal bar charts? they can be scrolled and read without head-tilting)

refuses to add LuaJIT and other languages

responds to most change inquiries with "publish your own version", which probably means that he doesn't really care about the project anymore

As a result, the spirit of the benchmarks game was lost a long time ago, as it's made less fun with each iteration.


Not only that, but "the author" has for many years provided the source code for the scripts to make the measurements -- so anyone who actually wants to can easily start a new benchmarks game.

Go do!

http://benchmarksgame.alioth.debian.org/play.php#languagex


It's far from "easily". It's not like you could just fork the code and run the benchmarks game - you would not only have to set up the environment for all languages, but also understand the not-so-well-documented idiosyncrasies and deficiencies of the implementation (hardcoded paths? seriously?)

It doesn't seem like the code was originally written to be understood and modified by others, and it's not fully documented. As such, your claim that it's simple to fork isn't really valid. (However, I'm willing to admit the possibility that the configuration complexity is not easily avoidable.)

I agree that it's far from easy to write a benchmarks game, set up a test environment and invest oh-so-many CPU cycles into running the tests periodically. But I wouldn't go around telling people that forking the original is easy either - that would be very misleading.

Much will have to be done to improve the current engine to make it easier to fork. I'm seriously considering doing that (or alternatively making a new benchmarks game from scratch, with the intent of making it easy to configure and fork).


> It's not like you could just fork the code and run the benchmarks game...

Some people download the scripts and make measurements straight-away, as always YMMV.

    1) REQUIREMENTS
     Willingness to read the README
     Willingness to write ini file name=value properties
     Willingness to (sometimes) write make commands
     Install Python 2.5+ (these are Python scripts)


> Some people download the scripts and make measurements straight-away, as always YMMV.

Without setting up the language environments? I doubt it. Some highlights:

    3) LINUX QUICK START [ WIN32 QUICK START ]

    3.1) unzip in ~ directory [ unzip in c:\ ]
Requiring a path? Very poor.

    *** TO BE CONTINUED ***
Oh, so it's not fully documented, but it's easy to use straight away...

I'll take the bait. Let's try to add LuaJIT and compare it to Lua.

> 12.1) In the [tools] section of the ini file, define a name that will substitute for the location of the new language implementation.

Where is that INI file? The README doesn't say. Oh, it's my.linux.ini (it's not mentioned anywhere in the document).

> For example

      [tools]
      PYTHON3 = /usr/local/src/Python-3.2/bin/python3.2
Fair enough, let's add my current LuaJIT

    LUAJIT = /usr/local/bin/luajit

> 12.2) Choose a file extension to identify programs and measurements made with the new language implementation, for example - python3.

Okay, picking .luajit

> 12.3) In the [commandlines] section of the ini file, define a command line that will be used to run program source code files that have the new file extension you chose. For example, for file extension python3

      [commandlines]

      python3 = $PYTHON3 %X %A

No comment on what %X or %A may mean - they could mean anything, really. Of course %X is the source (or built) file and %A are the arguments. Easy enough.

    luajit = $LUAJIT %X %A
> 12.4.1) EITHER alias existing source code files that have a different file extension with the new file extension, in the [alias] section of the ini file. For example, re-use all source code files with file extension python but make measurements identified with file extension python3

    [alias]

    python = python3
So let me get this straight: to define the "python3" alias, I need to add the "python" key to the alias section and set it to "python3". Definitely not intuitive; I would reverse the mapping.

But okay, I'll define the luajit alias

    [alias]

    lua = luajit
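The mapping does read sensibly once you know the convention: the key is the extension of the existing sources, and the value is the new measurement extension. A minimal sketch of consuming such an [alias] section (hypothetical code, not bencher's):

```python
import configparser

cfg = configparser.ConfigParser()
cfg.read_string("""
[alias]
; key = extension of the existing sources, value = new measurement extension
lua = luajit
""")

# Read the mapping the way bencher appears to intend it: reuse the *.lua
# source files, but record the measurements under the .luajit extension.
for existing_ext, new_ext in cfg["alias"].items():
    print(f"reuse *.{existing_ext} sources, measure as .{new_ext}")
```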

> 12.5) Run the bencher.py script.

Running.

    bencher release 0.9
    gtop import failed - measure cpu & elapsed time
    Wed 21:22:51 .....OK .....nbody.python [2]
    Wed 21:22:58 .OK .....regexdna.python [1]
What, only python? Oh, I need to copy the programs from the "bench" dir. Where in the readme does it say that? Nowhere.

Okay, did that, re-ran the script, and it runs all nbody and regexdna benchmarks, and none of the others. Why?

Going back to my.linux.ini. Could it be "onlydirs", perhaps? So I should add all the items in onlydirs; let's try that.

Okay, now it's working - at least for the extensions that have a command defined. It's time to wait...

But wait, it complained about gtop? Why? Oh, it uses an old way of importing GTop; the new way is to use the introspected bindings

    from gi.repository import GTop
Let's patch uselinux.py

    Wed 21:55:42 .EMPTY pidigits.luajit
Of course, I had to remove everything in tmp/* before re-running the test, otherwise it thinks there is "nothing to be done".

Okay, that didn't quite work. Let's forget about gtop for now and look at the results we got without it

> 6) LOOKING AT THE MEASUREMENTS
> ~/bencher/summary/all_measurements.csv
> Open all_measurements.csv with a text editor.

No summary dir here. Well shoot, what does the README say about a missing summary dir? Nothing.

Oh, it's in ~/bencher/tmp/all_measurements.csv. So what do the times say?

    binarytrees,luajit,3,0,633,0.004,0,0,%,0.010
    binarytrees,lua,2,0,525,0.000,0,-2,%,0.005
    binarytrees,lua,3,0,633,0.000,0,-2,%,0.007
    binarytrees,luajit,2,0,525,0.000,0,0,%,0.003

Lets compare that with the summary data:

    binary-trees,Lua,2,12,446,0.796,4516,0,0% 0% 1% 100%,0.801

So uhh, I have to configure the parameters for every benchmark. Why don't you share your configuration? Oh, I see, there are various paths defined there appropriate for your environment.

Wait what? Why don't you separate the test configuration from the environment configuration?

First conclusion:

The bencher needs more work to make it easier to fork:

* improve readme by actually mentioning which ini file users should look at, telling them about the tmp dir, explaining the value syntax for the commandlines section etc. (important)

* invert entries in the [alias] section, because the way it's set up now makes no sense. (not that important)

* separate test and environment configuration in different files and share the test configuration files used on the website to help people reproduce the same results. (very important)

* fix the script to use the introspection bindings for gtop

Second conclusion:

The claim that it's "easy" to set up has been nullified.


>>Without setting up the language environments? I doubt it.<<

You seem to have run nbody.python without setting up the language environment for it.

I'm going to ignore all the rest of your editorializing and try to find something of substance.

Let's just note that you've jumped from #3.1 to #12.1 -- ignoring #3.2 which checks for problems and the subsequent sections that work through those problems and explain some of what you later find so puzzling.

>>Where is that INI file? The README doesn't say. Oh its my.linux.ini (its not mentioned anywhere in the document).<<

You seem to have found the ini file.

>>No comment on what %X or %A may mean<<

Actual comments on what %X or %A may mean:

    ; %X %T %B %I %A in commandlines are replaced like this:
    ;
    ; nbody.python-4.python %X = nbody.python-4.python
    ; nbody.python-4.python %T = nbody
    ; nbody.python-4.python %B = nbody.python-4
    ; nbody.python-4.python %I = 4
    ;
    ; %A = [testrange] value or 0 when the program takes input from stdin
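For what it's worth, the placeholder rules quoted above are simple enough to re-create. A hypothetical sketch (not bencher's actual code) of how the substitution could work:

```python
import os

def expand(template, src, args):
    """Expand bencher-style %-placeholders, following the comment
    block quoted above (a hypothetical re-creation)."""
    base, _ext = os.path.splitext(src)        # nbody.python-4.python -> nbody.python-4
    test = src.split(".", 1)[0]               # -> nbody
    parts = base.rsplit("-", 1)
    rev = parts[1] if len(parts) > 1 else ""  # -> 4 (revision suffix, if any)
    return (template.replace("%X", src)
                    .replace("%T", test)
                    .replace("%B", base)
                    .replace("%I", rev)
                    .replace("%A", str(args)))

print(expand("$PYTHON3 %X %A", "nbody.python-4.python", 50000))
# -> $PYTHON3 nbody.python-4.python 50000
```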
>>What, only python? Oh, I need to copy the programs from the "bench" dir. Where in the readme does it say that? Nowhere.<<

You don't seem to have read sections #3 through #11.

>>Okay, did that, re run the script, and it runs all nbody and regexdna benchmarks, and none of the others. Why?<<

You don't seem to have read sections #3 through #11.

>>Oh, it uses an old way of importing GTop<<

That actually is worth updating the readme about!

>>of course, I had to remove everything in tmp/* before re-running the test, otherwise it thinks there is "nothing to be done".<<

No you didn't. You don't seem to have read sections #3 through #11.

>>No summary dir here.<<

That actually was a bug! Empty directories weren't included in the snapshot zip.

>>improve readme by...<<

You didn't seem to read sections #3 through #11 (very important)


No, you didn't seem to read my suggestions at all.

>>What, only python? Oh, I need to copy the programs from the "bench" dir. Where in the readme does it say that? Nowhere.<<

> You don't seem to have read sections #3 through #11.

If you grep the file, there is no mention of the bench dir anywhere in sections 3-11. Please do tell where you found it. The readme does say how to add new programs, but it doesn't say where the original programs of the shootout are in the distribution.

>> of course, I had to remove everything in tmp/* before re-running the test, otherwise it thinks there is "nothing to be done".<<

> No you didn't. You don't seem to have read sections #3 through #11.

No, you didn't read what I was doing. I had to remove everything in tmp/* because I was trying to enable gtop by editing the source code of the program. The old results were made without gtop and only contained cpu and elapsed time.

Finally, the "actual comments" on what variables are available for the commandline strings aren't written anywhere. Only "%A" is explained in 9.3, and "%I" isn't even mentioned.

So it looks like all my suggestions remain valid:

From the first list of suggestions, dropping the "tmp" dir item on the assumption that the summary-dir bug will be fixed:

1) improve readme by mentioning the location of my.linux.ini and my.win32.ini and adding the complete syntax for the commandlines section etc. (important)

2) invert entries in the [alias] section, because the way it's set up now makes no sense. (not important)

3) separate test and environment configuration in different files and share the test configuration files used on the website to help people reproduce the same results. (very important)

4) fix the script to use the introspection bindings for gtop

Note that you failed to comment on (3), which will by far improve the bencher the most. In fact, that is exactly where I stopped trying to reproduce your results - at the point where I had to extract all the values for the testrange from your CSV files, as the ones used on the website aren't distributed anywhere.

But of course, you're free to continue ignoring useful comments from other people. Which brings me to the original point, we're in dire need of a new benchmarks game.


>>The readme does say how to add new programs, but it doesn't say where the original programs of the shootout are in the distribution.<<

You seem to have found the benchmarks game "original programs" in the project tarball.

People measure their own programs with bencher, not the benchmarks game programs -- it's not dependent on the benchmarks game programs.

>>I had to remove everything in tmp/* because...<<

    #5.3 #5.4
>>what variables are available for the commandline strings aren't written anywhere<<

    my.linux.ini [commandlines]
>>testrange from your CSV files, as the ones used on the website aren't distributed anywhere<<

    nanobench/makefiles/u64q.ini
    nanobench/makefiles/u64.ini
    nanobench/makefiles/u32q.ini
    nanobench/makefiles/u32.ini
>>you're free to continue ignoring useful comments<<

Your comments are useful because they show what someone might become confused about.


> You seem to have found the benchmarks game "original programs" in the project tarball.

> People measure their own programs with bencher, not the benchmarks game programs -- it's not dependent on the benchmarks game programs.

It is not, but if you sincerely want to make it easy for people to reproduce all or parts of the benchmark, you should at least document where they can find all the settings of the original.

>>I had to remove everything in tmp/* because...<<

    #5.3 #5.4
Fair enough, I missed those two points in the README. A weaker point still remains, though: it's not very user-friendly or obvious, compared to, say, a --force flag (or something in the spirit of make's --always-make).

These things may not seem important, but I think they are when distributing software with the intent of someone else running it.

>>testrange from your CSV files, as the ones used on the website aren't distributed anywhere<<

    nanobench/makefiles/u64q.ini
    nanobench/makefiles/u64.ini
    nanobench/makefiles/u32q.ini
    nanobench/makefiles/u32.ini

Ah in nanobench. Well that was definitely not easy or obvious, was it? My fault for not using find on the directory.

And how about that suggestion of splitting the configuration file into two separate files: program configuration and environment configuration? It would make it easier to run the benchmarks on the same set of programs with the same parameters, but in a differently configured environment. (The set of language implementations and their build and run commands would be part of the environment configuration.)
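To illustrate the split being proposed (file names, paths and testrange values are invented for the example, not taken from bencher):

```ini
; environment.ini -- machine-specific, kept local
[tools]
LUAJIT = /usr/local/bin/luajit

[commandlines]
luajit = $LUAJIT %X %A

; programs.ini -- published alongside the website's measurements
[testrange]
binarytrees = 12 16 20

[alias]
lua = luajit
```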

> Your comments are useful because they show what someone might become confused about.

That was exactly my point - to prove that your "easily" claim is a stretch. The README does not cover everything, especially not the parts needed for someone to bench the programs found on the website. Which is understandable, since it's a README for the bencher program.

Perhaps a separate README for the entire archive (documenting where the original configurations and programs are) would fix most of these confusion issues. More descriptive directory names would also help (e.g. "game-programs" and "game-configurations").

And if I seem so negative, it's because I used to love the benchmarks game and felt that every change you've made lately significantly subtracted from the game's fun factor while not significantly adding anything.

For example, that recent removal of the (not very useful, but still extremely fun!) combined language comparison table. Why remove things? It's sufficient to simply warn the user. If they ignore the warning and take those numbers seriously, that is their own fault. Why should that ruin the fun for everyone else?

Oh well...


>>to prove that your "easily" claim is a stretch<<

Yes, you were trying to prove that -- you were finding fault instead of finding how to make things work.

Other people found how to make things work, because that's what they were trying to do.

>>a separate README for the entire archive<<

Already done.

>>Its sufficient to simply warn the user.<<

Was it sufficient to list

    Willingness to read the README
as a requirement? Apparently not.


>>to prove that your "easily" claim is a stretch<<

> Yes, you were trying to prove that -- you were finding fault instead of finding how to make things work.

> Other people found how to make things work, because that's what they were trying to do.

Not really. The first time I downloaded the zip (about a year ago) I sincerely tried to make it work and did read the readme. I spent about an hour, then gave up. This time I simply retraced those exact same steps with the intent of showing the current deficiencies.

>>Its sufficient to simply warn the user.<<

> Was it sufficient to list

> Willingness to read the README

> as a requirement? Apparently not.

Yes, it is. It's my fault for not reading the README completely. (But to be fair, not all directories of the distribution were covered.)

You don't have to remove the complete zip archive and then add a quiz to the website testing whether the user read the README before allowing them to download it.

If that doesn't work either, what then? Will you completely remove the download link to the archive, ruining it for those willing to read the README? The way I see it, you've done your part; the rest is up to the user.

This is where I strongly disagree with your approach - and this is why I want to make a new benchmarks game.


> I downloaded the zip (about a year ago)

Do you mean the project tarball or do you mean the bencher zip?

> I sincerely tried to make it work

Did you ask for help?

> This time I simply retraced those exact same steps

This time you've told everyone such-and-such isn't in the README when it is; such-and-such aren't written anywhere when they are; such-and-such aren't distributed anywhere when they are.

None of that stuff changed in the last year.

>>and this is why I want to make a new benchmarks game<<

So get the tarball tomorrow, read the README, find out what doesn't work for you and fix-it (if you're missing python-gtop install it), make measurements and publish them. Easy.


> This time you've told everyone such-and-such isn't in the README when it is

I clearly demonstrated all the deficiencies of the current distribution, especially in regards to running the same programs (with the same arguments) as those on the website.

your explanation of the commandline variables was missing from the readme

a reference to the actual location of the ini file (my.linux.ini) was also missing. I expect a section like this:

    The bencher is configured with an INI file. 
    There are two example ini files included with the
    distribution: my.linux.ini and my.win32.ini, 
    located in the "makefiles" directory.
this can't be found anywhere in the distribution:

    The ini files from the game website can be found
    in the nanobench/makefiles directory:
    
    <listing of the files>
Then I will agree that the documentation is complete, and that the process to run your own benchmarks is almost "easy".

If you separate the configuration files, organize the directories more appropriately and write a more condensed README that skips the condescending act towards the user, then I will agree that running your own benchmarks is easy.

> Did you ask for help?

I wouldn't have had to ask for help if the documentation was complete and adequate.

I will however check out the tarball tomorrow.

Oh and what about the suggestion to separate the environment configuration from the programs configuration? No comment, I guess...


>>your explanation of the commandline variables was missing from the readme<<

The explanation is not missing.

The explanation is where you need to see it -- where you write the commandline variables in the ini file.

>>I expect...<<

Speaking about your expectations would have been better than stating something is "not mentioned anywhere in the document".

>>I wouldn't have had to ask for help...<<

I guess you didn't ask for help.

>>what about the suggestion to...<<

What about making and publishing measurements?


More corrections--

>>3.1) unzip in ~ directory [ unzip in c:\ ] Requiring a path? Very poor.<<

Unzip somewhere else and see that it works just fine.

>>Where is that INI file? The README doesn't say. Oh its my.linux.ini (its not mentioned anywhere in the document).<<

#4.2

>>Oh, it uses an old way of importing GTop, the new way is to use the introspected bindings from gi.repository import GTop lets patch uselinux.py<<

Let's install the required python-gtop bindings

    http://packages.ubuntu.com/precise/python-gtop
    dpkg -i python-gtop_2.32.0+dfsg-1_amd64.deb


Why ask people to start their own benchmarks (presumably with Blackjack and hookers) instead of allowing people to contribute to the Benchmark Games?


Because they don't wish to contribute to the benchmarks game, they wish to see something different -- and that's fine with me.


"If you're interested in something not shown then please take the program source code and the measurement scripts and publish your own measurements."

http://benchmarksgame.alioth.debian.org/play.php#languagex


Yeah, but the request isn't to include some random language; it's to use the fast implementations of any given language. Why wouldn't you, on a site that benchmarks speed?

I think not doing so is actually unhelpful -- if you don't realize there are JITs for Python and Lua that aren't being included in the benchmarks, you will come away with a completely wrong impression of the possible performance of those languages.

edit: what spion said in their comment


Because there was a big spitting match a couple of years ago, and it seems the maintainer picked his favorite languages and features and chose to represent others (seemingly on purpose) in a bad light.


    For a couple of years I've wanted to "cull the herd" but
    my curiosity (and interest in promoting experimental
    language implementations) stopped me doing so.

    The most that Alex Gaynor's nonsense did was prompt me
    once more to consider whether the time was ripe.

    Apr 2011
http://www.reddit.com/r/programming/comments/glvgk/lua_jit_p...


>>Why would you not do that...<<

"Because I want to do fewer chores not more!"

>>a completely wrong impression<<

    "These are not the only compilers and interpreters.
     These are not the only programs that could be written.
     These are not the only tasks that could be solved.
     These are just 10 tiny examples."


Let's say I make a website called the Prettiest Person Game, and I throw up some professional crazy-awesome shots of myself and my buddies, and I also put up all igouy's old driver's license photos, and then create a whole bunch of different ways we can compare these photos and see who's prettier.

And let's say this site becomes really popular. And people start posting articles like, "wow, dilap is like 10x prettier than igouy in every single way!"

And then igouy's friends go, "whoaaaa, hold on a minute, those are terrible photos of igouy! He can really look much better than that!"

And then I'm like, "Hey, I ain't got time for this -- I told you already these are not the only photos in the world. Go take your own photos if you want a different comparison!"

That's what you're saying.


What I'm saying is --

    tl;dr

    Measurement is highly specific -- the time taken for
    this benchmark task, by this program, with this
    programming language implementation, with these
    options, on this computer, with these workloads.
http://benchmarksgame.alioth.debian.org/


In terms of your little make-believe story it's -- "send photos of igouy and you and your buddies, and we'll see which photos are prettier".

But don't send photos of your cousins, and your buddies cousins, and...


I fondly remember The Great Language Shootout by Doug Bagley; I could understand the majority of the problems, there were a lot of languages represented. Good times.



I believe Go is compiled [1][2]. Isn't a comparison with something like Haskell [3] more fair?

[1] http://golang.org/doc/faq#Implementation

[2] http://golang.org/cmd/go/#hdr-Compile_and_run_Go_program

[3] http://benchmarksgame.alioth.debian.org/u64/benchmark.php?te...


The Glorious Glasgow Haskell Compilation System is truly glorious


That was there. PyPy was there, at least. LuaJIT too, I think. All got yanked out. The benchmarks are bogus anyway, unless you plan on working on n-body or fasta search or a short list of other things they run.


It would be nice to compare it to the old version. At any rate, it seems like it's a lot more competitive with Java in performance and it uses a lot less memory:

http://benchmarksgame.alioth.debian.org/u64/benchmark.php?te...

It's certainly much faster than the fastest "scripting" language:

http://benchmarksgame.alioth.debian.org/u64/benchmark.php?te...


>>It would be nice to compare it to the old version.<<

12 days ago

https://news.ycombinator.com/item?id=5494113


I dunno about you, but I don't use Go where I could use JavaScript. And also, I use multicore machines only.

So I prefer this URL: http://benchmarksgame.alioth.debian.org/u64q/benchmark.php?t...

:P


Go vs. v8 is a reasonable comparison. Node.js is probably the closest substitute for the way Go is most often used today (as API servers or backend app components for web applications).

I have no comment as to whether this "benchmarks game" stuff tells us anything meaningful.


... That's using Go 1.0.3.


Why are some benchmarks using Go 1.0.3 and some 1.1beta1? And why is the PCRE program broken? http://benchmarksgame.alioth.debian.org/u64/program.php?test...

As far as I understand it just needs a "go get github.com/glenn-brown/golang-pkg-pcre/src/pkg/pcre"



Thanks.


You prefer to do multi-threaded programming in C? Seriously?


Why the spectacular loss to V8 in the Regex tests? Is Go's regex engine a naive implementation as of now?


Russ Cox, in response to a regex benchmark vs. Python/Ruby:

"You assume the benchmark is worth something.

First of all, Ruby and Python are using C implementations of the regexp search, so Go is being beat by C, not by Ruby.

Second, Go is using a different algorithm for regexp matching than the C implementations in those other languages. The algorithm Go uses guarantees to complete in time that is linear in the length of the input. The algorithm that Ruby/Python/etc are using can take time exponential in the length of the input, although on trivial cases it typically runs quite fast."[1]

[1] https://groups.google.com/d/msg/golang-nuts/6d1maef4jQ8/yg4N...
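Cox's point is easy to demonstrate: a backtracking engine like Python's re can take exponential time on a pathological pattern that a linear-time engine like Go's handles without trouble. A small sketch (timings are machine-dependent):

```python
import re
import time

# A pathological pattern for a backtracking engine: each 'a' can match
# either branch, so there are ~2^n ways to fail before 'b' is reached.
pattern = re.compile(r"(a|a)*b")

for n in (10, 14, 18):
    s = "a" * n                 # no trailing 'b', so no match exists
    t0 = time.perf_counter()
    pattern.match(s)            # returns None, but only after heavy backtracking
    print(n, round(time.perf_counter() - t0, 4))
```

With a Thompson-NFA engine (Go's regexp, RE2), the same non-matching inputs are rejected in time linear in their length, which is exactly the guarantee Cox describes.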


As a member of the masses of programmers who typically deal with truckloads of trivial problems, his reply is worth as much as the benchmark he criticizes. I've never used anything but trivial regexps (for some values of trivial) in production code (one-shot problems don't count), and I don't give a damn what is being benchmarked here - I only care whether my Python script runs fast.


Actually, I might agree with Go here. I agree that most of my regexes are the trivial kind that don't exhibit exponential growth, but that's only because I also control the input to all those regexes. But Go's intended niche appears to be for user-facing server-side programs, and in that environment if you ever run a naive exponential-time regex against user-supplied input you're setting yourself up for a potential DOS attack.


But Go's intended niche appears to be for user-facing server-side programs, and in that environment if you ever run a naive exponential-time regex against user-supplied input you're setting yourself up for a potential DOS attack.

Yes, but for larger recognizers, one would probably use something like Ragel anyway. A DSL that provides intersection, union, difference, concatenation, composition, etc. makes it far easier to construct elaborate automata from smaller expression than the 'compile a big regex string' approach.


Why not use the potentially exponential algorithm initially, and time out and fall back to the linear one if the sometimes-quick one has taken longer than the linear one would have? At most a 2x slowdown for complex cases, and a substantial speedup for the simple ones.


It depends, if you have the opportunity to build a DFA once, why not do it? It's only a one-time cost.

I think the more serious problem is that what most people consider to be a 'regular expression' does not express a regular language, and hence cannot be expressed as a finite state automaton.



I think his answer is useful. It does explain what's going on. If the performance characteristics of PCRE suit your problem better, it's trivial to write a cgo wrapper for it. There's even one on GitHub, though I don't know if it works.


It works (and will be updated across the website once Go1.1 is released).

http://benchmarksgame.alioth.debian.org/u64q/benchmark.php?t...


>>a different algorithm for regexp matching<<

Here's a C++ program that uses RE2

http://benchmarksgame.alioth.debian.org/u64/program.php?test...


V8 JIT-compiles regular expressions.

EDIT: Since apparently someone doesn't believe me: http://blog.chromium.org/2009/02/irregexp-google-chromes-new... ("After optimization we generate native machine code which uses backtracking to try different alternatives.") See also the yarr JIT, which jsc and SpiderMonkey use for regular expressions: http://trac.webkit.org/browser/trunk/Source/JavaScriptCore/y....


Looks like there's a reimplementation of RE2 in Go that might be worth benchmarking against. https://code.google.com/p/sre2/


So what approach is implemented in the Go distribution?


The same as RE2 (and, in fact, written by the same authors as RE2)


So why is the C++ regex-dna program that uses RE2 so much faster?

http://benchmarksgame.alioth.debian.org/u64/program.php?test...

And why did someone do "a reimplementation of RE2 in Go" ?


I'd be interested in seeing compile times. In a talk posted recently, Rob Pike seemed to be emphasizing #include inefficiencies and bytes-read-by-compiler to bytes-of-source-input ratios for C/C++, and how Go was better at this.


Isn't comparing against a scripting language apples and oranges? Compiled vs. interpreted, one would assume, will always look like this, although modern sophisticated JITs have definitely narrowed the gap.




