In case someone cares about these things, I compared the build times and the binary sizes for 1.9 vs 1.8.3 using the open source project we maintain [1]. This is on a 6-core i7-5820K:
Build time with 1.8.3:
real 0m7.533s
user 0m36.913s
sys 0m2.856s
Build time with 1.9:
real 0m6.830s
user 0m35.082s
sys 0m2.384s
Binary size:
1.8.3 : 19929736 bytes
1.9 : 20004424 bytes
So... looks like the multi-threaded compilation indeed delivers better build times, but the binary size has increased slightly.
Unless you perform a proper statistical analysis it's unfair to draw a conclusion from a single run.
Furthermore, when I see a second run that's faster than the first one, I immediately wonder if it's the cache being cold for the first run and warm for the second.
In fairness, the phrase he used was "looks like". I don't think his comment was intended to suggest that he'd done rigorous and exhaustive wide-spectrum analysis of compile times and executable size, just that expectations matched the result for his project.
Thanks :) I'm no stranger to the scrutiny of Hacker News, I did 3 builds in a row and threw out the 1st one (cache), the last two were within 0.1s of each other, so I copied & pasted the latter.
"Programmers Need To Learn Statistics Or I Will Kill Them All"... What an insufferable asshat.
PSA: There is no reason to behave like this, and it's an incredible way to alienate a bunch of people. You either offend people directly with the murder implication, or they don't take you seriously because you sound like you're throwing such an extended temper tantrum that you managed to write it all up in a blog post.
how, concretely, should I go about doing this particular analysis of compile time for one project? How many times should I run the build for each of the two compilers, and what should I do with the results so I can: 1. draw a conclusion, and 2. come up with fair numbers for how they compare?
I would hope someone could teach this hopefully simple and very concrete thing to the HN crowd, and I do hope the answer is not "go learn statistics".
You need to first create a clean slate each time for running the experiment: no build cache, no filesystem cache, etc. Maybe a tonne of single-use Docker images? Even then, filesystem caches will mess you up a little.
Beyond that, you need to run the same build "several" times to see what the variance is. Without getting specific, if the builds are within a couple percent of each other, do "a few" and take the mean. If they're all over the place do "lots" and only stop once the mean stabilises. There are specific methods to define "lots" and "a few" but it's usually obvious for large effects and you don't need to worry too much about it.
If you're trying to prove that you've made a 0.1 improvement on an underlying process that is normally distributed with a stddev of, like 2, then you're going to have to run it a lot and do some maths to show when to stop and accept the result.
I want measurements with filesystem cache because I'm interested in estimating the speed of the compile-test-edit cycle. If you want to estimate the impact on emerge then you'll want no filesystem cache.
It's all about measuring based on what you intend to use the measurements for.
If the measurements are all over the place, why not take the fastest? The average is no good, because it'll be influenced by the times it wasn't running as fast as possible.
I don't myself lose much sleep over worrying about the times it runs faster than possible.
I agree with this sentiment. Any time worse than the fastest is due to noise in the system (schedulers etc). So the fastest is the lowest noise run.
Of course, as I said in another comment, it depends what you want to do with the measurement. If you want to estimate how long a run will take on an existing system, then you need to accept the noise and use the mean (or median).
Personally I think it's a better idea to instrument your programs and count the number of memory (block) accesses or something. That metric might actually be useful to a reader a few years in the future. The fact that your program was running faster on a modern x86 processor from the year 2010 tells me nothing about how it would perform today, unless the difference was so large that you never needed statistical testing in the first place...
Yes, the other guys are just being pedantic because an attempt is made to load libc dynamically (but it is not required; DNS behaviour just may change without it).
Well, Go code is statically linked, but the runtime may try to dynamically load libc for DNS resolution. Use of cgo, of course, drastically changes everything.
For a few things like the system DNS resolver in the net package (which can be switched to the pure Go version with a compile-time or run-time switch) and getting the user's home directory in the os/user package.
At a computer with go1.4.2 freebsd/amd64 ATM (earlier was go1.8.1 linux/amd64 IIRC) and the above os/user example results in a dynamically linked ELF when built with CGO_ENABLED set to 0.
Yes indeed this is totally awesome. It’s a problem that occurs on any platform, and not only for testing. I often see this problem with logging as well, where some validation/helper function logs an error separate from the context it occurred in, potentially making it hard to trace without a stacktrace logged as well.
This is the biggest problem with Go errors and one of my biggest gripes with the language. Exceptions have stacktraces that give you context about where the error originated. Go errors don't have this, and it costs me a lot of time debugging things. https://godoc.org/github.com/pkg/errors helps, but it's still more of a pain than it should be.
Except you don't get to control the specific error type returned by packages you import. So sure, you could get stack traces for your code, but not for your dependencies.
It infuriates me when go proponents try and sweep bad language decisions under the rug with half-fixes.
To be fair, the Go authors have always strongly promoted that you fork and maintain your dependencies. That may be a bad operational decision (outside of Google), but is technically unrelated to the language itself.
Mind you, if third party code needs debugging, you're going to have to fork it in order to apply your fixes in a timely manner anyway. Perhaps their stance is not as crazy as it may originally seem.
And with my professional experience, sometimes maintaining a program over a decade, I have to agree. When you are using an external library, you have to at least have a copy of the version you are using. Repositories might go away without a warning. Also, you want to go to a new version only after some review. Ideally, you don't have any local modifications - if you patch libraries, you should donate the changes back. But then, it happened that patches were rejected. In the end you are going to need a local copy/branch for several reasons.
> Also, you want to go to a new version only after some review
Happened here recently - modules used in the core build aren't versioned which means when someone external updated a module with an incompatible change, the build broke.
Moral: Never build direct from CPAN / GOPAN / CCAN / WHATEVSPAN without pinned versions.
Can you point me to a quote where Go authors have encouraged people to fork a third party library in order to insert better error handling code?
I don't want to insult anyone, but this seems like an utterly insane software engineering practice to me. It means you can't just update to the next version of that library any longer. You're taking it upon yourself to review and patch every single new version of that library. Manually.
See: Any discussion related to go get and its design motivations.
Even vendoring, which was added to the tooling later, is built on the idea of keeping your own fork. It just simplifies the forking process for people who do not work like Google does, where each fork is shared company-wide.
Not only are people here suggesting that you fork your dependencies to add stack traces to errors, which is a problem no other modern language seems to have, but it's also going to be a security disaster when some common package is found to have a vulnerability and only ten percent of projects ever bother to update it.
I feel like I've entered some sort of bizarro world where everyone has forgotten that programming doesn't have to suck and pretends that none of this is a problem.
I love programming in Go but the thought of forking and maintaining every single library I might use in one of my projects makes me also feel we've entered a new, bizarre, and terrifying world. This is literally one step away from "write your own OS and compiler, it's the only way to be sure you get the exact behavior you want".
I feel like every time you point out a design flaw in go, the response that hand-waves away your concerns contains advice that's even more absurd than the problem you were originally pointing out.
Can't get stack traces from third-party errors? Maintain all of your dependencies! Tired of reimplementing common math functions? Just cast all your numerics to double! And so on...
I know the debate. It's a debate about versioning, not about maintaining your own fork in order to insert your own error handling philosophy into third party libraries.
They have always maintained that Google does modify the packages and maintains any necessary merging and maintenance of the package. It is not just about versioning. This is why they maintain their own repository of third-party libraries in the first place, and why the tooling is designed to support their forking methodology.
Again, this may be a poor operational choice for those outside of Google, but they have also been quite clear that Go is built for the problems that Google has. If your problems are different, it may not be the right tool for the job. A hammer doesn't have to be able to cut wood. There is no harm in choosing a saw instead.
You seem to be forgetting what your original claim was. It was not that Google maintains forks of some third party libraries for some reason or other. That wouldn't be surprising at all. You claimed that
>the Go authors have always strongly promoted that you fork and maintain your dependencies
And you said that in response to a need for stack traces and specific error types.
Burdening yourself with maintaining a fork of a third party library for that specific purpose is what I'm calling insane, and I don't think the Go authors have ever suggested such a thing.
Which is true. Their stance on maintaining your own forks is pervasive through all the tooling and their explanations of how the tools are designed to be used. There has been emphasis that if you want the tools to work well, you need to fork and maintain your dependencies.
You are, of course, free to do whatever you want. You can even freely let a random third party maintain your dependencies, and hope they don't break anything, if you so wish. Many do exactly that. Despite what some people like to claim, the Go authors are not benevolent dictators that completely rule how you must develop software. They've simply published tools that worked for them, and granted a license that allows you to use and modify them in any way you see fit.
> And you said that in response to a need for stack traces and specific error types.
Technically the discussion was about how stack traces aid in debugging. It is a fair point. They most certainly do. And if you are interested in the stack within a third-party library, that suggests that you intend to debug said library.
If you are debugging a third-party package, you have already decided to fork it by virtue of what debugging requires. Adding stack traces to ease debugging of the package, if you so desire, is not a great undertaking on what is already a fork you have decided to maintain. You, of course, can submit your changes to the upstream maintainer, but in the meantime you are still holding on to a fork that you have decided to maintain. There is no guarantee that anyone else will want your changes. There is really no escaping that fact. It is not insane. It is simply pragmatic.
If you are only ever going to maintain your own code, the stack trace up to your call out to the buggy third-party package will still be available, if you choose to provide it, allowing you to work around or otherwise deal with the buggy package in your code.
>If you are debugging a third-party package, you have already decided to fork it by virtue of what debugging requires.
I think we disagree on two separate issues that shouldn't be conflated.
The first issue is whether the only purpose of a stack trace coming from a third party library is to actually debug that library. I suggest that this is not the case.
I may simply want to understand what's going on in there. Maybe I'm passing incorrect data to that library, but the returned error is not specific enough to tell me what I did wrong. Maybe the error is somewhere further down in my own code (e.g. in a callback), in a different library altogether or related to the environment (missing file, broken database connection, ..). Or I may just want to post the stack trace in a support forum to get help.
I have often been looking at stack traces from third party libraries, but I have rarely wanted to debug those libraries.
Our second disagreement is perhaps a more gradual one. You seem to suggest that maintaining a fork is something completely normal or even desirable. I think it is something extremely undesirable - a last resort.
In any case, I do not believe that the Go creators have ever suggested that maintaining a fork is something you should do willy nilly or something that everybody should do all the time. This is different from the use of vendoring for versioning purposes. If the Go creators did in fact promote such a thing, I would strongly disagree with them.
> I have often been looking at stack traces from third party libraries, but I have rarely wanted to debug them.
If stack traces are already readily available in third-party packages (so readily that you claim to use them often), what's the issue? Maybe this thread would be better served by real world examples of where the lack of stack traces in Go has actually bitten you?
> Maybe the error is somewhere further down in my own code (e.g. in a callback)
To be fair, the Go authors have also driven home the fact that you need to make your errors meaningful with the full context necessary to determine the path of the error (granted, not necessarily a full-fledged call stack). They have been abundantly clear that you should never simply `return err`. If this is an issue, it is because the code was written poorly.
And if a package is written poorly, and you don't want to fix the issues with it, perhaps you should reconsider using it in the first place?
> This is different from the use of vendoring for versioning purposes.
How so? The simplest form of a fork is one that is an exact copy of the original. If a package is already suitable to your needs, there is no reason to modify it, other than to perhaps merge in updates that the upstream author has made.
You can either keep that fork in a global repository, or you can keep it in vendor. There is no fundamental difference between either of those. At all. Vendor is just a convenience feature to allow the fork to stay in your application's tree.
Of course, if the library is poorly written or doesn't perfectly suit your needs you may have to make modifications to it. But that is true of every library written in every language in existence. It is not insane to do that, just pragmatic.
>If you have found stack traces are already readily available in third-party packages – that you can claim you do it often – what's the issue?
Not in Go. I have found them extremely useful as a diagnostic tool in languages that have ubiquitous stack traces.
>And if a package is written poorly, and you don't want to fix the issues with it, perhaps you should reconsider using it in the first place?
As I said, the problem may not even originate in that particular third party library. The error may just be passing through, coming from my own code or from a different library altogether.
>How so? The simplest form of a fork is one that is an exact copy of the original.
Vendoring is not primarily about forking and then changing the code. The main purpose of vendoring is freezing dependencies.
> The error may just be passing through, coming from my own code or from a different library altogether.
Then you will have a stack trace to work with. Again, this discussion would be more meaningful if you provided some real world examples. Hypothetical situations that will never happen in the real world are kind of pointless. You don't have to invent reasons to justify not using Go, if that is your intent. Frankly, I couldn't care less what tools you use, and it seems like it is justifiably not the right tool for your problems to begin with.
> Vendoring is not primarily about forking and then changing the code. The main purpose of vendoring is freezing dependencies.
But you are quite literally maintaining a fork. In 99% of the cases there will be no need to modify the code, be it in vendor or in another repository. Freezing the dependency has always been the primary goal of both approaches. The only difference vendor introduced is that the code resides under the vendor tree of each individual application, instead of in a global namespace that allows you to share your pinned version with all of your applications. That's it.
Only if a stack trace was added in the original location and then passed on at each intermediate step. As stack traces are not automatically created in Go, nor customarily passed on, there is usually no stack trace available.
>Again, this discussion would be more meaningful if you provided some real world examples. Hypothetical situations that will never happen in the real world are kind of pointless. You don't have to invent reasons to justify not using Go, if that is your intent.
I have been a real world developer for 25 years, and I have explained to you the sort of real life situations in which I have used stack traces to diagnose errors, either originating in or bubbling up through third party code.
If you want to call me a liar, fine, but otherwise don't call it a hypothetical situation that will never occur if I'm telling you that I have run into countless such situations.
I have been using Go and I have defended it on many occasions. But this sort of obstinate reaction of denial and deflection to the most obvious difficulties caused by any of Go's design decisions is really starting to put me off big time.
>But you are quite literally maintaining a fork. In 99% of the cases there will be no need to modify the code
Inserting error handling code is a modification and keeping it up-to-date with new versions of the library is the problem I'm talking about. I think I have made that very clear before.
> it infuriates me when third-party stuff in other languages throws exceptions and you end up needing to check everything anyway.
exceptions are a curse.
Anything can panic in Go; the language gives absolutely no guarantee that something cannot panic. Errors as values are just a convention. So exceptions may be a curse, but Go has an inferior exception system, panics, and they are still exceptions.
It's fine if you think exceptions are bad, but throwing out the ability to easily determine where an error occurred (and might I add, from having seen go code in practice, having humans just become meat-based manual exception-handlers bubbling errors up to the top) is throwing the baby and the bathtub and the entire bathroom out with the bath water.
Which to be clear, is a language design mistake. Go would be a much more productive and friendly language if errors would just have stack traces on them when they're created. There's no reason to have the poor developer adding dozens of debug prints all over his (and sometimes third-party) code manually into every place that could have returned the error just to try and discern where it comes from. Or grepping for strings in the error message, or throwing darts at printouts of the source. That's just a massive step backwards from what we've had since the 80s and 90s. I know Go prides itself on being simple, but really the errors are too simple and we're all worse off because of it.
The problem is that Go errors are just a convention; they are not a feature of Go. It's just an interface; it could have been called FOO and it wouldn't have made a difference. Go has exceptions, they are called "panic", but they are inferior to Java's in the way they are handled.
I know they're just a convention, but that was a mistake. It should have had more first-class language support so that they have stacktraces at least.
I'm seriously beginning to think that even with the limitations, panic is a better error handling mechanism for all exceptional error cases. At least it has stacktraces and doesn't clutter the code with error boilerplate. I don't think I'd be able to convince anyone of that on a real project though, I'd have to try it on a personal project.
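A sketch of that idea: panic internally for exceptional cases, and recover at the package boundary so callers still see a conventional error. Names here are illustrative.

```go
package main

import "fmt"

// do runs fn and converts any panic back into an error at the
// boundary: internal code gets panic's free stack context, while
// callers still receive an ordinary error value.
func do(fn func()) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	fn()
	return nil
}

func main() {
	err := do(func() { panic("disk on fire") })
	fmt.Println(err) // recovered: disk on fire
}
```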
Nice can't wait to run some of our benchmarks against this. Go has the awesome property of always becoming a little bit faster every release. It's like your code becomes better without doing anything. Love it :)
> Type enforcement can be static, catching potential errors at compile time, or dynamic, associating type information with values at run-time and consulting them as needed to detect imminent errors, or a combination of both.
interface{} is type-checked at runtime. It's type-safe because you can't, e.g., fish an integer out of an interface{} value that represents a string. The runtime won't allow it.
You can either extract a string or the runtime will crash if you insist on extracting anything else. Unless you use unsafe package, in which case you explicitly want to skip type-safety.
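Concretely, the two assertion forms behave like this (a self-contained sketch):

```go
package main

import "fmt"

func main() {
	var v interface{} = "hello"

	// Comma-ok form: safe, reports failure instead of crashing.
	n, ok := v.(int)
	fmt.Println(n, ok) // 0 false

	// Single-value form: the runtime enforces the type and panics
	// on a mismatch. Checked, but only at run time.
	defer func() {
		if r := recover(); r != nil {
			fmt.Println("recovered:", r)
		}
	}()
	_ = v.(int) // panics: interface conversion
}
```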
When "type safe" is mentioned without qualification, it almost always refers to static type safety. This is one of those times. So no, the sync.Map container is not typesafe like a regular map is.
"Static type safety" will generally mean type safety established at compile time. In this context, "static" tends to mean "compile time" and "dynamic" tends to mean "runtime".
For example, one might say "static code analysis" to mean analyzing code without running it, such as during a phase during compilation. In contrast, "dynamic code analysis" tends to mean actually running the code and making decisions based on what happens at runtime, such as in JIT (https://en.wikipedia.org/wiki/Just-in-time_compilation) techniques that identify hotspots.
According to whom? Certainly not according to wikipedia.
Given that it's a pretty big distinction, I would think it's on the speaker to be unambiguous and say "it's not statically type safe" vs. the ambiguous "type safe".
I've certainly seen my share of people claiming that "interface{} is just like void * in C" when they speak about Go's (lack of) type safety.
I also don't see how insisting on accurate and unambiguous terminology ticks people off so much that they downvote. I imagine they think I said something much more incorrect than I did.
"type safe" has come to mean "statically type safe" over time in common conversation. You were downvoted because this intended usage was clear to the people who downvoted you and their perception of your comment was that it provided no value, e.g. was a nitpick.
I would note that the Wikipedia page does not take as strong a position as you seem to imply, reading:
> In the context of static (compile-time) type systems, type safety usually involves (among other things) a guarantee that the eventual value of any expression will be a legitimate member of that expression's static type. The precise requirement is more subtle than this — see, for example, subtype and polymorphism for complications.
Since golang is statically typed, type safety is generally understood to mean static type safety.
Well, non-type safety isn't really worth discussing. The big differentiators these days are: does it fail at compile time or run time? This is emphatically the latter category.
I googled a little bit and found some good info, I guess I had forgotten a little bit of the concepts of mutex fairness/unfairness. I found a very nice explanation on cs.stackexchange:
"My understanding is that most popular implementations of a mutex (e.g. std::mutex in C++) do not guarantee fairness -- that is, they do not guarantee that in instances of contention, the lock will be acquired by threads in the order that they called lock(). In fact, it is even possible (although hopefully uncommon) that in cases of high contention, some of the threads waiting to acquire the mutex might never acquire it."
With that computer science clarification, I think the comment "Mutex is now more fair" and the detailed description "Unfair wait time is now limited to 1ms" makes it a lot clearer.
Great improvement I think! It's one of those things that you don't notice until you have a bug, but it's really nice to never get that bug in the first place. =)
I was curious so I downloaded Go 1.4, and tested against 1.9 on an old version of a project I have (backend for Mafia Watch), about 30K lines of Go, including a few 3rd party dependencies.
Go 1.4: Around 2.1s
Go 1.9: Around 2.5s
So within 20% of 1.4, not bad. That's on an old MacBook Air, dual core 1.7 GHz i7, 8GB ram.
And of course the binary performance and GC pause times w/ 1.9 will be much better.
Dave posted one about halfway through the release cycle, showing a small improvement. That was before the parallel-function compilation though, so things might have gotten better since then.
This has been discussed and the discussion derailed very quickly ("Go is a joke, my language has it bigger, blah blah".)
Reality is that the Linux kernel conflates processes and threads in its userspace APIs.
Locking to threads is a solution that works, but it also sucks and defeats the niceties of Go's N:M model. But that's the only way: if you use those broken system-call APIs, you should know better.
So this new concurrent map? Am I right in understanding it's designed for cases where you have a map shared between goroutines but where each goroutine essentially owns some subset of the keys in the map?
So basically it's designed for cases like 'I have N goroutines and each one owns 1/N keys'?
Also, do you have any good references to proper best practices around concurrent and parallel programming (in Go)? Just basic things: code I can copy and paste without it having obscure race conditions, because that use of a mutex is absolutely correct, and something that lets me understand the limitations. I feel like it is very easy to do things "wrong" or not notice some edge cases. In C++ I only ever coded single-threaded for this reason. Too many gotchas. Any help would be appreciated.
Locking is necessary for reads too, unless no additional writing will be done. If you initialize/write into a map and then later only have concurrent reads from it, the program will run. If you try to write in the midst of this, it will crash.
A concurrent map, in my understanding, is a map that can be accessed concurrently without explicit synchronization, not one where each goroutine owns a piece of it. Check Java's ConcurrentHashMap.
Now make that a compile time thing that happens on imports and that doesn't generate temporary source files and whoop you have modules with generics. Oops, I forgot that there are some unsolvable obstacles to be resolved.
I have been wondering for quite some time how one could implement that. That is, not in the sense of how to code that, but rather how few changes one would have to make to the language to get as many effects as possible in the direction of genericity.

There is some inspiration in Lua, Scheme48 and some dialects of ML (functors, I believe?) in the form of "higher-order modules", where the module (would be package in Go?) could have parameters that you'd have to supply when importing it. The things you could obviously supply would be at least constants, functions and types. (One might look at functions as types of computational processes, though, and at function signatures/interfaces as their respective type classes. This perspective could subsume functions as types, and perhaps integers as nullary functions returning an integer.)

The question is how to reasonably do the import strings. Good thing about Go is that you already have provisions in the language in the sense that the string can be technically arbitrary. A subset of the reflection interface could additionally be evaluated at compile time to provide for ad-hoc specializations of generic code by writing straightforward code that would be easily eliminated/specialized in a module instantiation (like loops over struct fields etc.)
The interface to the feature is perhaps more important than the complexity of the implementation because it will affect many more people - only a few programmers will work on the compiler but tens of thousands of programmers will be writing code using it. I make no claims as to how complex this would be to implement, but it probably wouldn't stand out. The interesting thing is that this shouldn't necessitate any changes in the language of generic modules (no <>s and such). It merely parameterizes some types and constants in a module. As such, after a certain phase in compilation, the process is the same as for a non-generic module so perhaps it's a low-complexity change in the implementation, too (not just in the language spec).
You could, but the strings are not the language proper. There's, e.g., no relational expressions in your import strings (yet!), so there's no ambiguity in parsing it. I'd actually use parentheses anyway, since type parameters are still parameters (this could give the parameter list a "Pythonic" syntax which has been shown to work well already).
I don't fully understand your comment ("no relational expressions in your import strings") but parentheses are valid characters in filenames (thus URLs). I don't think Python is the best reference here.
[1] You can git-clone and try yourself: https://github.com/gravitational/teleport