Go 1.3+ Compiler Overhaul (docs.google.com)
210 points by geetarista on Dec 19, 2013 | hide | past | favorite | 172 comments



While I agree with the end goal here, isn't 60k lines of C actually a rather small program to attempt to port in a one shot effort? There will also be an inevitable clean up phase since the Go generated is less than likely to be idiomatic.

For me the really fun bit is getting to having the whole runtime in Go, which would be a situation not dissimilar from classic Smalltalk environments - especially writing the GC in the target language.


60k of very subtle, complex, intertwined C code is actually quite a lot.

Porting the runtime to Go is also on the drawing board, but later. Right now the compiler complexity makes it harder to work on, hence porting it first is going to pay off quicker.


Not saying this is all that matters, but if you have a verifiable C-to-Go translator (that is to say, a well-understood and tested translator known ("proven") to give you an output program that behaves identically to the input program), then you may be able to skip (or more likely alleviate) the awkward phase of bug introduction when translating a program. I think a problem like this is one of the cases where it's easier to reason about in the abstract ("Clearly all arrays should be converted in this fashion") than the concrete ("in this part of the code we should do this with the array"), and it also avoids the inevitable temptation of "fixing" or "cleaning things up" during the translation. Also, of course, there is a huge incentive for a good C-to-Go translator to exist for the Go community.


> which would be a situation not dissimilar from classic Smalltalk environments - especially writing the GC in the target language.

Like the Jikes RVM? IBM Research did some really cool stuff. Anyways, anything is possible when you go meta.


I am looking forward to seeing HotSpot replaced by Graal, post Java 8.

Let's see if it really happens.


For comparison, the original Rust compiler was written in OCaml. AIUI, the Rust compiler build now has three stages: a blessed binary snapshot of Rust+LLVM (stage0) compiles and tests a new build (stage1) from source, then (as a bootstrap sanity check) the stage1 compiles and tests a new build (stage2) from the same sources. Needless to say, this is a time-consuming process. :)


Does that mean you have to change the OCaml sources to change the language? Why not develop Rust in Rust? That would be truly eating your own dogfood.


It is now purely in Rust (with a few thousand lines of C, mostly external libraries or tiny wrappers to interact with the OS, where the API is defined in terms of C macros) + the LLVM optimiser.

The OCaml compiler was just written to get an implementation of the language that was good enough in which to write a compiler. The OCaml source was deleted more than 2.5 years ago and Rust has been properly bootstrapping/dogfooding for that long.


So the current state is: Rust is developed in Rust. Go is still developed in C, but they plan to gradually change that.

So it's Go that still doesn't eat its own dogfood. I always wondered why Go was never functional enough to be used in a lot of the scenarios where C is used. With all these details, it's clearer now.


The whole Go standard library, which is more code than just a compiler (and it even includes a complete Go AST parser as well), is developed in Go ("dogfooded") and so are the documentation tools (godoc, the web site, present) and the static code analysis tools (SSA, oracle, go/types etc.) in the go.tools repository.


Is there a place where I can read the source of the OCaml Rust compiler?


All 38k lines were deleted in https://github.com/mozilla/rust/commit/6997adf76342b7a6fe03c... , so the most recent revision is https://github.com/mozilla/rust/tree/ef75860a0a72f79f97216f8...

(The "Rust" that the OCaml compiler handled is very very different to modern Rust, fwiw.)


It's in the boot/ directory here if you go back far enough in time: https://github.com/mozilla/rust

Sadly the GitHub Web UI seems to cap viewable history at 100 pages' worth of commits.


This definitely seems like the right decision. If the C# compiler had been written in C# from nearly the beginning, I think it would have heavily influenced language and infrastructure development going forward.

It's much better to be in the position now, when the language is barely used, than to have to support it later.


The mono C# compiler has been written in C# from the start, and this proved to be very helpful. I remember at the conference where Anders Hejlsberg first discussed Roslyn (which encompasses rewriting the microsoft C# compiler in C#), Miguel de Icaza was able to show some of the same ideas he was discussing actually already running on Mono. I don't think Miguel even knew ahead of time; it was just something the mono people had done.


I remember Anders saying that they just started with C++, because of the existing toolchain.

If it was today they would have done otherwise, based on current experience.

Well, at least Roslyn is going to be available in an upcoming .NET version.


Besides the immediate appeal of bootstrapped compilers, we're also going to have a tool to turn C code into Go code when this is done. I imagine this might breathe some new life into some stagnant projects.


It might be useful, but I suspect it will still be a lot of work to expand the translator to be generally applicable. It's a fair amount of work just to get such a translator to operate on a specific code base.


The C-to-Go transmogrifier (my phone's autocorrection of "translator" :) will be an interesting artifact, but what other C programs are good candidates for Go? Does a Python-to-Go translator exist? Now that could be a very interesting tipping point for Go.


The fact that Python is dynamically typed would hurt. I suspect you'd end up with a lot of `interface{}` everywhere.
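To illustrate, here's a hypothetical sketch (not the output of any real translator) of what naively translated dynamically typed code tends to look like in Go: everything becomes interface{}, and every operation needs a type switch or assertion.

    package main

    import "fmt"

    // add mimics a dynamically typed function: it accepts anything and
    // dispatches on the runtime type, the way translated Python code would.
    func add(a, b interface{}) interface{} {
        switch x := a.(type) {
        case int:
            return x + b.(int)
        case string:
            return x + b.(string)
        default:
            panic(fmt.Sprintf("unsupported type %T", a))
        }
    }

    func main() {
        fmt.Println(add(1, 2))         // 3
        fmt.Println(add("foo", "bar")) // foobar
    }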


Well, given that there is a Python-to-C translator (As long as your Python code is of the Cython variant).....


Interesting development, but to be a bit crusty, statements like:

"It is easier to write correct Go code than to write correct C code."

and

"It is easier to debug incorrect Go code than to debug incorrect C code."

and

"Go is much more fun to use than C."

Really are very subjective and don't add much beyond setting a POV for reading the article.


I've been a C programmer since '93-94, and it is my absolute favorite language, and I'm not sure how anyone could suggest that any of those 3 statements are false. Large-scale correct C code is very difficult to build and debug.

Subjective or not, since the whole project is about eliminating C from the codebase, they seem very relevant as well.


In the last three years, for the day job, I've written code in C, C++, Objective-C, Java, NodeJS, and C#. A recent project that was back to C was really a breath of fresh air. Simple syntax; sure, you have to pay attention to memory, but you have full control, and at the same time very little is hidden from you. For personal projects, I prefer Python but dabble in other things as needed.

I evaluate the task and the language options. If I needed to prototype a web service where the server was under my control, I'd likely shy away from C/C++. In cases where I was writing a module, it would depend on the architecture/framework. For anything low-level or system-related, C is my go-to, well above C++.

Stakeholders, external components, and team factor in.

I do love how simple and straight forward C is. Calling it hard to debug or write code correctly for is wholly based on the person making such statements.


> I do love how simple and straight forward C is. Calling it hard to debug or write code correctly for is wholly based on the person making such statements.

The statements were that Go is easier to write and debug than C, not that C is hard. Considering that Go is memory safe and a significantly simpler language, it's hard to argue otherwise. As for the person making those statements, you should take a look at Russ Cox's resume.

> Stakeholders, external components, and team factor in.

Indeed they do. In this case, the stakeholders use Go, the external components are written in Go, and the team is the Go team. Go seems like a good fit.


The context of the link and the intended audience (go contributors) are not obvious within the context of a link posted to HN. Sure, you can read through it and look around, but most will read the initial document with their own mindset and experience, so a lack of context is problematic.


I can see how lack of context is problematic for people who simply must share their opinions without delay.

But even without the context, your comments are in response to a misreading of a very small part of the overall document. Hardly seems worth your time.


I am alarmed by anyone who believes C code is easier to build and debug than a modern memory-safe language. C does indeed make it pleasant to build programs that are fast and seem resilient. But the best secure C programmers in the world still routinely manage to get things wrong. qmail, for instance, had an LP64 overflow.


You mean the first 2 I take it. It's pretty obvious fun-level is blatantly subjective.


I have a good knowledge of C from the last few years and recently started programming in Go. I find the latter much more fun.


Those statements are entirely subjective and, frankly, come off as filled with more than a little hubris, not to mention an air of pettiness and childish language-flamewar-bait.

Correct large-scale programs are very difficult to build and debug, full-stop. It isn't a "feature" unique to C, or any other language.

Likewise for correct code, and incorrect code: they're both actually trivial to do in any language in the small scale, and see above for the large scale.

That said, I view the effort to make a Go compiler using Go as something akin to "eating your own dogfood."


So... it's worth keeping in mind that some of the creators of Go are some of the same people who helped create C and UNIX (Rob Pike and Ken Thompson, specifically, are two of the three founders of Go).

They have earned the right to "hubris". And, if they believe they've made a better, easier to debug, more fun language than C (and they do believe that, as it's clear based on their own discussions of the language), I don't think I'd feel any right to argue with them.


Consider the audience: this document was written for the contributors to the Go project. It wasn't intended to be on the front page of Hacker News. It's not trying to convince anyone that Go is good and C is bad. It's just explaining the rationale for the move to Go. For the intended audience, these statements are pretty much self-evident, or we wouldn't be working on Go in the first place.

Also I think you severely underestimate the benefits of memory safety.


C is around 40 years old. Go is about 4 years old. If Go isn't easier to write correctly, easier to debug, and more fun to write, then it's got to be the most tragic and wasted programming language effort ever.


Uh... It's Rob Pike and Ken Thompson...


Russ Cox wrote the document, and he's written more than his fair share of C code.


It's an introduction to a Go change proposal, stating a premise that the Go devs share even if you don't; it's like the intro to a C++ change proposal referring to C++ as a good embedded-systems language and the STL as a versatile, easy-to-use data structures library. You can argue, but that's not really part of the conversation about the change any more; you're going off and starting your own conversation about what languages people like.

(The plan to transition over by automatic translation, not the bullet points up top, is the interesting part of the document to me. I hope they're able to achieve it without too-serious compiler-performance regressions, since I really like the zippy compilation I get now.)


I don't know why you're surprised or grumpy at the fact that someone who did a lot of work on designing and implementing a language likes that language more than others.


I don't really get the motivation behind this effort. I'm not very familiar with the Go compiler, but it seems like make-work.

> It is easier to write correct Go code than to write correct C code.

But the correct C code is already written.

> It is easier to debug incorrect Go code than to debug incorrect C code.

Is the C code incorrect? How much debugging is left to be done?

> Work on a Go compiler necessarily requires a good understanding of Go. Implementing the compiler in C adds an unnecessary second requirement.

The compiler is already implemented in C.

> Go makes parallel execution trivial compared to C.

Is it not trivial to run multiple C compilers in different processes?


> Is it not trivial to run multiple C compilers in different processes?

I think the design doc is talking about using shared-memory concurrency to do parallel codegen on a per-function basis (like some LLVM patches are experimenting with doing). In this regard it's easier to set up the infrastructure needed to farm out concurrent tasks in Go, because C has neither built-in channels nor the generics to conveniently build them.
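Roughly the sort of thing that becomes easy with goroutines and channels; this is a hypothetical sketch, not the actual compiler, and fn/codegen/compileAll are made-up stand-ins:

    package main

    import (
        "fmt"
        "sync"
    )

    type fn struct{ name string }

    // codegen stands in for real per-function code generation.
    func codegen(f fn) string { return "code for " + f.name }

    // compileAll farms per-function codegen out to a fixed pool of worker
    // goroutines that share the function list and the output slice.
    func compileAll(fns []fn, workers int) []string {
        jobs := make(chan int)
        out := make([]string, len(fns))
        var wg sync.WaitGroup
        for w := 0; w < workers; w++ {
            wg.Add(1)
            go func() {
                defer wg.Done()
                for i := range jobs {
                    out[i] = codegen(fns[i]) // each index is written by exactly one worker
                }
            }()
        }
        for i := range fns {
            jobs <- i
        }
        close(jobs)
        wg.Wait()
        return out
    }

    func main() {
        fmt.Println(compileAll([]fn{{"main"}, {"init"}, {"helper"}}, 2))
    }

Getting the equivalent in C means hand-rolling a work queue with threads, mutexes, and condition variables, which is exactly the boilerplate the design doc wants to avoid.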


The compiler is still in a relatively early stage of its development. At this point we feel we can make faster progress and attract more contributors if it is written in Go.

(Go programmers like to write Go code, who'd have thought? :-)


Well, "attract more contributors if it is written in Go" I think that it isn't the point really. In fact, I think it's more easier if you're using C than or even C++. The question about C it's much error-prone language which would make development slow than using a more modern language with the C string and malloc()s pain. Not mentioned that it's cool a Go compiler written in Go and C# compiler written in C# :)


Good point about attracting developers. I guess Ken and Rob will eventually want to seek other challenges and leave Go to the youngsters and free software community.


Neither Ken nor Rob actively work on the compilers today. There are about 6 people in the Go community that regularly hack on the compiler, yet the project has more than 100 active contributors. One goal of this change is to make the compiler sources more accessible to them.


I think they left off an important reason, which is that if an engineer is spending most of their time working on Go, and Go's written in C, then they're making decisions about Go without actually spending a lot of time writing Go programs. Better if the people driving Go are spending all day writing Go programs.


The compiler certainly has bugs. It has features yet to be added. It could be faster, use less memory, and be friendlier.

Your arguments would hold a little weight maybe if there were no plans to continue Go development, but I doubt that is the case.


On HN we get tons of "why we moved from language/platform X to language/platform Y" articles. They are fun to read, but 99% of it should be read as a list of post factum rationalizations of irrational decisions.

You have to give the Go team credit for telling the truth in their last bullet point:

Go is much more fun to use than C

I actually think the other bullet points are correct. I just don't believe they are the reason for making the decision.


I can't remember where[1], but someone said something like "if I did write this in C I would never dare to touch it ever again". You don't want to feel stuck in a dead-end.

[1] I think it was the scala compiler video rant recently published, or maybe SPJ on GHC.


If you find it please say so, that sounds like something I'd be interested in reading.


(Update: TLDR --> Modifying the Dick Sites quote from the original article: "I would rather write programs to help me refactor/transform programs than refactor/transform programs." So, a programming language highly amenable to that would be very powerful).

I just had an insight while reading this. It's a powerful insight. (I am surely not the first person to have this insight, but it's WAY ahead of being mainstream).

A programming language that could be PROGRAMMATICALLY REFACTORED would be a HUGE home run.

I know Lispers will jump in here. But I know Lisp (somewhat), and it does not have what I'm talking about. (Hell, maybe Go does ... if so, I finally get what the fuss is about).

The idea is that you could develop a huge codebase in this language, and then you could write code (probably in some other language) that REFACTORS the original codebase. Of course, you CAN write translators, as the original article mentions. But I want a whole new programming language that is designed from the GROUND UP to be amenable to programmatic transformation.

NOT just Lisp with its AST's and macros. The "transformation language" should be able to understand the following aspects of the original code: its modularity, its test coverage, which parts are functionally pure/impure, which parts are parallel/not, and the full compiler-level semantics of every piece of text in the code (in other words, what JetBrains knows ... this is a local variable name, this is a function name, etc.)

In other words, the original language has to capture more of the programmer's intent (probably by inferring most of it). The intent-information is mostly or completely unneeded at runtime, but it is VITAL at automated translation time.

Imagine having such a codebase, and being able to pull up a REPL and interactively start changing the modularity of the code by issuing commands. Or telling the transformation system to parallelize some portion of code that wasn't parallelized previously. Or saying something like, "Take all the code snippets that instantiate the xyz data structure, and change them to call this function instead."

Don't miss my point -- we have features like this here and there. Some IDE's more than others, some languages more than others. Nothing new there. But I'm talking about a new programming language designed from the ground up to be highly amenable to this kind of interactive, automated transformation. In other words, I think this might be the killer feature that allows one programming language to outcompete most others.

This might be the one programming language feature that we should focus on now, more than any other.


This is exactly the field that I'm working on, so I'm quite interested in this.

In my opinion, the problem is more of a tools problem than a language problem: one of the languages with the most interesting support for this kind of thing is C, with Coccinelle [1] [2].

Google has been working on absolutely amazing tools for C++ that apply programmatic refactorings in a distributed manner on absolutely enormous codebases [3].

This IMHO shows that this is a tools issue rather than a language issue. Coccinelle had to develop a swath of parsers for its project, and the Google project is using Clang as a basis for structural and semantic capabilities. C and C++, with their fragile type systems, preprocessor, and horribly hard to parse syntax, might be the worst languages amongst typed languages to develop such projects on. And in spite of that, they are the ones for which such projects exist, because they are the ones with enough need for such tools.

Of course some languages are more amenable to such tools than others. Very strong static typing and a solid package system helps a lot. No metaprogramming helps too (because you don't have to handle the transformation).

I'm working on such tools for Ada, that is pretty much the perfect language for this as far as imperative languages go. The essential need for such tools is to have a compiler that exposes some services as an API, most notably the ability to explore the AST and query cross references for language entities. This was the fantastic insight of Clang/LLVM in my opinion.

[1] Coccinelle semantic patch language : http://lwn.net/Articles/315686/

[2] Presentation on refactoring with coccinelle http://video.rmll.info/videos/coccinelle-automated-refactori...

[3] Clang MapReduce -- Automatic C++ Refactoring at Google Scale http://www.youtube.com/watch?v=mVbDzTM21BQ


Interesting! I knew about clang and what Google is doing there. It is definitely related to the point I'm trying to convey.

I'll be taking a close look at Coccinelle. Thanks for the link.

And as you say, C and C++ "might be the worst languages amongst typed languages to develop such projects on," due to things like the preprocessor, hard to parse syntax, etc. So I'm saying, perhaps it is time to invent the programming language that is most amenable to this.

It's amazing what we can do if we put our best engineers onto the hard problem of achieving this with C / C++ code. Imagine what might be possible if we designed a language from scratch with the primary goal of making this easy. As far as I know, no programming language has ever been designed with this as its primary goal (I'd love to be corrected if I'm wrong).


I think it was either Grace Hopper or Barbara Liskov who once said that C's entry into mainstream computing set compiler development back to the stone age.


> A programming language that could be PROGRAMMATICALLY REFACTORED would be a HUGE home run.

The Go people are poking at this. They ship a tool, 'go fix', that can safely refactor your code to deal with backwards-incompatible API changes. Go also has a tool, 'gofmt', that most people use to enforce tabs vs. spaces, etc., but which has a 'rewrite' flag you can use to do simple, arbitrary transformations.
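For example, single-character lowercase names act as wildcards in a gofmt rewrite rule, so something like this (the path is just an example) rewrites explicit slice upper bounds to the implicit form across a source tree:

    $ gofmt -r 'a[b:len(a)] -> a[b:]' -w src/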

They also ship some AST tools for Go, though I haven't seen much use made of them.


One recent use of the Go parsing and printing packages is goimports, a tool that automatically adds and removes import statements based on the code you've written.

Try loading this snippet, checking the "Import" box, and clicking "Format": http://play.golang.org/p/jS4s_Xz26v

To use it locally:

    $ go get code.google.com/p/go.tools/cmd/goimports


I haven't delved into Go, but have occasionally "glanced" at it with curiosity. One of the most powerful principles I got a whiff of was that it was designed up front with the goal of providing a powerful tool chain -- an open parsing library and the like.

That provides some incredible power and may be the thing about Go that impresses me the most.

What if we also made programming language decisions about syntax, modularity constructs, etc., all with that goal in mind? (And perhaps Go has to a large degree).


I was on play.golang.org last night and didn't see that button.

When did that get added? That's brilliant!


We added it this week. :-)


Clojure has Slamhound. Yeah, I know you didn't want to hear about LISP...


Plus, I love Lisp! It even has a lot of the characteristics that would allow what I'm talking about. I was just trying to articulate that what I'm saying requires more than just SEXP's and Lisp macros (although that does go a long way).

And Slamhound is definitely barking up the same tree. Very interesting!


Yeah, getting from SEXPs and LISP macros to a full refactor-aware environment is like getting to Riemann surfaces from set theory.*

*The small problem with this example is that in fact the latter has been achieved.


I suspect that this is a very clever joke. And possibly that I might be the butt of the joke. But I suspect that even so, the joke is so good that I'd still laugh heartily at it.

Unfortunately, I freely admit that the joke goes right over my head. I used to know something about Riemann surfaces, so I could invest some time and eventually get the joke perhaps, but I just don't have time now. :)


Sorry, the chain is just

Set Theory → Natural Numbers → Integers → Rationals → Real Numbers → Complex Numbers → Calculus → Riemann Surfaces

(as the Apple ads say, some steps skipped)

The only point I was making was that the level you were thinking at was much higher than the level SEXPs are at. Trying to envisage how we get there remains hard. TL;DR I was agreeing with you. :)


ennef is not charlieflowers (in fact he is Andrew Gerrand, one of the official Go devs).


Can this become part of Gofmt triggered by a flag? I love it, and feel it should really be part of an installation.


Goimports is a fork of gofmt, so you can use it as a drop-in replacement. We're considering whether to merge the functionality back into gofmt. Goimports is still pretty experimental, so this might not happen until Go 1.4, if it does at all.


Heck, I'd be thrilled to have a smart identifier/type renamer. Maybe it's just me, but I like things to be named properly, though I tend to be very bad at it until the 3rd or 4th use. I also like short(ish) names, so many variables tend to have similar names across scopes, which makes it impossible to do a quick replace-all.

You should be able to specify a type name, identifier, or module and its replacement (optionally file & line to be specific) via the CLI. It should do the replacement automatically, fail if it changes the semantics at all (e.g. a clash or variable shadowing), and give an overridable warning on a non-semantics-changing possible problem (like an unused shadowed variable).

Sure, this is a normal feature for an IDE, but I feel like it should be simpler (and I don't like IDE bloat anyways). Maybe the tools you mentioned can already do this.


http://golang.org/pkg/go/ast/

This is a pretty powerful package.
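A minimal sketch of what it enables (the demo source and file name are made up): parse some Go with go/parser and walk the tree with ast.Inspect to list the function declarations:

    package main

    import (
        "fmt"
        "go/ast"
        "go/parser"
        "go/token"
    )

    const src = `package demo
    func Hello() {}
    func world() {}
    `

    func main() {
        fset := token.NewFileSet()
        f, err := parser.ParseFile(fset, "demo.go", src, 0)
        if err != nil {
            panic(err)
        }
        // Visit every node and print the name of each function declaration.
        ast.Inspect(f, func(n ast.Node) bool {
            if fn, ok := n.(*ast.FuncDecl); ok {
                fmt.Println(fn.Name.Name)
            }
            return true
        })
    }

From there it's a short step to tools like gofix and goimports that rewrite the tree and print it back out.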


This is the main thing functional programming theorists get excited about when going off the deep end about sufficiently smart compilers. (That and "proving a program is correct" which is a related problem). Basically all FP optimisations would instead work by automatically transforming a program.

The idea behind all these type systems is really to allow automated transforms which preserve the integrity of the program.


It's really too bad that Haskell, with its referential transparency, doesn't have a way to rewrite things programmatically (well, none that I know of). Rewriting terms would be great in a language like that. It seems that when a typeclass has some laws, and you want to show that an instance of it obeys the laws, you have to bust out the pen and paper. Nothing wrong with that, but it seems like a language like that could support computer-aided rewriting.

Then there are languages whose evaluation is based on term rewriting, like the language Pure.

EDIT: You do have HLint, and the bot that rewrites expressions to point-free style.


It's close: http://www.haskell.org/ghc/docs/7.0.1/html/users_guide/rewri...

I acknowledge that is not precisely what you were referring to, but it is close.


I agree that functionally pure code could allow for amazing programmatic rewriting. Imagine coupling that with homoiconic syntax. Not necessarily to have Lisp-style macros, but rather so you could offer repl-based programmatic rewriting/refactoring tools, where the tools know that your code is pure and thus can be "reasoned about mathematically."


Alright, looks like I may be downvoted to oblivion. I still believe I'm on to something.

Would any downvoters care to take a minute to share what you think I'm missing? Seems like the polite thing to do. I know you're busy, so make it short and sweet if you have to, but if I'm wrong/naive/ignorant, please, help me out.


You're going on and on about how huge and important and valuable this insight is, like a kid saying "look how smart I am"; I suspect people are downvoting because of the tone rather than the content. Hell, the GRATUITOUS CAPITAL LETTERS in your post justify a downvote by themselves.

I don't think it's a great idea. It's been tried in various ways (you assert that C# wasn't designed from the ground up to be automatically refactored, but actually easy manipulation of the code by IDEs was a key design goal - not the only goal to be sure, but a major one), and it doesn't seem to have worked. But don't let me stop you; try and build something useful out of this. I suspect you'll quickly see the problems with the idea, but if it actually works then we'll all be very grateful.


Ouch. Sorry you read it that way. I don't mean it as "look how smart I am." I mean it as "wow, this idea excites me."

I think C# has more refactorability than most languages I've been exposed to. But I'm intrigued by the question of how much further could we go if we made that a primary goal.

Granted, it's just the merest kernel of an idea, and it's an extremely ambitious one at that. Not something I'm likely to tackle on my own -- more like a thought experiment at this point. Even if it failed, the attempt would be illuminating perhaps.

Anyway, thanks for leaving a comment.


I didn't read it that way. Not sure why someone showing enthusiasm would get downvoted here, but HN has a strange crowd.


If you are serious and passionate about this idea, I recommend digging in deeper and thinking of more specific examples of how you can improve on the status quo. You say that you want to design from the ground up to be more transformable than anything else, but without specifics this is just a goal, and it's not clear that a new language is necessary or would buy you anything.

For example, take your idea about wanting to be able to replace all instantiations with a function call instead. There's no need to invent a new language for this; the AST of a program in most any language would have enough information to write this refactoring capability into a tool. The hard part would be expressing the refactoring itself; how do you specify exactly the pattern you are trying to match, and exactly what you want to rewrite it to, in a way that is easy to use?

Maybe some other ideas you have would require a new language, but it's hard to say unless you are more specific about what you want. You might find that a lot of your ideas are implementable on top of existing systems. Designing "from the ground up" has a certain appeal to it, but to justify the extreme amount of effort a new language would take, I think you'd want to be more specific about what the clean break would buy you.


Thanks for the comment.

I agree that the AST of a program provides the raw material that could empower these kinds of transformations. But certain language features make it hard (like the C preprocessor, to cherry pick an example).

And I guess I'm exploring the question: What would a language look like whose goal was to make it easy, and to increase the set of possible automatic refactorings?

Maybe it's just a thought experiment that won't lead anywhere concrete. But to me, it's a very interesting thought experiment that I've never heard posed before.


I still believe I'm on to something.

I didn't vote at all, but you state:

In other words, the original language has to capture more of the programmer's intent (probably by inferring most of it).

But I think that your comment fails to provide (1) an argument why homoiconic languages such as Lisp and Prolog do not fit the bill, since they are easy to rewrite (code is data); and (2) how this hypothetical language differs from languages that are easy to parse and have simple semantics, such as Java and C#, if most of the intent needs to be inferred.

I'd rather like to see better libraries for e.g. Java (Groovy provides the REPL :)) to do this, than yet another language.


(1) Homoiconic languages are actually a smaller scale instance of exactly what I'm talking about. They were designed from the ground up to allow for compile-time code rewriting, and because that was the primary driving goal from the start, they have amazing code rewriting capabilities that other languages cannot begin to dream of.

But the distinction I was trying to draw is that I'm talking about going further with that idea. Homoiconic syntax only allows for local code re-writing, but there are many other kinds of things we'd like to programmatically transform on a large codebase.

Half-baked examples: We'd like to replace all use of new/delete with smart pointers, but only in source files that are in one particular section of the codebase. Or we'd like to move all our unit tests from being in a separate package/dll, to being included in the same dll but surrounded by compiler pragmas.

(2) How the language would differ -- I don't yet know. But just as homoiconicity says "let's allow our end goal of automatic code rewriting to dictate our syntax", what I'm trying to say is, "let's allow our end goal of automatic refactorability / transformability to dictate all the programming language design decisions we make."

I think probably many of the things Raphael_Amiard mentions in this thread would be a part of it: strong inferred typing, perhaps, and a strong package system. Perhaps the ability to write some code that was intentionally agnostic about how it would be packaged, and write some code elsewhere that says "take that code over there, and package it thusly." It would need to know if certain snippets were functionally pure (sort of how D does with "pure nothrow"), because then it would know it had more freedom with how it could transform that code.

Hopefully that clarifies the idea. It's hand wavy and high level, but it's a new (to me) idea that's only at the hand wavy stage so far.


Here's my question. What does "ground up" buy you? It is ideologically good, for sure. But it requires a viewpoint about future programming that is similar to existing programming: we'll write systems that are hard to maintain and are burdened with a lot of code doing relatively low-level data manipulations whose implementation can be automated.

But if our future programming environments are "smaller and sleeker" in all respects(as is anticipated by, for example, the VPRI work on STEPS) this would be the wrong optimization to make. The cost of maintenance will go down across the board because the new languages let us express the change with less effort, and then complicated refactoring becomes less necessary again.

(Counterpoint: We just build even more complex systems and then need better refactoring tools. An endless cycle...)


I love a lot of the principles behind VPRI and STEPS. But it strikes me as a very futuristic ideal, not something we're close to being able to use concretely and productively today.

And I get the irony! The exact same thing is true of my post. But I think I'm not trying to shoot as high as you are.

If we can find a way to make the VPRI/STEPS vision a reality, I'm all for it. I think what I'm proposing is a less ambitious interim step.


There is a reason there are so many languages extant. They each serve a purpose.

What exactly do you think you're on to? Insight has built all of the things that you're hoping to replace. So what are you bringing to the table?


> Insight has built all of the things that you're hoping to replace.

Who/what is "insight"? I didn't follow this sentence.

> So what are you bringing to the table?

If this has been done or thought out before, possibly nothing. But I'd be OK with that, because I'd be excited to see the results. If you have a link or a search term, please pass it on.


What the OP talks about is only syntactic transformation, which leaves semantics unchanged and probably "knows" nothing about the semantics of the code being translated, or is at most aware of a few standard idioms, the same way macro transforms in Lisps are.

Transforming from one language to another is also an old idea; some Scheme systems transform (compile) into C first.

In both cases the transformations themselves must be defined precisely by a programmer in advance. The idea that a program could transform the semantics of another program is still a fantasy, even for Haskell.


I know, and I agree that having one program transform the semantics of another gets into the realm of fantasy.

I don't want to eliminate the role of human developers from the process. Instead, I want to provide tools that the human developer who understands these semantics can use to express the desired transformations, and tools that will help carry out those transformations across a huge codebase. Tools that "magnify" the efforts of the human who has the deep understanding of the before and after semantics.


That's why Steve Yegge said in some article that compiler writers should have an option in their compiler to export the annotated AST. Then the tools would get better quickly.


Many languages can already be programmatically refactored. Pretty much any .NET language, for example, and Java, of course.


Yes, I agree. I've spent many hours working in both C# and Java and using JetBrains products to help me shape the code.

But C# and Java weren't designed from the ground up around the primary goal of being amenable to such programmatic refactoring. So there are limits on what you can programmatically refactor.

But our industry has reached a point where "rearranging" large bases of working code might be a bigger, harder, and more important job than originally developing that code. So perhaps the most important characteristic of a programming language is its "refactorability."

(Update - I think this sums it up: Would you consider working in a clunkier programming language if you knew that it offered 10X more "refactorability" than the other languages you were considering?)

Anyway, thanks for the comment.


> Would you consider working in a clunkier programming language if you knew that it offered 10X more "refactorability" than the other languages you were considering?

I'd have a hard time believing that a more clunky language would be a more refactorable language.


Ha ha, good point. Probably not. But take Lisp's homoiconicity as an example. Some people regard it as less desirable syntax, and yet even a subset of people who feel that way still find that the macro system makes it worth it.

So, would you accept certain trade-offs in your programming language if it greatly magnified the programmatic refactorability of your code?


Well, in my case my projects are never that large - I'm just a designer who does small high-fidelity prototypes and lets people who are supposed to be more capable do the large scale implementations. So "being easy to make quick-and-dirty hack from scratch with" is more important for me.

Also, have you tried Julia? It's homoiconic but looks more like Python. Haven't seen any refactoring tools for it though.


Calm down.

>(I am surely not the first person to have this insight, but it's WAY ahead of being mainstream).

Great. Maybe you can actually bring it into the mainstream. With code.

Imagination is fun, but you're not doing more than imagining.


Oh, the downvotes make me feel good. HN hates its cynic culture, and at the same time, this guy gets a stupid immediate benefit of the doubt for his feelings.

He asked for real criticism, guys. His post is really high on imagination and low on delivery, and if you don't see that you should probably think about how many people have created our programming languages and contemplate doing better.

This person isn't having conversations with them.

Am I rude for requiring him to step up to his talk?

Edit:

>>In other words, I think this might be the killer feature that allows one programming language to outcompete most others.

This sentence alone is flagrant ignorance.


You're right, it is high on imagination and low on delivery. It's a looooong way from being concrete and actionable.

I can't step it up just yet -- all I have right now is the question, which I pose as a thought experiment.

This conversation is more akin to some physicists speculating on the limits of efficiency of solar cells, than some engineers talking about a concrete plan of action.

>>In other words, I think this might be the killer feature that allows one programming language to outcompete most others.

> This sentence alone is flagrant ignorance.

Not really ... it's just awkward wording on my part. I don't really believe this will lead to the One True Language to Rule Them All. I was just trying to say that it might be well worth sacrificing on certain language features in order to maximize refactorability.

So ... I don't agree with you, but I do appreciate you leaving a comment.


Cheers.


Go is already extremely fast to compile, so I'm puzzled that they chose to improve this aspect of the compiler, which nobody really complains about, while ignoring elephants in the room such as generics and a proper exception mechanism.

It's almost as if they chose to tackle the easy problems instead of the hard ones.


I'm not sure why you classify "a proper exception mechanism" as an "elephant in the room". Go will never have exceptions for control flow, and it's one of the best properties of the language.


The language has not seen significant development since Go 1 (deliberately). The libraries and tools are now the focus.

Generics might happen in Go 1 at some point. Exceptions will never happen. Explicit error handling was a design choice.


I used to have long discussions here about generics and other stuff with uriel.

However, I fully support this decision.

Bootstrapping is the best way to develop a language, as it allows the language designers to experience the language in first hand, while making it independent of other tooling.

Plus, it is one less argument for C zealots against Go, in the sense of "my compiler compiles yours".


If they wanted every language feature, they'd use C++


> But I want a whole new programming language that is designed from the GROUND UP to be amenable to programmatic transformation.

Pretty much any modern statically typed language can easily be refactored.

The problem is not the refactoring, it's deciding when to refactor and to what (e.g. renaming). This is much more of a human than a computer problem, and as such, it will probably be intractable for a very long time.


I wonder. We have automated tools that can notice when a method is too complicated, or when two types are frequently used next to each other. There's a cool program that converts Haskell expressions to pointfree style. Would it be too hard to have a system that could suggest when several parameters should be grouped into a container object, or when a common piece of logic can be factored out of several methods? I think it's worth looking at.


And it can come up with good method names?


I'm most excited about Phase 4b. The Go compiler currently leaves a lot of performance on the table with its weak optimization capabilities. Having an intermediate representation and a few optimization passes will recover the majority of that performance.


This is really exciting. I can't wait to port some legacy C code using it. It should make it a lot easier than doing it by hand.


This explains why there is a "goto" keyword in the Go spec. Nobody uses it, except in state machines and reducers for compilers. They probably knew they'd rewrite the compiler in Go and so didn't drop that keyword?


That's not why goto is there. Goto is there because sometimes you want to go somewhere and goto is the best way to get there.


I once had a book, in coffee-table landscape format, that was titled "Linux Core Kernel Commentary Vol 2".

In the annotator's introduction it stated something like "I haven't seen this many goto statements since I was writing BASIC as a teenager!" but then went on to explain why sometimes they're the right tool for the job.

I'll have to order another copy. It was an amazingly varied source of knowledge!

[1] http://www.amazon.com/Linux-Core-Kernel-Commentary-Edition/d...


I wonder why this is not a more widespread notion. Goto is useful in all languages where it is defined. Can it be misused? Sure, just like you can write switches with ifs or use while for all loops (talking of C here, not Go... I'm kind of glad I only write for loops or ranges now.) It's just that goto has a bad reputation, and ifs and whiles don't.


It's because "goto considered harmful" is a catchy meme, and also many older programmers have had understandably bad experiences with code that abuses goto.


I suspect the percentage of programmers who actually had a bad experience with goto is now vanishingly small. I'm not downplaying the harm that goto can cause - but to have experienced it, you would have had to experience old BASIC, FORTRAN 77, or COBOL before learning C or Java. That probably describes 0.1% of people doing programming. Maybe less.

The aversion to goto is a conditioned response, not learned experience.


I learnt programming with a book named "Basic para niños" (Basic for kids) when I was ~7-8, around 23 years ago, and with the programmer's manual for GW-BASIC. So goto was an inherent part of it, but as soon as I learnt other languages goto lost a big part of its significance. I read a lot of code full of gotos (back in my Basic days) but afterwards only the occasional goto would pop up. When I learnt C at my university for numerical stuff, we weren't taught that goto existed (I found out when reading K&R 2-3 years later, when I wanted to get a better understanding of it.) Goto is awesome in a lot of cases (where something akin to break label; is not available), but as you say, the conditioned response is "goto is bad." And it's a pity, because nothing is inherently bad; most often it's just the user at fault.


Where was that? In 1990 in the US and UK and France (possibly other places - that's where I'm aware of), Basic was already on the way out - and that was the "good" basics like BBC Basic (with procedures, local variables, WHILE/WEND, REPEAT/UNTIL, named labels, proper error control with ON ERROR - I think of these, GW Basic only had "on error").

It was already Pascal in academia and "regular" industry, C in low-level stuff, and FORTRAN 77 in science stuff. Of course, there was still legacy Basic and legacy COBOL, and you were much more likely to run into a historical mess of GOTOs. But still, it was already past the Acceptable Spaghetti era.


Am I misunderstanding this comment? Golang has a "goto" like many other languages. There's a goto in my source tree (in a lexer, to handle the newline terminating a line comment); there are also plenty of gotos in the go standard library. "goto" can be useful.
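For the curious, a tiny self-contained sketch of the pattern (hypothetical; not my actual lexer): after swallowing a line comment up to its terminating newline, goto jumps back to the top of the scan:

    package main

    import "fmt"

    type lexer struct {
        src string
        pos int
    }

    // nextToken returns the next whitespace-delimited word, discarding
    // any "//" line comments along the way.
    func (l *lexer) nextToken() string {
    scan:
        // Skip spaces and newlines.
        for l.pos < len(l.src) && (l.src[l.pos] == ' ' || l.src[l.pos] == '\n') {
            l.pos++
        }
        // A "//" comment runs to the end of the line; discard it and rescan.
        if l.pos+1 < len(l.src) && l.src[l.pos] == '/' && l.src[l.pos+1] == '/' {
            for l.pos < len(l.src) && l.src[l.pos] != '\n' {
                l.pos++
            }
            goto scan
        }
        // Collect a plain word token.
        start := l.pos
        for l.pos < len(l.src) && l.src[l.pos] != ' ' && l.src[l.pos] != '\n' {
            l.pos++
        }
        return l.src[start:l.pos]
    }

    func main() {
        l := &lexer{src: "foo // a comment\nbar"}
        fmt.Println(l.nextToken(), l.nextToken()) // foo bar
    }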


I have also used goto a few times in Go. There are occasions where it makes the code more obvious than the alternatives.

A lot of people cargo-cult Dijkstra's hate on goto. It does have its uses as long as you're very selective about when to apply it.


I've used it. It's uncommon, but occasionally useful.



Very interesting. The cat-and-mouse problem mentioned a couple of times - finding a solution that allows development on the old to continue while still working on the new - is very similar to a problem I had in a previous job.

We had an Actionscript 2 application that was under constant development and use on the web, and we needed/wanted to port it to Actionscript 3 to make use of new language features. The plan was for me to write a tool that would attempt to translate between the two automatically; any more complex transformations could then be hand-coded and/or added to the translation engine.

Of course, the project did not succeed during my time at the company, mostly because the scope of the translator was so limited; there were too many concepts in AS2 (the setInterval borrowed from JavaScript, for example) that had no line-for-line equivalent in AS3 (where you had to use the new Timer class to delay/loop execution) and therefore couldn't be simply substituted. Perhaps if the parser had tried to represent higher-order concepts than files and lines of code (e.g. classes, instance data, methods) the project would have got further - but we simply didn't have the resources to spend on it, nor the ability to freeze features on the AS2 codebase.

As it was, the AS3 version of the app never quite reached feature parity with the AS2 version, which raced ahead too quickly for the new features to be added by hand in AS3 (and too many changes had already been made by hand to make re-running the parser viable.)

If Go is indeed simple enough to translate into from (a specific subset of) C, and if Google can afford more than one engineer for a week to write the tool, then this sounds like it's off to a better start than my project was.


> There are many corners of C that have no direct translation into Go; macros, unions, and bit fields are probably highest on the list. Fortunately (but not coincidentally), those features are rarely used, if at all, in the code being translated.

Sounds like they had this idea in the back of their minds from the beginning, and wrote the C code accordingly.


That's an interesting causality. It implies that in 1990 when Ken started writing the compiler he knew it was going to be converted to Go. I doubt it.

The more likely explanation is that the same people wrote the compiler in C and designed Go. The subset of C they found most useful is the one they used in the compiler and the same one whose semantics ended up in Go.


Plan 9's userspace was originally written mostly in Alef, a predecessor of Go, then translated to C. I wonder if the original version of the C toolchain was in Alef, too. Unfortunately I don't think there is an easy way to check.


Not true. Plan 9's user space was written in C. Alef came later and was used for only a few programs. The bulk were still in C, including the compiler toolchain. The Alef compiler was written in C too, never in Alef.

In the second edition Plan 9 release, the last one with Alef, these programs were written in Alef: acme, aux/consolefs, aux/depend, httpd, md5sum, page (document viewer), postscript/tcpostio (driver for lp), and ppp.


I think he was suggesting that you wrote the compiler to use only the parts of C that could be converted to Go, not that C was written to be convertible to Go.


Yes, and my point is that parts of the compiler date back to 1990.


A compiler with no unions is surprising. At least for parsing, I've found unions to be almost indispensable.


Really?! Last time I used unions was probably in MS-DOS.


Alas, that means the "Plan 9" bits in Go are going to vanish. Luckily some of Plan 9's legacies (saner stdio, a simpler socket interface, channels) are embodied in Go itself and are hopefully getting into the mainstream.

9P was not so lucky (it's more of an OS thing than a language thing), but I wish it could take off somewhere.


Noice. Business case aside, that sounds like a fun exercise for those with the chops to tackle it.


Step 1. Write a tool that translates the C code to go.


In C or in Go?


In C, then translate that to Go of course ;)


It’s interesting that it’s not mentioned. I might suspect it’ll be C, as otherwise they’d need to build a new C parser in Go to even get started. Why not use a mature one, and add the Go transformation?

Edit: to be more precise, they’d need to get the C syntax tree in some representation first. Then pass it over to a transformer. The latter could be in Go, but it seems foolhardy to reinvent the former.


By the time I reached this post, the number of translation tiers managed to confuse me. To help others, here's a breakdown.

Notation: T[A,B] : Translates from A to B

The OP wants to develop T[C, Go]. The question is which language should T[C,Go] be written in?

Option #1: If T[C,Go] is written in C, then it can use an existing C parsing library to parse C and then emit Go. Here parsing will be easier, but emitting Go will be laborious to write in C (since C is very low level).

Option #2: If T[C,Go] is written in Go, then the parent claims that C parsing will have to be freshly developed in Go. This will be a pain but the Go emitting code will be easier to write in Go.

However, I think the problem with option #2 (parsing) can be avoided if one uses the parsing library that is assumed to be already available in C. Since Go can interface with C code[1], the parser need not be reinvented.

[1] : http://golang.org/cmd/cgo/
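As a minimal cgo sketch of that idea: the C function below is a made-up stand-in for "an existing C parsing library", but the calling mechanics (the C preamble, C.CString, C.free) are the real cgo machinery:

    package main

    /*
    #include <stdlib.h>

    // Made-up stand-in for an existing C parsing library.
    static int c_token_count(const char *src) {
        int n = 0, in_tok = 0;
        for (; *src; src++) {
            if (*src == ' ' || *src == '\n') {
                in_tok = 0;
            } else if (!in_tok) {
                in_tok = 1;
                n++;
            }
        }
        return n;
    }
    */
    import "C"

    import (
        "fmt"
        "unsafe"
    )

    func main() {
        src := C.CString("int main(void) { return 0; }")
        defer C.free(unsafe.Pointer(src))
        fmt.Println("tokens:", int(C.c_token_count(src)))
    }

So option #2 doesn't necessarily require rewriting a C parser in Go; the Go side can drive an existing C front end, and only the Go-emitting half has to be new.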


Even if written in C, generating source code is trivial compared to parsing. The translator doesn't have to output the prettiest code either, since there's gofmt/fix.


Clearly the sane route would be to write T in C, then once complete run T through T to emit a C-to-Go translator in Go.


The parser is the only part I've written so far. It took me an evening (Go has yacc).


Perl.


Presumably in go, though that's not explicitly stated.


In Go.


The mention of porting the rest of the plan 9 system to Go is the most interesting part of this document. Is this just academic, or does Google actually want a complete new system (kernel and all) implemented in Go?


They're not talking about the Plan 9 OS. Go ships with a forked copy of several Plan 9 libs (lib9, libbio, libmach) and the Plan 9 C compiler. The plan is to obsolete the use of all of that code.


In this context, Russ is referencing the Plan 9 libraries and compiler that are currently part of the Go source distribution.


I don't think the tool to automagically convert C code to Go will work. At the very least it will take much longer than expected to get it right. And when it's finally ready, the resulting Go code will be very slow, so they won't be able to use it without serious manual modification. Meanwhile, the C compiler will continue to add new features, and each one will have to be converted to Go and then refactored for performance. It won't work.


There's nothing magical about a compiler. If the Go team decides to build a C compiler using Go that targets Go instead of machine code, they will.

Worst-case scenario, the resulting code will be a GOTO-littered state machine with 'import "unsafe"' in every source file.

Though I like to think the Go team would fare better than that.


They aren't writing a general tool from C to Go. They are writing a specific tool to convert the compiler from C to Go. They will then make major modifications, but it's much easier to do that from working Go code than from scratch.


Why not start from scratch in Go? C and Go are different programming languages with different standards etc. Why rely on shoehorning the C version into Go?


There is a large communication overhead to this. You have to start a new project, which could take months to develop. In the meantime you still need to maintain the current compilers, fix bugs, and make improvements, but all that effort becomes wasted. People stop working on the old code base because the new code base is more fun, so it suffers, and eventually you end up with a not-yet-working new project and a broken, buggy old project.

With a translator, work doesn't have to stop on the current compiler code base while the new version is developed; all that effort is saved, and bugs are automatically fixed in both places, so you don't have to maintain and synchronise two bug trackers.


From the document itself: "Despite Go’s simplicity, there are many subtle cases in the optimizations and other rewrites performed by the compilers, and it would be foolish to throw away the 10 or so man-years of effort that have gone into them."


This strikes me as dubious. Rewriting from scratch would throw away the C code, but it would not have to throw away the lessons learned from and recorded in the C code.


Yes, this is indeed one of the questions that the article addresses in the "Alternatives" section.


That is not an article, it's a design document.


Fascinating and fun. I really hope perf doesn't regress much from this: the zippy edit-compile-run loop is part of why Go's fun to use.


On the contrary, these changes will allow us to more easily parallelise parts of the compiler which should yield compilation time improvements.


"The Go compilers are also significantly larger than anything we’ve converted: over 60,000 lines of C."

Couldn't the compiler translation just be crowd sourced by a small group of people? It wouldn't be the best result but if people just stuck to the original C design, it would work as a first implementation.


I'm inclined to agree. It seems that it would be tough to be confident in the correctness of the generated code, let alone the performance. In the end, getting a team on it would seem to be the most likely to produce correct, readable, idiomatic code. But what do I know.


Not continuously. The compiler will continue to be improved, and with a working translator developers never need to worry about the bootstrapping problem while developing the Go compiler in Go.


The D community chose the same route. Currently [0] it seems to work in Win32 already.

Meanwhile, development on the C++ implementation continued. This is probably the biggest advantage, as the porting does not slow down language evolution.

[0] http://forum.dlang.org/post/l8uccc$1qkr$1@digitalmars.com



This is great news. I have always maintained that bootstrapping is the proper way to write compilers for general-purpose languages.

Looking forward to the day Go's runtime is free from C code.


Go will never be "free from C code"; it will always use C code, as the bootstrapping compiler will be C. All they're doing is making the new compiler in Go and using the old C one to compile the new Go one.

And I'm not sure Go qualifies as a "general purpose language".


> And I'm not sure Go qualifies as a "general purpose language".

I feel somewhat foolish for asking, but how do you define a "general purpose language"? I've seen Go used for building internet servers, command-line tools, games, scientific computing, mass data processing, and more. Sure, there are some things Go is not well-suited to, but I think it does enough (and well) to qualify as "general purpose".


I guess you haven't read the document properly.

They plan to keep around the C based compiler for the time being until the whole process is finished.

Afterwards there is the option to have a backend that generates C code as a target, although other approaches can be taken as well.

They just need to replace that backend with another one that cross-compiles.

C is nothing special, I already wrote a few compilers without a single line of C.


Is translating seriously the best option? Are the translators accurate enough?


The translator written for this specific task will be accurate enough to perform the task. The compiler already has a ton of test cases, and we will likely add more in the process.


Anyone have a non-Google Docs link? It isn't loading on my end.



Thanks!


Now if only they'd do something about that garbage collector.


How would this affect the Plan9 and Dragonfly ports?


So, can Go channels be implemented using channels?


This is the best argument to go with D.

The D compiler is waaaaay better, written by a compiler expert.



