Removing support for Emacs unexec from Glibc (lwn.net)
106 points by jordigh on Jan 30, 2016 | 51 comments



For the curious, the need to change the malloc behavior is not hypothetical:

https://sourceware.org/bugzilla/show_bug.cgi?id=6527

Basically, for many years now glibc has been knowingly doing the wrong thing just to keep malloc_set_state/malloc_get_state working, an interface that, as far as anyone knows, is used only by emacs. Keeping this interface alive for the sole benefit of an emacs optimization, when it is blocking fixes for fairly serious bugs, needs a pretty strong justification.

In the end this is largely a non-issue: an Emacs developer noted that their configure script probes for the special glibc malloc API, and if it doesn't exist, emacs uses its own malloc implementation. The conclusion is that once the glibc devs get their changes up, everyone can meet again to make sure existing emacs binaries still run and that the API detection in the emacs configure script works correctly.

So props to Paul Eggert for being the only sane man in the room and pointing out that the drama was all for nothing.


Actually, having a generic way of saving program state into an executable that then runs from that point could be really useful, and some Unix versions had it. (Perl used a Unix utility called undump to implement a "Perl compiler": it dumped core in a form that could later be "undumped" back into a running program; that's the same idea as "unexecing" program state into an executable.)


It's notable that the Solaris unexec() implementation[1] is basically a thin wrapper around dldump()[2].

1: http://git.savannah.gnu.org/cgit/emacs.git/tree/src/unexsol....

2: https://docs.oracle.com/cd/E19455-01/806-0627/6j9vhfmop/inde...




I logged in to say the same thing. More info here: http://perldoc.perl.org/functions/dump.html


Approaches similar to unexec() are quite common in lisp-like environments, so emacs doesn't come as a "surprise" here.

There's no reason emacs needs to use glibc's malloc implementation though, especially given that emacs has its own built-in (and probably neglected) malloc for other platforms.


Yep. Emacs builds and runs (and presumably unexecs) on many non-glibc platforms. This is more likely an optimization for it on glibc.

"According to Paul Eggert, making unexec more portable has been on the to-do list for a while, "and this will light more of a fire under it". Concerns that Emacs might not build using a new Glibc API (which has not even been written yet) that came up earlier in the thread are not a problem, he said. "Emacs should still build and run even if the glibc API is changed, as Emacs ./configure probes for the glibc malloc-related API and falls back on its own malloc implementation otherwise.""

And there's the dreaded autoconf doing its job.


Well, no, it does not unexec on other platforms. And the figures in the article suggest it is annoyingly slow to start on other platforms. This is all about making the GNU editor work well on the GNU OS. It builds and runs elsewhere, but startup is slow.


I think it's wrong for any application to be this intimate with its host platform. In my opinion, Emacs should treat libc as a black box on GNU/Linux, just as it presumably does on Windows, OS X, and other proprietary platforms. But I admit that my opinion is shaped by being a user and developer of proprietary software, where binary compatibility is important. So I'm not surprised that RMS apparently disagrees.


The host platform has no reason to exist except to run programs. And then functionality that turns out to be widely useful goes into the standards.

So, it turns out that unexec has not made it into the standards, and now is causing more problems than it’s solving, and the glibc maintainers are going to remove it.

This would not be an interesting story, except that it involves Emacs, Stallman, and the appropriate use of mailing lists. Still, the conclusion of the article is that the process works and everything’s just going fine.


It may not be an interesting story from that perspective, but from the perspective of an emacs user, I had no idea that emacs did this. Reading about how it works has been super interesting (with a healthy dose of wat thrown in). I'm glad it was posted.


It's worth compiling emacs and understanding it all. It is interesting.


So let me see if I get this. The build process runs a huge autoconf etc., then compiles from C code a proto-emacs, which is then executed and loads up a bunch of Lisp code, then a (self-described as) ugly hack is used to dump that running program (including the loaded Lispy-bits) as a new executable, which is your editor? Wow.


Binary compatibility is also important to glibc, which is why they go to the trouble of carrying stubs that implement the older ABI for any functions that change ABI. Binaries linked against older library versions will continue to work with newer versions, through the magic of symbol versioning.

This method will presumably be used to carry the old malloc implementation around for older emacs binaries that rely on it (whereas newly linked binaries will see the new API and ABI).


You aren't wrong given the current development landscape but Emacs is a very old piece of software and this sort of thing wasn't unheard of when it first was given life. (I wasn't really around but anecdotal stories from the time period seem to support this.)

This is just an example of technical debt handled well actually. The loan came due and now the emacs devs are doing the right thing and paying it off.


Console video games are very intimate with their host platform, to the point where a six-year-old GPU/CPU combo produces better results than lots of newer PCs.

I'm not advocating that this should be the norm, simply noting that there are cases where you need to be.

Otherwise you can't achieve smooth 60 or 30fps, without tearing, popping (audio, streaming, models), etc.


That level of smoothness absolutely should be the norm. And not just 60fps, but ideally 120fps/144fps/higher (e.g. new DisplayPort 1.4 240Hz monitors when they are released). Every dropped frame, every timing inconsistency, is a small attack on your concentration. When a tool acts quickly and predictably you can treat it as a part of your own body (see https://en.wikipedia.org/wiki/Body_schema ). When it responds slowly or inconsistently, this requires conscious effort that could be better spent on solving the problem. For example, every libvte-based terminal emulator is capped at approximately 40fps. 40fps isn't even an integer multiple/divisor of any common refresh rate! It annoys me every time I see it, and it's symptomatic of the casual attitude to UI responsiveness that has modern software running worse than some ancient MS-DOS software did on a 386.


I think it's common for applications that are developed by the same organization as the OS. Windows also has private APIs for some Microsoft applications.


> I first heard about this from one of the developers of the hit game SimCity, who told me that there was a critical bug in his application: it used memory right after freeing it, a major no-no that happened to work OK on DOS but would not work under Windows where memory that is freed is likely to be snatched up by another running application right away. The testers on the Windows team were going through various popular applications, testing them to make sure they worked OK, but SimCity kept crashing. They reported this to the Windows developers, who disassembled SimCity, stepped through it in a debugger, found the bug, and added special code that checked if SimCity was running, and if it did, ran the memory allocator in a special mode in which you could still use memory after freeing it.

http://www.joelonsoftware.com/articles/APIWar.html


Not sure about how it was in the pre-DOJ days, but in recent Microsoft history there is a company policy not to ship non-OS binaries that use private Windows APIs.


IMHO, the XEmacs approach of using a portable dumper is the better option. I prefer that one not only because it frees us from having to worry about the details of executable formats, but because the same machinery that lets us find and relocate all the intra-lisp-heap pointers would also let us write a compacting GC, which is important for long-running programs like Emacs.


> [..] fixing Emacs, which is seen as the only user of the interfaces:

If I remember correctly TeX uses the same trick. I don't know if it depends on unexec() though.


From what I remember TeX dumps the processed macros into a binary file. Later, TeX loads those preprocessed formats quickly, without the need to parse again something as big as LaTeX.


I wonder if there is any project starting to rewrite Emacs according to modern practices, like we now have NeoVim for Vim.

I really like editing in Vim, but I moved to Emacs using Spacemacs to try to get the best of both worlds (org-mode for scientific research seems a really cool approach).

Still, I see Emacs as being quite a bit more bug-prone than Vim, and Vim more so than NeoVim.

Perhaps this is what's needed for someone to start a complete refactor of Emacs and bring it (I mean the code, not the editor) into the modern age.


I don't really see that as being the case. Emacs development is very active, even at the low levels. Guile now has an elisp implementation, and there's a good chance that emacs will eventually be moved to it.

If you encounter bugs you should submit bug reports.


> The Guile scheme implementation now has an elisp implementation and there's a good chance that emacs will be moved to it eventually.

FWIW, I think that's entirely the wrong direction to go. Emacs should be ported to Common Lisp, with an elisp compatibility layer.

Scheme's a neat didactic tool, but anyone who wants to produce production-level software in Lisp should write it in Common Lisp. Heck, even Schemers recognise that, which is why the RnRS controversy exists.


My own view is that Scheme is exactly the right way to go and CL would have been the wrong way to go.

Modern Schemes are fully as production-ready as CL. The view of Scheme as merely an educational tool is outdated, and usually based on limited experience with ancient, bare-bones Scheme implementations such as MIT Scheme.

Modern Schemes like Chicken have fairly extensive collections of practical libraries and features that barebones Scheme implementations lack. To add to that, Scheme is far more elegant than CL, and doesn't contain all of the ancient crud of CL, so is far more pleasant and easy to program in.

I'm not thrilled that Guile (rather than Chicken) is the Scheme of choice for Emacs, but it's a far better choice than CL.

That said, even CL would have been an enormous improvement over elisp. So the sooner the migration from elisp starts (whether to Scheme or CL), the better.


> Modern Schemes are fully as production-ready as CL.

Modern Scheme still doesn't have hash tables (cf. R7RS[1]; Common Lisp, of course, has them). That, right there, prevents it from being a production-ready language. I could omit the rest of this post and I'd be right.

Scheme does have continuable exceptions, but it's still a far cry from Lisp's conditions and restarts.

Scheme's type system is extraordinarily lightweight. There's no way for the user to define new types, nor even a lightweight way to query for an object's type (unless I've missed something, one must use the various type predicates one-by-one).

Relatedly, there's no way to declare variable or function types; no way to pass that information on to an optimising compiler. There are no compiler macros. Indeed, compilation in general is woefully underspecified.

There's no object-orientation: no classes, no generic functions, none of that. One has to roll one's own if one wishes to.

Although it does have cond-expand, unlike Lisp, Scheme doesn't specify READ-SUPPRESS, which works in conjunction with #+ and #- to skip variant syntax supported by other implementations.

This raises the issue of the reader in general. Scheme's reader is not extensible; it lacks reader macros. It provides no access to the current readtable, or any way to manipulate it.

Scheme doesn't even have a general (i.e., unhygienic) macro facility! Its hygienic macros are sufficient for many use cases, but not all — e.g. anaphoric macros.

Its iteration construct (yes, singular!) is severely limited. There's no general facility like LOOP.

It lacks settable places (this is the capability in Lisp of writing `(setf (getf plist :foo) 'baz)`).

Scheme does finally have dynamic variables, although they are more unwieldy to use than Lisp's specials.

I do like its well-specified numeric tower.

> Modern Schemes like Chicken have fairly extensive collections of practical libraries and features that barebones Scheme implementations lack.

But they require that for even very basic functionality (like hashtables!). One of Common Lisp's downfalls is having to use implementation-specific functionality; Scheme is worse.

A related problem with Lisp is that Gray streams aren't part of the standard. But Scheme's ports are even less-specified than Lisp's streams.

> To add to that, Scheme is far more elegant than CL, and doesn't contain all of the ancient crud of CL, so is far more pleasant and easy to program in.

You know what I find pleasant? A language which anticipates my needs and my problems, and has already solved them. Time and time again I find that Common Lisp has done exactly that.

I'll certainly admit that there are parts of Lisp I'd change (the default upcasing is hideous; some of the function names are ugly; the varying argument order between similar functions is beyond lame). But I'd never want to use a language which treats NIL as true!

Scheme's not really a toy: it's clay which can be used by thinkers as well as students to play with problems. But it's not suitable for writing portable, high-performance, industrial-strength, real-world programs. Common Lisp is.

[1] http://trac.sacrideo.us/wg/wiki/R7RSHomePage


> LWN subscriber-only content

kinda shitty


If you try scrolling down, you'll note you can actually read the article. LWN lets subscribers share (temporary) paywalled articles with anyone they like, even on public forums like HN. They are nice like that.


Oh, I'm aware I can read it. I'm just not convinced about sharing it on such a broad public forum.


https://lwn.net/op/FAQ.lwn#slinks

> Where is it appropriate to post a subscriber link?

> Almost anywhere. Private mail, messages to project mailing lists, and blog entries are all appropriate. As long as people do not use subscriber links as a way to defeat our attempts to gain subscribers, we are happy to see them shared.



Sorry, I misunderstood what you meant.

Links to LWN subscriber content appear on HN from time to time - not too often, and I would guess they do more good than harm to them.


Often such links are posted by LWN staff, so I believe they favor the practice.


The following subscription-only content has been made available to you by an LWN subscriber. Thousands of subscribers depend on LWN for the best news from the Linux and free software communities. If you enjoy this article, please consider accepting the trial offer on the right. Thank you for visiting LWN.net!


So much for rms standing up for principle. When the chips are down, he resorts to a private mailing list to discuss and influence matters that impact multiple users. I thought he was into transparency?


I found it interesting that RMS wanted to discuss the issue in private, given his position as an almost religious defender of free software. Isn't part of free software that it is developed in the open?


rms doesn't think "free software" has anything to do with a responsibility for collaboration or openness. Those are slogans of the open source movement. He sees software as an ethical exchange between two parties. rms doesn't have a problem with privacy, secrecy, or refusal to collaborate. As long as Bob has the freedom to do whatever he wants with the software that Alice gave him (except, perhaps, restrict the freedom of others), rms doesn't think that Alice has any further moral obligations.


Not necessarily. Read Eric Raymond's description of cathedral-style development in "The Cathedral and the Bazaar" and keep in mind that he had Emacs in mind. It wasn't until Linux that an open, public development process became closely associated with free software.


Summary: emacs devs want glibc to maintain backwards compat in their heap layout so building emacs is 5 seconds faster.


No, starting emacs is 5 seconds faster - at least that's what I got from it.


Agree -- but it doesn't strike me as a big deal. For me, emacs is something I start and will often leave running for months. A 5-second startup time is irrelevant. That's no slower than starting Word or Excel, and even browsers take a couple of seconds to start from cold.


Not everybody does, and not everybody who does that does it on all systems.

A 5 second startup time for a text editor is extremely annoying when all you want to do is, say, type some small commit message.


Agreed. Even as a regular emacs user, if I just want to quickly edit a config file on a remote box I just open nano or vim, because they open nearly instantly.


For local quick edits of things like commit messages, you can run emacs as a daemon and use emacsclient as your editor command.

For remote files there's tramp.

But, I do agree that any sysadmin should know enough basic vi[m] to get around, because vi is almost always available on any unix system. Emacs (or nano) might not be installed.


I agree that a five-second startup can hardly be said to break usability, especially for something that many people will leave running for a long time.

That said, though, if I was accustomed to one of my primary tools taking a half-second to start, I think I'd miss it pretty bad if it went away.

Just the fact that people on both sides (glibc and emacs) are treating this so seriously seems to indicate a rather admirable dedication to keeping the quality of the software as high as possible.


Personally, I use Emacs as a Notepad as well as a coding environment.

Also, if I'm screwing with my .emacs it's nice to have instant feedback beyond eval'ing any changes I make as I go.


IIUC, the 5-second difference is in starting Emacs, not building it.


Whoooops, it is startup. I should've been tipped off by the whole thing only taking .5 seconds with unexec. Mea culpa :).



