System XVI: A replacement for systemd (github.com/servicemanager)
191 points by vmorgulis on Sept 13, 2015 | 142 comments



Oh boy.

Alright, I'm an acquaintance of the author, so let me clear things up.

SystemXVI only began less than a couple of weeks ago, and is still a very early stage project. However, the author (David Mackay) decided to be a bit of an expedient troll and advertise on /g/, to arouse some controversy. It ended up blowing up greater than anticipated, even ending up being mocked on Linux Unplugged and hitting /r/linux as the project was still a skeleton.

First things first:

This is not a joke project. Mackay is not some amateur, either. He's a former Solaris/illumos committer and most recently was involved in the Tox community. He actually wrote a small service manager called Charge last year as an experiment, and some of its logic has been carried over into SystemXVI (though the architecture is different).

SystemXVI's design is quite interesting. The closest equivalent would probably be Solaris SMF, but it's still different. init(8) is kept small and instead SystemXVI is based on the idea of the service repository (inherited from SMF) and delegated restarters. The repository is a hash table that stores instance information and service properties. It's useful for having a point of fault recovery and to introspect services in a dynamic, transient manner as they run.

The primitive process management is currently in libs16rr, but the actual supervision strategy is defined in restarters. Much like Solaris SMF, there is a master restarter, but certain services (e.g. local socket services) can opt to be watched by a delegated inetd-like restarter that uses the SystemXVI RPC interfaces. Or, say you have a Java, Erlang or other service that needs special behavior: you can write a restarter for it while reusing SystemXVI's process management.

TI-RPC/SunRPC plays a big role here. It's much leaner than D-Bus and at least nominally capable of network transparency. Android has a portable implementation in ~5000 LoC.

Dependencies and ordering are actually calculated by a separate process called svcorder, influenced by BSD systems' rcorder(8).
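
To give a rough idea of what an rcorder(8)-style ordering pass computes, here is a minimal sketch of a depth-first topological sort over a small dependency matrix. It is not taken from svcorder: the names `order_services`, `visit` and `MAX_SVCS`, and the flat matrix layout, are assumptions made up for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SVCS 16 /* hypothetical upper bound, for illustration only */

enum visit_state { UNVISITED = 0, IN_PROGRESS, DONE };

/* Depth-first visit: emit every dependency of `svc` before `svc` itself.
 * deps is a flat n*n matrix; deps[a * n + b] means "a requires b".
 * Returns false if a dependency cycle is found. */
static bool visit(size_t svc, size_t n, const bool *deps,
                  enum visit_state *state, size_t *out, size_t *out_len)
{
    if (state[svc] == DONE)
        return true;
    if (state[svc] == IN_PROGRESS)
        return false; /* we came back to a service still being visited */

    state[svc] = IN_PROGRESS;
    for (size_t dep = 0; dep < n; dep++) {
        if (deps[svc * n + dep] && !visit(dep, n, deps, state, out, out_len))
            return false;
    }
    state[svc] = DONE;
    out[(*out_len)++] = svc;
    return true;
}

/* On success, fills `out` with n service indices in a valid start order. */
static bool order_services(size_t n, const bool *deps, size_t *out)
{
    enum visit_state state[MAX_SVCS] = { UNVISITED };
    size_t out_len = 0;

    assert(n <= MAX_SVCS);
    for (size_t svc = 0; svc < n; svc++) {
        if (!visit(svc, n, deps, state, out, &out_len))
            return false;
    }
    return true;
}
```

With three services where 0 requires 1 and 1 requires 2, the sketch yields the start order 2, 1, 0; adding "2 requires 0" makes it report a cycle instead.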

There will be systemd unit file compatibility. Internally, SystemXVI has semantics similar to systemd units, but much simplified - no need for transactions or complicated job modes, since a repository is used instead to contain state.

More here: https://github.com/ServiceManager/ServiceManager/blob/master... (Flowchart: https://github.com/ServiceManager/ServiceManager/blob/master...)

It's a refined SMF-like system with influences from daemontools and BSD rc.d. Very microkernel-ish, it's meant to have strong module/communication boundaries and have its components be easily hotswapped or rewritten per a stable RPC interface.

It's very early and the project really shouldn't be getting such publicity, but a pre-alpha might be out soon, and it's a project well worth watching. I have confidence it will turn out well.


I think it's great to see someone writing a new manager that is inspired by SMF.

I would love to see a (series of?) blog post(s) about SMF. This wouldn't be a missive against systemd, but given that SMF has seen 10 years of production use, it would be a valuable history lesson. Why was SMF needed for Solaris, what was its scope of responsibility, why were the abstractions picked, why were its boundaries where they were, and finally what has and hasn't worked for it?

It seems like the technical community would eat that kind of content up.


You can start by reading the 2005 Usenix paper which introduced SMF: https://www.usenix.org/legacy/event/lisa05/tech/full_papers/...

I'm fairly certain Oracle blogs about news regarding SMF with relative frequency.


Indeed, there are a lot of good resources that came out when it was introduced. But that's not quite the same as someone being able to put the story into the context of today's technology and status quo. Ideally the history lesson comes from someone with knowledge of SMF from many angles.

For instance Bryan Cantrill (full disclosure I worked for/with Bryan while at Joyent) was at Sun for the development of SMF, he's had experience using SMF in production, and more recently from his LX Brand work he's been exposed to the facilities that Linux has created to deal with the same sorts of issues.

Bryan is a superior technologist and a hell of a storyteller, I'm sure I would enjoy reading/hearing his telling of why things like libproc and contracts were created and how they evolved over time, and if that has any relationship to why SMF is designed and behaves the way it does.

Note, Bryan is just an example -- there are many other people in the community that were there and could tell the stories. He's just the first person I thought of.

The point is that [re]creating interfaces/subsystems is ok, so long as we do that with the full view of history, such that we avoid that repeating cliche.

(I prefer run on sentences)


> even ending up being mocked on Linux Unplugged and hitting /r/linux as the project was still a skeleton.

To be entirely fair, I think there are few communities that can claim the honour of being worse and more amateurish than /r/Linux. It's incredible how many Linux experts who have used nothing but Arch Linux for about a year can gather in one single place. Every once in a while there's an OK post, but it's downvoted.

I hope your friend didn't take them too seriously.


/r/linux is schizophrenic. On the one hand, posts about new game releases for Steam are voted up and (Steam) gaming posts are quite active. On the other hand, they're badgering every single FOSS project which is trying to make money without begging their users for it (see the recent posts about Firefox). It's crazy.


Demonstrating the influx of web/devops, and how they are generally more interested in social media.


Someone had to say it :-).


Frankly i feel like a salty old git, even though i am of the same age as, say, Poettering.


What Linux communities would you recommend instead?


I have no idea what to recommend anymore. Most of my colleagues, myself included, have either moved from Linux to BSD or OS X land (depending on preferences) or just stopped being too involved. Helping people learn about software in general (and sometimes, but not necessarily, Linux in particular) used to be fun and rewarding. Nowadays, about 70% of the answers are either "Yeah, that's a bug they don't want to fix because they don't think it's a bug" or "I know that's not how you do it in Finder but that doesn't make it broken."

The Linux ecosystem is increasingly centralized and most of the organizations that are centralizing it, like the Gnome project, are actively discouraging contributions that don't match their agenda. This is reducing the importance -- and cohesion -- of any community. It's no coincidence that many people are finding less common ground in software they're all using than in software they're all hating.

The only Linux community I'm still vaguely in touch with (I read the mailing lists and sometimes the forums) is the Gentoo community. Largely because my only remaining Linux machine is running Gentoo. It eats a lot of my time, but stubborn USE flags control seems to be the only way to get a usable Linux distribution lately, and I'd rather spend an extra hour a week configuring it carefully enough to keep crap software out than spend a whole evening each month dealing with sudden breakage. It's not necessarily a friendly community, but most people seem to know what they're talking about.

Frankly, as bitter as it may sound (and it's coming from someone who started using Linux just in time for the 2.0 to 2.2 switch), I'd rather recommend the OpenBSD community.


What about the ArchLinux community (IRC/forums/wiki etc)?

Arch has both made it easier for new Linux users to adopt it day-to-day while still appeasing the more advanced users, ala Gentoo. The Arch wiki has been the single best contribution to the Linux desktop world in recent memory IMO.

I'm curious how the more hardcore *nix users view it.

> I'd rather recommend the OpenBSD community.

My attempt to use OpenBSD and take part in the community left me disappointed. On the surface it seems to be exactly what I'm looking for (correctness + security). But I was turned off by the culty vibe of the users and it seemed like a backwaters in terms of software versions, as everything outside of the core seemed severely outdated.


I have... mixed feelings about Arch :-). It certainly doesn't appease me in any way; I don't use Gentoo because I like speed or especially DIY. I've done a lot of DIY crap on my Linux machines in the 90s and 00s. I don't enjoy doing it again. The only reason I use Gentoo is that, via the thickest web of USE flags I've ever woven, I can keep freedesktop.org's EverythingKit, most of Gnome's crap, pulseaudio and systemd out of my system. I can't really do that on Arch.

I think Arch is good in terms of "learning experience" for someone who wants to learn more about how a distribution is put together, but that's about it. I used it for a while, back in 2005 or so, but our later contacts have been... less than fortunate. I tried it again two or three years ago and my system regularly broke after every update. I eventually ragequit when I ended up with a system that wouldn't boot because, in The Great Move of Everything to /bin, it somehow managed to screw up the bootloader. I figured my data was going to be next, wondered what the fuck they were thinking even in the context of a rolling release, shrugged and wiped it out.

The Arch Wiki is a great repository of information though. It's even better than the Gentoo wiki was, back before it got (disastrously) wiped out.

> I'm curious how the more hardcore *nix users view it.

They probably think lowly of the whole Linux thing, but what do they know :-)

> But I was turned off by the culty vibe of the users and it seemed like a backwaters in terms of software versions, as everything outside of the core seemed severely outdated.

I found the culty vibe of the OpenBSD community to be far less obnoxious than that around... pretty much any major Linux distribution.

There are quite a few outdated packages in ports though, yes. That's often because of a lack of maintainers, but there are programs where that's just because of compatibility problems. There are very few broken packages, if any, in the ports tree -- and sometimes the broken status is very paranoidly assigned (i.e. I've had packages that were marked as broken but it turned out they ran just fine).


> I tried it again two or three years ago and my system regularly broke after every update.

Having used Arch for years and experienced the same thing years ago I've found that stability has greatly improved in the last year or two. I believe they were going through some major transitions back then and now have found a good stable foundation which they are now building on.

I haven't had it break in a very long time and having brought this up before on HN I've had many other people share a similar experience with Arch.

Although this is possibly due to the fact I've become significantly more experienced with Linux as a result of my professional work. Dedicating yourself to a single distro and becoming deeply familiar with it can be rewarding in that sense.


>TI-RPC/SunRPC plays a big role here. It's much leaner than D-Bus and at least nominally capable of network transparency

Isn't that true of D-Bus as well? I mean, maybe it's just semantics, but if you're going to use the word "nominally" (read as "it claims to be") when saying it's network transparent, D-Bus is certainly "nominally" network-transparent as that's an advertised feature of it.


Network transparency is most certainly not an advertised feature of D-Bus. There has been some experimentation, but nothing concrete has ever come out of it. Nor does anyone really use it as a remote protocol.

In contrast, SunRPC is RPC, so network transparency is a given. The reason I said "nominally" is because in the case of SystemXVI, having proper distributed service management (not a goal at this point) would require more than that.


> having proper distributed service management

Which is more or less what CoreOS's `fleet` already does on top of systemd. They used a different protocol for remote communications (etcd, ie. HTTP+JSON) vs. local, which I guess can be a sensible choice as requirements are quite different.


Yes, and they had to create an ad-hoc layer over systemd using a particular K-V store. Contrast that with the more generic and seamless option of RPC interfaces, which are more versatile in making policy decisions.



Note the complete lack of security.

(A shared secret sent in plain text is not security).


I guess it's meant to go over TLS, just like HTTP basic auth.


I've never liked SMF's XML format, but it is by far one of my favourite process managers from my Solaris days (back when Sun still existed, since then I haven't touched it :-().


Thanks for setting the record straight. I am glad you stepped up and explained. I always come to HN for expert insider comments, and people like you do not disappoint.


What is the significance of the name "System XVI" (S16)?


It's System V taken up 11.


Hear, hear, thanks for the clarification and info. Best of luck to David and tell him not to mind Muh SystemD Philosophy trolls ;)


Props to the author for posting the "testimonials." Got a chuckle out of it. Did not look at the code.


At a glance, the code is clean and idiomatic C written by an experienced professional programmer. Comments are sparse and describe intent. Variable and function names are consistent and well chosen.

I'd be happy giving it to a new programmer to use as a style guide. I have no idea if the architecture is suitable for the purpose, but anyone who is dismissing this based on code quality should be ignored.

Representative Sample: https://github.com/ServiceManager/ServiceManager/blob/master...


1. naming conventions are unusual and inconsistent (List_destroy but then there's destroy_svcs_list)

2. the return value of malloc() et al. isn't checked (http://www.etalabs.net/overcommit.html)

3. the register storage-class specifier is used multiple times (mainly lib/s16db/translate.c)

4. function declarations are missing a prototype (() instead of (void))


I'm not claiming the code is perfect. I'm claiming that at a glance it looks like professional code of a quality similar to that of other high quality C codebases: Linux, Python, Perl, Apache, sqlite, etc. One can always do better.

I (and likely the author) disagree with you on the importance of checking the return value of malloc(). Because of overcommit (as you point out), a non-null value returned from malloc() does not mean that you will not crash when you access the pointer. If the pointer is used directly, crashing on NULL might be a reasonable approach. It's when NULL is retained and then used as the base of an array that it may become a security problem. Configuring malloc() to abort() on failure would probably be my preferred solution.
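
For what it's worth, the "abort on failure" policy is usually done with a tiny wrapper rather than by configuring malloc() itself. A minimal sketch follows; `xmalloc` and `xstrdup` are conventional names used here for illustration, not functions from the SystemXVI source.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Allocate `size` bytes or terminate the process.
 * Callers never see NULL, so no per-call check is needed. */
static void *xmalloc(size_t size)
{
    /* malloc(0) may legally return NULL, so always ask for at least 1 byte */
    void *p = malloc(size ? size : 1);
    if (p == NULL) {
        fprintf(stderr, "fatal: out of memory allocating %zu bytes\n", size);
        abort();
    }
    return p;
}

/* Duplicate a string under the same abort-on-failure policy. */
static char *xstrdup(const char *s)
{
    size_t len;
    char *copy;

    assert(s != NULL); /* passing NULL is a caller bug, not an OOM condition */
    len = strlen(s) + 1;
    copy = xmalloc(len);
    memcpy(copy, s, len);
    return copy;
}
```

This keeps the failure behaviour deterministic (a clean abort with a message) on systems without overcommit, while leaving call sites as short as unchecked malloc() calls.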

I agree with the last point, but think it's a minor one. While I'd like C to treat () in a function definition as equivalent to (void), for historical reasons it does not. The author trades off the visual noise of the word 'void' for better error reporting. But depending on your compiler, you may still get a warning. On the computer I just tried, 'icc' and 'clang' gave clear warnings but 'gcc' did not.

  nate@ubuntu:~/C$ cat void.c
  #include <stdio.h>
  int empty() {  return 3; }
  int main(void) { printf("%d\n", empty(4)); return 0; }

  nate@ubuntu:~/C$ icc -Wall -o void void.c
  void.c(3): warning #140: too many arguments in function call
    int main(void) { printf("%d\n", empty(4)); return 0; }
                                          ^
  nate@ubuntu:~/C$ clang -Wall -o void void.c
  void.c:3:40: warning: too many arguments in call to 'empty'
  int main(void) { printf("%d\n", empty(4)); return 0; }
                                  ~~~~~  ^


Not all systems have overcommit enabled. Notably, as the article linked above points out, robust systems do not have overcommit enabled. The OOM killer is heuristics based and cannot be relied on in robust systems. This means that libraries and applications that want to be viable on such platforms have to recognize that malloc may return NULL and respond accordingly.


Very true for low-level system libraries, but applications should really not waste energy on trying to be OOM-safe: it's hard to understand which actions are safe when you no longer can allocate memory (releasing resources often requires allocating memory temporarily), and it just adds complexity which basically will never get tested, thus getting broken quickly and gaining nothing over just crashing on a NULL dereference.

See the experience of D-Bus (which tries hard to be OOM-safe) on this topic: http://blog.ometer.com/2008/02/04/out-of-memory-handling-d-b...


I agree with the comment about overusing macros though. Instead of:

    #define DbgEnteringState(x)                                                    \
        printf ("[%s] Unit entering state %s\n", unit->name, #x);
How about:

    static void dbg_entering_state(unit_t * unit, const char * state_name) {
        fprintf (stderr, "[%s] Unit entering state %s\n", unit->name, state_name);
    }
Friendlier to humans reading code, IDEs, and debuggers. I went ahead and fixed it writing to the wrong stdio stream too.


That's clearly trace code intended to be defined away by the build system. It's idiomatic to make such things macros. Also, your way forces unnecessary verbosity onto the caller; passing unit by name was not a typo.
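
For anyone who hasn't seen the pattern, here is a sketch of a trace macro that the build can define away. The `S16_TRACE` flag, the `TRACE_STATE` name and the `unit_t` layout are all assumptions invented for this example; only the idea of picking up `unit` by name from the caller's scope comes from the macro quoted above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical unit type; only the `name` field matters here. */
typedef struct unit {
    const char *name;
} unit_t;

#ifdef S16_TRACE /* hypothetical build flag, e.g. cc -DS16_TRACE ... */
/* do { } while (0) makes the macro behave as one statement,
 * so it is safe inside an unbraced if/else. */
#define TRACE_STATE(x)                                                         \
    do {                                                                       \
        fprintf(stderr, "[%s] Unit entering state %s\n", unit->name, #x);      \
    } while (0)
#else
/* Tracing disabled: expands to an empty statement. */
#define TRACE_STATE(x) do { } while (0)
#endif

enum unit_state { STATE_OFFLINE, STATE_ONLINE };

/* Example caller: relies on a variable named `unit` being in scope. */
static enum unit_state unit_go_online(unit_t *unit)
{
    assert(unit != NULL);
    TRACE_STATE(STATE_ONLINE); /* #x turns the enumerator into its name */
    return STATE_ONLINE;
}
```

A plain function cannot stringify its argument or vanish entirely from a release build, which is the usual argument for keeping trace helpers as macros.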


    SetOrExit ("S16.Name");
Oh wait, SetOrExit is a macro ...

  #define SetOrExit(Name)                                                        \
      if (!prop_find_name (new_svc->properties, Name))                           \
      {                                                                          \
          OnError ("error: %s not set", #Name);                                  \
      }
Oh wait, OnError is also a macro

  #define OnError(...)                                                           \
      fprintf (stderr, __VA_ARGS__);                                             \
      goto on_error;
There's a goto hidden in a doubly-nested macro? Sorry, that is not good practice.


It's C style exception handling. Both the two macros, the place they are used, and the jump target all exist within the same 80 line file. I don't even write C and it's pretty obvious what's going on here. I might choose somewhat different names for the macros, but I'm not going to argue too much about that, it's a slippery slope.


Here's an alternate method of C exception handling: https://github.com/andrewrk/libsoundio/blob/4ce3429bdd9b08a0...

It does not use macros or goto and I find that it makes the code easy to reason about locally.

In summary:

    struct RefreshDevices {
        // fields that require clean up
    };

    static void deinit_refresh_devices(struct RefreshDevices *rd) {
        // clean up rd regardless of what state it's in
    }

    static int refresh_devices(struct SoundIoPrivate *si) {
        struct RefreshDevices rd = {0};

        if (error_occurred) {
            deinit_refresh_devices(&rd);
            return ErrorCode;
        }

        // ...

        deinit_refresh_devices(&rd);
        return 0;
    }
Whenever the ownership of a resource changes to something other than the stack of refresh_devices, you just null it out and it won't get cleaned up.

If calling deinit on every exit path of the function is too much typing you can wrap that function in another function that calls deinit, and the function doing the actual work can just return an error code without worrying about cleanup.
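
A minimal sketch of that wrapper arrangement, with made-up names (`scan_state`, `scan_inner`, `scan`) and plain integer error codes standing in for whatever the real code uses:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical state: everything here needs cleanup at some point. */
struct scan_state {
    int *samples;
};

/* Safe to call in any state: free(NULL) is a no-op. */
static void deinit_scan_state(struct scan_state *st)
{
    free(st->samples);
    st->samples = NULL;
}

/* Does the actual work and returns early on error without cleaning up. */
static int scan_inner(struct scan_state *st, size_t count)
{
    assert(st != NULL);

    if (count == 0)
        return -1; /* invalid argument */

    st->samples = calloc(count, sizeof *st->samples);
    if (st->samples == NULL)
        return -2; /* out of memory */

    /* ... real work on st->samples would go here ... */
    return 0;
}

/* Wrapper: the single place where cleanup happens. */
static int scan(size_t count)
{
    struct scan_state st = {0};
    int err = scan_inner(&st, count);

    deinit_scan_state(&st);
    return err;
}
```

The inner function can have as many early returns as it likes; as long as the state struct starts zeroed, the wrapper's one deinit call covers all of them.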


Exceptions as a programming language concept often include a change in control flow. While your method does "handle an exceptional situation", I don't think it could be said to be a general purpose way of doing exception handling in C, if you want to support the concept of changing control flow. While I recognize that you yourself may or may not have valid arguments about whether altering the control flow is a good idea, I don't think that affects whether this works as a solution for exception handling, in the general, PL concept sense.


I don't follow. You are saying that returning from a function early is not "changing the control flow"? What exactly are your criteria for a "solution to exception handling"?


You are testing and returning, and responsible for propagating the error in your example, and have a chance to forget to do that. That is what IMO keeps this from being a general purpose exception. That you could correctly check the error return in one subroutine, but forget to in its parent is the issue. Each approach has its benefits and drawbacks. Automatically jumping to a cleanup routine means you can't forget to handle it, but also obfuscates control flow. In any case, I view it as an integral feature of exception handling.


Believe me, I know what it is. This just isn't the way to do it. Hiding a goto in a macro is bad. Hiding a goto in a nested macro is even worse. Sure, it's fine right now, but this is setting the project up for maintenance nightmares.

What's the benefit of this way over a more explicit coding style? The only benefit is saving a few lines of typing. Code is typed once and read 1000's of times. It's better to optimize code for reading, not for writing.

Feel free to read the Joint Strike Fighter (JSF) coding standards, which say "Goto shall not be used" and "Macros shall not be used, inline functions are preferred". Also, see the coding standards from NASA's Jet Propulsion Laboratory (JPL), where they recommend against using goto, and only allow simple macros (hint: a goto inside a macro is not simple).

Some really smart people put a lot of effort into creating coding standards for critical systems. Even if you don't believe me, when Bjarne Stroustrup is hosting the coding standard on his personal website you might want to pay attention.

http://www.stroustrup.com/JSF-AV-rules.pdf

http://lars-lab.jpl.nasa.gov/JPL_Coding_Standard_C.pdf


I also used to follow the no-goto rule exhaustively. Not that I have much reason to these days, as I mostly code in Java. Goto is definitely very easy to use but there are a lot of scenarios where the code that you write trying not to use goto can be much more bug-prone and complicated to understand. Not everyone has a sort of semi-infinite budget for their project, though I agree we probably need to do a better job of maturing as a discipline.

On the other hand, no other engineering discipline would be asked to change its requirements and spec hundreds of times during a project's lifetime either, since that would be impossible to achieve.


Just as a side note, have you looked at the systemd source code? It has many, many gotos (and sprintfs and hard coded buffer sizes and... and...)


No, I haven't looked at systemd source code. Does it have gotos hidden in macros, too? There are many practices that used to be common that we have learned should not be common. Hopefully a new project doesn't repeat the mistakes of the past.

From earlier this year, NASA's 10 rules for safety critical systems [1]:

  Rule 1: Restrict all code to very simple control flow constructs. 
  Do not use GOTO statements, setjmp or longjmp constructs, or 
  direct or indirect recursion.
I don't think gotos are inherently bad, but I choose to heed the advice of people with more experience than me. (I do think that hiding gotos in macros is inherently bad, though.)

[1] http://sdtimes.com/nasas-10-rules-developing-safety-critical...


Just as a side note to the side note (a preemptive strike against accusing systemd of using gotos): the Linux kernel has many of them too. This is idiomatic C-style error handling.
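
For readers who haven't met the idiom, here is a small sketch of goto-based cleanup with the jump written out at each call site rather than hidden in a macro. `make_property` and its rules are hypothetical, chosen only so that one failure happens after a resource has been acquired.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Builds "name=value" in a fresh buffer (caller frees).
 * Returns NULL on invalid input or allocation failure. */
static char *make_property(const char *name, const char *value)
{
    char *buf = NULL;
    char *result = NULL;
    size_t len;

    assert(name != NULL && value != NULL);

    if (name[0] == '\0') {
        fprintf(stderr, "error: empty property name\n");
        goto out;
    }

    len = strlen(name) + 1 + strlen(value) + 1; /* name, '=', value, NUL */
    buf = malloc(len);
    if (buf == NULL) {
        fprintf(stderr, "error: out of memory\n");
        goto out;
    }

    if (strchr(value, '\n') != NULL) {
        fprintf(stderr, "error: value for %s contains a newline\n", name);
        goto out; /* buf is already allocated; the cleanup below frees it */
    }

    snprintf(buf, len, "%s=%s", name, value);
    result = buf; /* success: hand ownership to the caller */
    buf = NULL;   /* so the shared cleanup does not free it */

out:
    free(buf); /* free(NULL) is a no-op, so this is safe on every path */
    return result;
}
```

Every exit funnels through the single `out:` label, so adding another resource means adding one line of cleanup instead of touching each error branch.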

Also note that systemd employs gcc's __attribute__((cleanup(...))) logic to improve code clarity.
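
A rough sketch of how the `cleanup` variable attribute works, for those who haven't used it. It is a GCC extension (Clang supports it too), not standard C. The helper names `freep` and `joined_length` are made up for this example; systemd wraps the attribute in its own macros, which are not reproduced here.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Cleanup handler: receives a pointer to the annotated variable. */
static void freep(char **p)
{
    free(*p);
}

/* Returns the length of "dir/file", or -1 on allocation failure.
 * `tmp` is freed automatically on every return path. */
static int joined_length(const char *dir, const char *file)
{
    __attribute__((cleanup(freep))) char *tmp = NULL;
    size_t len;

    assert(dir != NULL && file != NULL);

    len = strlen(dir) + 1 + strlen(file) + 1; /* dir, '/', file, NUL */
    tmp = malloc(len);
    if (tmp == NULL)
        return -1; /* freep(&tmp) still runs; free(NULL) is a no-op */

    snprintf(tmp, len, "%s/%s", dir, file);
    return (int)strlen(tmp); /* value is computed first, then freep runs */
}
```

The effect is close to a scoped destructor: there is no cleanup label and no explicit free() at the return sites, at the cost of tying the code to compilers that implement the extension.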


4/10 needs more RAII :P


Darn, it looks like I'm going to have to do an actual writeup on http://pid1.wikia.com/ now for System XVI...


Right, yeah, well, now I'm not sure if this is a joke/troll project.


It's a real project. And there are a few other projects, OpenRC etc. The longevity of Unix over the years has come from its highly modular architecture. You can simply rip a daemon out and replace it with a new one. I don't think any attempt to create a monoculture will win favour in the long run. Even if Linux is utterly consumed the result would be a shift to BSD.


the use of macros is a little scary though, and not exactly typical C stuff. interesting to see where this will go.. also, the name is cool but XIV would fit the logo better than XVI ;)


A 'replacement' for systemd should be designed to be drop-in: /sbin/init should read unit files (which in general are much nicer than init scripts), should provide systemctl with the same options, and if dbus is also undesirable (which most people who burn with fire-y rage for systemd do), should replace the communication between processes like systemctl and /sbin/init with something else, like a Fuse or 9p filesystem.

That would be pretty neat. But until then, the world will continue to run on systemd, because it _works_.


/sbin/init shouldn't read files. A dedicated configuration parser should, and so it is done here.

SystemXVI will have systemd unit file compatibility. SystemXVI natively uses an INI-like file format itself.

SystemXVI uses SunRPC for communication. Writing a systemctl wrapper could be as trivial as a shell script.

> But until then, the world will continue to run on systemd, because it _works_.

The world doesn't run on systemd, and it isn't a "just works" thing in the slightest.


"The world doesn't run on systemd, and it isn't a "just works" thing in the slightest."

Redhat are betting their particular farm on systemd in RHEL7, and Debian (hence recent Ubuntu releases building to the 16.04 LTS) has adopted the suite.

Is the point you are making above based on the percentage of servers running earlier supported Linux implementations or on the existence of alternatives such as the BSDs?


> Redhat are betting their particular farm on systemd in RHEL7

It's basically their product, so...

> and Debian (hence recent Ubuntu releases building to the 16.04 LTS) has adopted the suite.

I think Ubuntu adopting systemd has less to do with Debian doing so and more to do with Gnome becoming increasingly difficult to package without systemd.


Betting the farm? That seems a bit dramatic. I'm quite confident that Redhat could remove systemd just as easily as they added it if they had reason to.


Not in RHEL7, which they will support at least until June 30, 2024. Next RHEL version they could of course switch it out. They are also supporting Upstart in RHEL6 for a long time still, though it's mostly a glorified sysvinit wrapper there.


> the world will continue to run on systemd, because it _works_.

Does it?

When Debian switched to systemd my system stopped booting. When I managed to "fix" it, it worked superficially but displayed "Segmentation fault" at every boot [it still does that on another one of my machines, which in fairness never "broke".]. This in addition to boot-time fsck breaking and my bluetooth audio and keyboard setup no longer working. Based on this I am not inclined to think of Mr. Poettering as an "it just works" kinda guy. On a related note I have never successfully gotten pulse audio to perform its intended function of playing audio samples.

After ~15 years of mostly tracking Debian sid on my personal machine, I switched to FreeBSD on my laptop when seeing these issues. No major issues since.


Are you sure these are systemd problems? I'm not a huge fan of systemd, but I haven't actually had any problems with it (apart from not liking how it works ;-) ). For example, bluez breaks in every update in my experience. I've found a dance containing magical incantations that will usually keep it up for a few hours, but I still hold on to 2 or 3 old versions of the software just in case the new version is completely broken (which has happened to me on more than one occasion).

I moved from Debian based distros to Arch a few years ago and found that my machines are a lot more stable. Because you do the configuration yourself (rather than the configuration being in the package a la Debian), you are a lot more free to jettison dependencies of questionable quality. Though I had a lot of trepidation when Arch switched to systemd, it has not yet broken my system in any way.

Still, there's nothing wrong with FreeBSD, so if it's working for you, might as well stay with it.


I'm running Gentoo Linux and haven't had issues with bluez. shrug


My bluez problem may be related to using a Microsoft Sculpt keyboard. Was running Gentoo for many years on one of my boxes with the same bluetooth issues. Mainly bluez works fine if the device doesn't shut down for power saving. If it does, then bluez will often refuse to reconnect to it. I've tried to debug it but the code is not the most friendly ;-) It seems that the agent gets in a bad state where it thinks the device is both simultaneously connected and not connected. The easiest way to solve the problem is to restart the bluetooth service, reinitialize the agent and then hit a key on the keyboard. Doing anything else will cause it to get in the same bad state again. Occasionally you have to remove the device and re-pair.

Every third update it will just start working and I'll think "Oh they fixed the bug". Then the next update breaks it again. I don't doubt that bluez must work well for some hardware, but for mine it is practically black magic to get it working.

My only point in fingering bluez is to say that bluetooth going down is not a surprising event on my machine and isn't related to systemd at all. I suspect the same is true of the OP's box. It was probably just a coincidence that bluetooth died at the same time that systemd was installed.


I am guessing the bluetooth radio was connected via USB. Thing about USB is that the Linux drivers were built around a misunderstanding regarding the sleep/suspend of USB devices.

While the Linux code assumed the hardware was required to be up and ready in 2 minutes or less, the 2 minutes mentioned in the spec was just a suggested minimum wait time. Thus it would give up on devices that were still in the process of powering back on.

That's not to say that bluez has been in great shape in recent years. But then it is also a complicated concept for *nix, as it straddles root and user. You have their whole pairing interface on the user side, and then all the device nodes (HID, Audio, etc) on the root side.


We use FreeBSD on a few 1k's of servers in production, where nearly all of our critical processes are supervised by daemontools.


The world doesn't run on systemd on FreeBSD...


"System XVI has received unexpected publicity. Several commentators have written these glowing endorsements of System XVI:

This is probably the best example I've seen on how NOT to program in C. Unneeded typedefs, random macros for simple logic. Do not use."

It's nice to see a sense of humor and some humility. There is a healthy ecosystem of Linux distributions that do not use systemd and have no intention of doing so. Slackware, Void, Puppy, Crux and many others.


How can people be that dismissive of typedefs and macros in C? IMO it simply follows the DRY rule to use these techniques wherever a certain structure is reused significantly.


It'd be awesome if this was BSD licensed, so that in the future when it is stable it could appear in FreeBSD as the replacement for its current initd.


Not being BSD licensed, I would expect it never to get beyond the ports tree.


Yup, that is my worry too.



It didn't at the time that X-Istence wrote that.


Both nosh and NextBSD launchd are BSD licensed.


I really like that it takes some things from SMF. I honestly don't think that porting all of Mach over just to get launchd is a great idea...


... still leaving you, after you subtracted SystemXVI because of the licence and launchd because of the XNU stuff that had to be ported to support it, with at least one BSD-licensed non-Mach replacement for rc.d already available.


Except that due to it being of the daemontools family, it doesn't get the same sort of treatment as something like SystemXVI or SystemD. Sadly.


Looking at some of the treatment that SystemXVI has received so far, I'm not sure that it's entirely a sad thing. (-:

In happy news, nosh has a picture and a Burns allusion:

* http://homepage.ntlworld.com./jonathan.deboynepollard/Softwa...

* http://homepage.ntlworld.com./jonathan.deboynepollard/Softwa...


It's only the IPC layer...


... and it is thus in the same boat as systemd.

The IPC layer for getting launchd to work involves amongst other things FreeBSD kernel changes to support the Mach IPC API; similarly to how the systemd people have already achieved (subreapers) and have been pushing for further (kernel Desktop Bus) kernel changes for systemd's benefit; and indeed much like how IBM AIX gained kernel augmentations for supporting the System Resource Controller: the SIGFORCE and SIGNORM signals and the SRC_kex.ext extension.

vezzy-fnord is I suspect refraining from pointing to http://blog.darknedgy.net/technology/2015/08/26/0/ (https://news.ycombinator.com/item?id=10127120). (-:


It will be interesting to see which takes off, a totally different style, or a reductionist fork?

http://uselessd.darknedgy.net/

I am a systemd user and Arch Linux fan (but definitely wanted to get to Gentoo to have the most choice), but I have been hoping to see news of uselessd pop up again.

I think the reaction to systemd is quite upsetting, for its tone and not its message about choice which is core to our support of open source values, but I welcome the alternative solutions and approaches and I am very curious to see what percolates to the top.

Personally, as a budding Lisp geek, I wait for dmd.

https://www.gnu.org/software/dmd/

Although, and I know I have seen him post here before, the vitriolic systemd hate in one of his blog posts makes me worry about reactionary development, just like this and other projects.

I have heard a lot of good about SMF though, so I will need to follow all of them now.


I'm the uselessd author. It's dead, as I made clear at the top of the wiki index.

There will be a successor, though I wasn't expecting SystemXVI to arrive. I might just rethink it to be a layer on top of SystemXVI RPC, I don't know yet.


I'm curious to know what it is that you prefer about the design of System XVI over other alternative init/service management systems. In particular, compared to s6 [1], which has similar design goals to System XVI but appears to be much closer to completion.

[1] http://www.skarnet.org/software/s6/


Well, I look like an idiot. I guess I should have moved beyond the landing page (if that is not the wiki index page).

Well, either way, I wish you luck. Your GitHub testimonials at the bottom of the SystemXVI README were very entertaining. I am glad you have thick skin. You will need it, and I cannot wait to see real competitors in this space beyond SysVinit, runit, and of course the dreaded systemd.


It's systemd, not SystemD.

Also, this is not yet a replacement for systemd:

  #include <stdio.h>
  
  int main (int argc, char * argv[])
  {
      printf ("defer work on the pid 0 until we have a functioning "
              "service manager\n");
      return 0;
  }
(I think they mean "pid 1")


No doubt that "System D" (500) is a play on words against "sys V init" (5)... "100 times better"!?


I think it's just a system daemon.


He might be referencing systemd's site [1]:

>But then again, if all that appears too simple to you, call it (but never spell it!) System Five Hundred since D is the roman numeral for 500 (this also clarifies the relation to System V, right?)

[1] http://www.freedesktop.org/wiki/Software/systemd/


They could be talking about pid 0 : https://en.wikipedia.org/wiki/Process_identifier (See swapper or schedule)


You're right for the name. Thank you.


Can we please stop making Linux as a thing about choice[1] and stop fragmenting the community in the name of following a UNIX philosophy, the rule of KISS or some other nonsense? Are we really going to spend lots of energy the next n years discussing the best init system? Isn't this a pretty simple and solved problem by now?

[1] http://islinuxaboutchoice.com/


Linux is just a kernel. It always has been and always will be. It is pretty useless without userspace. That's why we have distros. As long as multiple distros exist, they need some way or other to differentiate themselves. Differentiation in distros naturally presents us with choice. Linux has always been and will always be about choice. Any attempt to converge the userspace does not change that.

If you want an operating system where everything is developed in tandem, you might want to consider switching to one of the BSDs.


UNIX philosophy, the rule of KISS or some other nonsense

How are those things nonsense?

Are we really going to spend lots of energy the next n years discussing the best init system?

Why the hell shouldn't we?

Isn't this a pretty simple and solved problem by now?

Nope. It's actually a very hard problem with little in the way of complete solutions. Take your ignorant trolling elsewhere, please.


It seems to me there only needs to be one init system, otherwise it is a fragmented mess. Windows is gaining fast on Linux in the cloud/server space (SSH support, docker/container support, "Nano Server", more open source components).

Windows has no fragmentation issues in the core system.

Whether it is Systemd or System XVI, I'm not really that bothered. Something needs to be decided fast however, and Systemd has a big head start.

I just hope it doesn't end up like the KDE/Gnome mess.

One kernel. One init. One desktop.


The mere existence of alternatives doesn't suddenly transform you from "unified utopia" to "fragmented dystopian hellhole".

Windows isn't gaining anything on the demographics that use GNU/Linux for cloud deployments. You're reading too many articles on HN and extrapolating that Windows must be curb stomping its competition because of some smart business decisions on part of Nadella and co. You're further making the assumption that Linux needs to compete with Windows. It doesn't.

It's not that there is fragmentation, so much as the problem space is quite open-ended and there are multiple solutions. Forcing a square peg into a round hole (systemd ueber alles) is a recipe for impedance mismatch and stagnation, not unification.

One kernel. One init. One desktop.

Ein Volk. Ein Reich. Ein Fuehrer.


I don't think anyone wants to go through this again.


The problem isn't the argument, the problem is systems that aren't really wanted being rammed through by political means.


You could always start your own distro.


Elaborate?


He's referring to the fact that systemd was (is) very controversial when it was announced as a replacement for SystemV init. There were _lots_ of flamewars and hyperbole and when Debian announced they were making systemd their only init, it sparked a fork: http://debianfork.org/

It seems like a lot of people are calming down about the issue now, so let's not do all that again.


I think Linus Torvalds coming out in support of systemd helped a lot (or at least confused the UNIX purists enough to calm them down a little). Linus's response from [0]:

> I have to say, I don't really get the hatred of systemd. I think it improves a lot on the state of init, and no, I don't see myself getting into that whole area.

> Yeah, it may have a few odd corners here and there, and I'm sure you'll find things to despise. That happens in every project. I'm not a huge fan of the binary logging, for example. But that's just an example. I much prefer systemd's infrastructure for starting services over traditional init, and I think that's a much bigger design decision.

> Yeah, I've had some personality issues with some of the maintainers, but that's about how you handle bug reports and accept blame (or not) for when things go wrong. If people thought that meant that I dislike systemd, I will have to disappoint you guys.

[0] http://linux.slashdot.org/story/15/06/30/0058243/interviews-...


I'm not sure that Torvalds says that he likes systemd. I read the quote as saying: "I don't dislike systemd, and I think that it is better than traditional init.". Almost anything [0] is better than SysV init.

shrug

[0] Upstart, OpenRC, daemontools...


Strictly speaking, of those three only upstart is actually a replacement for System 5 init. OpenRC and daemontools run under an init, and by and large replace System 5 rc. There are two toolsets (soon to be three) in the daemontools family that provide a replacement for init: nosh and runit. daemontools itself does not, and has had more than a decade of people adapting it to run under init replacements because the vanilla system only targets late 1990s inits, including under upstart as a matter of fact.

* http://homepage.ntlworld.com./jonathan.deboynepollard/FGA/da...

* http://homepage.ntlworld.com./jonathan.deboynepollard/FGA/in...

* http://homepage.ntlworld.com./jonathan.deboynepollard/FGA/sy...


So, strictly speaking, is SysV init more than what is described in init(8)? That is to say, is it more than

* the parent of all processes, including orphans

* the executer of various entries in /etc/inittab, depending on runlevel

* the configurer of local and serial consoles

* the handler of CTRL+ALT+DEL

?



From one pedant to another, thanks for the correction, and the informative links!

So, I guess that the Debian folks were incorrectly claiming that they were looking for a replacement init. They were looking for a replacement rc system, and were willing to accept a replacement init as part of the package.


Never mind that the term systemd can be used both for the init binary, and the larger project (systemd, networkd, udev, journald, logind, and the list goes on and on).

As best i could tell, Torvalds didn't have an issue with systemd as the init.

Nor do i think many would (outside of the whole journald thing), except for the snowballing of sub-daemons and absorbed projects (udev for instance existed for a decade as an independent project before being folded into the systemd code tree).

Without all that, it may well be that systemd would just be one init among many (i barely noticed the existence of upstart for instance).


I wasn't born yesterday, I know that. In fact, I helped fan the flames by making uselessd, though not out of any sly intent in the least, but research curiosity.

My question was more about the GP elaborating why a new system would be something "[no one] wants to go through this again."

(I'd characterize systemd as being a sysvinit replacement to be a great misnomer. It's not. It's an entire framework for providing certain low-level userspace daemons, utilities and auxiliaries meant as a common middleware for GNU/Linux distributions. Even systemd-the-PID1 cannot be called a sysvinit replacement in any integrity.)


>My question was more about the GP elaborating why a new system would be something "[no one] wants to go through this again."

Because the migration was terrible, that is why. And I still continue to find stuff I hate about systemd. The latest was just yesterday: while troubleshooting some disk issues I wanted to fsck root before rebooting, which normally means just remounting root as read-only. Tried that and it didn't work, "disk was busy". Jumped into runlevel 1, still disk was busy. Rebooted into rescue mode, remount ro, still disk busy. After googling and digging through lsof, I could see it was systemd processes and the journal that were holding onto files and preventing a remount. Tried killing systemd, journald, even kill -9; it just restarts itself and holds the disk. Darn. Did the touch /forcefsck trick, rebooted, yet no fsck. Darn again. More googling later, I see you have to pass kernel parameters at boot time to force a fsck, and no, you can't select which FS either, it just does all of them. Rebooted again, fsck is running of course. I just wanted root checked but now it is doing all my terabyte partitions as well. Gave up at this point and just left the machine and hoped for the best.


I hate it too, but workaround is not too hard:

  chmod a-x journald && pkill journald
Of course, it is a bad idea to touch a damaged FS before fsck.


Wait.

To run an unscheduled fsck of your root fs (and only your root fs), you have to remove the execute bit from a systemd component, kill that component, then remember to reset the bit after you're done?

That's nuts.


That's nothing - systemd panics (pid1 crash) if you restart dbus, which requires either the Magic SysRq or the reset button..

https://forums.gentoo.org/viewtopic-p-7811550.html#7811550

Poettering was apparently told about this in 2013, when he simply declared it expected behavior because dbus was "rock solid".

http://lists.freedesktop.org/archives/systemd-devel/2013-Jan...

Incidentally, this might be the original reason for kdbus - it's harder to restart the daemon after it becomes a kernel module.


> Incidentally, this might be the original reason for kdbus...

Yeah, especially given that its stated reason (Moving into the kernel gives us a 2x speedup over userspace dbus! Dbus is slow and we NEEED the speedup!) is mooted by the 10x speedup that cleaning up the userspace dbus libraries has achieved. [0]

Having said that, in the systemd-devel post Poettering says that they're working on...

> ...improving this in two ways: even if it still brings down the system, at least allow logind to handle this case nicely, so that you get a sane getty. [The other way is to push dbus into the kernel.]

Systemd is a sprawling project, so I could see why it has taken more than two-and-a-half years to fix a local DOS caused by the perfectly normal method of responding to a DBUS upgrade. ;)

[0] Fun fact: Those cleaned up libraries were supposed to be released after kdbus was merged into the kernel in 4.1. The Systemd Cabal was so disappointed that they had to ship the performant userspace dbus libs before the dbus daemon got pushed into kernelspace.


I am by no means knowledgeable in the relevant domain, yet this is scary on so many levels. A team that is taking charge of so much critical infrastructure not doing a minimal level of fact-finding before pushing for a new kernel module for RPC is a bit obscene.

I may have read things that are biased, but I have a feeling it is a useful project that does a lot of things elegantly or right. But the project acts like it is the center of the universe and everything else should act to meet its needs. I have seen such projects/modules at work: they usually make everyone happy to start with, and by the time they finally get stuck and I am called in to consult, we are already a long way down the slope.


Goddamn, guys, stop repeating the same absurdity.

kdbus is NOT ONLY about performance. It's about the lifetimes, the availability in the early userspace, the new marshalling format, the new user bus concept, and so on, and so forth. It's about separating the transport (in the kernel) and the policy (in the userspace).

Most of these points can be achieved without putting the transport in the kernel, but some can not.


Yeah right. No wonder I see people applying all sorts of adjectives to systemd proponents. I don't usually go hunting for sources, and I only had Linus's comment about kdbus and performance. So I went looking for Poettering's own words, and here is a verbatim quote from the abstract of his presentation on kdbus:

  D-Bus is a powerful design. However, being mostly a userspace solution its latency and throughput are not ideal. A full transaction consisting of method call and method reply requires 10 (!) copy operations for the messages passed. It is only useful for transfer of control messages, and not capable of streaming larger amounts of data.
So yeah, he tried to sell performance as the big thing for his version of dbus IPC. I'm guessing he (and his group) spent more time with dbus than Linus did, and either failed to find the actual cause of the performance issue or tried to hide it to give more credence to kdbus. In both cases it makes it that much harder to trust their reasoning and conclusions.

I never said kdbus is ONLY about performance. But from what I'm reading they are not performance people, though they think they are. The most dangerous people are those who don't know their own limitations or weaknesses. And your comment simply emphasises the importance of listening to what's being said. So yeah, given that I don't know dbus or IPC I have to trust your logic, and you only make it hard to do so.


Seems the quote gets turned into a single long line in Firefox, and subsequently making the page extra wide...


Strange. The quote is properly contained over here.

FF 40.0.3 on 32-bit Linux.


If you haven't, you should go back and read all of the LKML discussion surrounding the request to merge kdbus into Linux 4.1. Perhaps I misunderstood something, but I got the distinct impression that several of the folks reviewing the code were concerned that kdbus pushes too much policy into the kernel.

LWN article: https://lwn.net/Articles/641275/

LKML thread: http://thread.gmane.org/gmane.linux.kernel/1930358

> It's about the lifetimes, the availiability in the early userspace...

Starting dbus in your initrd [0] makes it available to the system from before the first second your service start machinery is started. If you want dbus available to userspace in your initrd, make starting dbus the first thing you do after you load your initrd. Is there some particular reason why this doesn't meet the lifetime and availability requirements?

[0] Remember that the initrd is responsible for things like decrypting the rootfs. When you're in the initrd, your real system is often not even remotely ready to run.


That kind of thing is why I stick with Slackware and OpenBSD for my production systems. systemd has some great ideas, but it has some downright silliness as well, and I personally don't feel it's stable enough for daily use yet.


(Just to make sure everyone gets it)

It will prevent it from running again. So it is basically a big FU! to systemd.

"So you you think you gonna restart yourself? Oh, no you won't"


As mentioned above, that doesn't seem needed [ed: nm, it is - if journald is set to log somewhere other than /run that's usually mounted as a tmpfs].

[S]urely, something like:

  mount -t tmpfs /run && pkill journald
[ed: that is, for logs on /var/log: "mount -t tmpfs /var/log && pkill journald"] would make more sense? (In my experience systemd itself does something funky on / -- so the above wouldn't be enough -- but still seem a little more sane than running a chmod on a file (presumably) residing in an fs you want to fsck).


Actually, the workaround is to reboot with kernel param "emergency". It's single user mode with the root helpfully already in read-only mode.


The OS is Debian 8. I seemed to think the same too, but as I mentioned, rebooting into single user mode did no such thing.


You mentioned booting into rescue mode. Rescue mode is not emergency mode. Have a nearly two decades old explanation that still holds true for systems like nosh and systemd today:

> If you need to check the root file system, you can reboot into single-user mode with the root partition mounted read-only by issuing the -b switch at the LILO prompt. The -b switch will be passed through LILO to init and will cause an emergency boot ...

-- David A. Bandel (1997-01-01). Disk Maintenance under Linux. Linux Journal. http://linuxjournal.com/article/193


That points towards either Debian implementing systemd poorly in this aspect, Debian poorly documenting how to accomplish this after the change, or a failure on the user's part to find said documentation.

Systemd on Fedora supports this perfectly fine, so I can't really call this a failure of systemd itself.


Maybe the problem is that you're carrying conventions from an obsolete system to a new one? A quick google search for how to force fsck with systemd comes up with results[1] mentioning options you haven't covered. Then again, I haven't had to do this, so maybe you did try those steps and it didn't work.

1: http://lmgtfy.com/?q=systemd+force+fsck


As far as I can see, they all mention "fsck.mode=force, as a kernel parameter", which is what I was complaining about.


Here's the TL;DR

1) It's actually super easy to get to a read-only root mode.

2) It actually is a problem of carrying over obsolete conventions from prior init systems.

3) It's not your fault because systemd has extremely poor documentation on this AFAICT, to the point that people searching for this problem can't even get answers from others that know how to do it, or at least those answers don't rate very high.

4) Just boot with kernel param "emergency" instead of "single" (or "rescue" which is an alias for "single") and you will get a single user mode with root already mounted read-only for you.

And now for the longer rambling version.

Okay, so I spent a little time on this since I happen to have a virt for Fedora 20, and I finally found a couple of solutions that work. First, the systemd solution, which is emergency mode. It's correct that the journal will cause problems here, but that's because the journal was already running and apparently systemd doesn't want to, or is unable to stop it. The solution is to reboot into emergency mode. This can be accomplished by using the kernel param "systemd.unit=emergency.target", or just the param "emergency". This will not only boot you into a single user mode, but will default to having root mounted read-only.

To clarify, this appears to be a new mode in addition to single user mode. Single user mode can still be reached with the kernel param "single" or "rescue" or "systemd.unit=rescue.target".
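A rough sketch of that mapping in C, for anyone who wants it spelled out. Only the parameter strings come from what I found above; the enum, the function names, and the "last matching word wins" rule are my own assumptions for illustration and not documented systemd behaviour.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical classification of a boot, based on the kernel command line. */
typedef enum { BOOT_DEFAULT, BOOT_RESCUE, BOOT_EMERGENCY } boot_mode;

/* Nonzero if the word starting at `word` with length `len` equals `param`. */
static int word_is(const char *word, size_t len, const char *param)
{
    return strlen(param) == len && strncmp(word, param, len) == 0;
}

/* Walk the whitespace-separated words of a kernel command line (for
 * example the contents of /proc/cmdline) and report which mode they select.
 * Assumption: if several mode words appear, the last one wins. */
boot_mode classify_cmdline(const char *cmdline)
{
    boot_mode mode = BOOT_DEFAULT;
    const char *p = cmdline;

    while (*p != '\0') {
        p += strspn(p, " \t\n");            /* skip separators */
        size_t len = strcspn(p, " \t\n");   /* length of this word */
        if (len == 0)
            break;
        if (word_is(p, len, "emergency") ||
            word_is(p, len, "systemd.unit=emergency.target"))
            mode = BOOT_EMERGENCY;          /* root stays read-only */
        else if (word_is(p, len, "single") ||
                 word_is(p, len, "rescue") ||
                 word_is(p, len, "systemd.unit=rescue.target"))
            mode = BOOT_RESCUE;             /* classic single user mode */
        p += len;
    }
    return mode;
}
```

The point of the sketch is only that "single" and "rescue" land in one bucket and "emergency" in another, which is the distinction that tripped people up in this thread.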

I finally stumbled on this when I decided to look at the directions for what I wanted to accomplish for the distro I was using[1] and not for systemd specifically. Which makes sense, but I think we are all too stuck on "systemd is different, how do I do this in systemd" rather than "how does my distro say I should accomplish this now, given the changes they've instituted."

1: https://docs.fedoraproject.org/en-US/Fedora/18/html/Installa...


2) It actually is a problem of carrying over obsolete conventions from prior init systems.

It actually is not. It's a problem of not knowing the conventions from prior init systems. emergency mode did not originate with systemd. It has been around since 1995-12-03 (Miquel van Smoorenburg's System 5 init clone, version 2.57d).


It was possible to boot to single user mode and remount root as read-only prior to systemd. Whether this was the optimal method for achieving this is irrelevant; it was possible, and popular. That specific method no longer works as easily as before, while the optimal method does. The commenter was using their preferred (if suboptimal) convention from the prior init system and running into problems. I think that perfectly matches the statement "It actually is a problem of carrying over obsolete conventions from prior init systems."

It is also a case of not knowing the correct/optimal prior convention, but that doesn't make the original statement untrue.


This is a job for a distro like System Rescue CD. Boot off CD, run in DRAM, figure out which partition or LV is root, and directly fsck it.


[ed2: Never mind - the standard setup is to have /run mounted as a tmpfs -- I failed to check that before this song and dance]

Which distribution was this? I just tried on my laptop (running Debian 8, fs on top of lvm on top of LUKS), and a simple:

  telinit 1
  # log in on console
  mount -oro,remount /
worked as expected.

[ed: Actually just tried forcing an fsck as well, and worked without a hitch. Just remember to "mount -oremount /" before running a "telinit 5" (Nothing really bad happens if not, but the system(d) will be confused if rootfs is mounted read-only).]


esaym probably doesn't need people to try and diagnose xyr problem, and this isn't really the best venue to do that. If you want to try this, try at least replicating esaym's setup as described rather than using your own and declaring "works for me!". One way that springs to mind is moving the journal into /var/log and having no separate /var/log volume; which reproduces the described setup of open journal files on the root filesystem. Do that and then try to force an explicit check of the root volume.

Of course, knowing that does point to one obvious way to avoid the stated problem: move the journal back into /run again (per the journald.conf manual page) and simply restart journald. No mucking about with permissions required. But that's not the only thing that xe could have tried. emergency mode, for example, is documented as not mounting additional non-API filesystems at all, nor read-write remounting the root volume. See http://freedesktop.org/wiki/Software/systemd/Debugging/#boot...


Hm, good catch. I thought I checked to see if /run was mounted before going through the whole exercise, but apparently I didn't (My first test was to "mount -t tmpfs /run" -- to shadow the /run mountpoint -- but I didn't realize this already existed as a tmpfs).

[ed: Given that it's not wise to write to a fs of questionable state, editing /etc/systemd/journald.conf is out. So it would appear shadowing the sub-tree in question with a tmpfs mount would be the sanest approach?]

[ed2: Just checked that if setting "Storage=persistent" in journald.conf, forcing logging under /var/log (and with /var/log as part of the rootfs) at least here, shadowing /var/log with a tmpfs works, in order to remount rootfs read-only in "runlevel 1" (actually systemctl rescue, when using systemd). Not sure if there's any elegant way to unshadow/unmount /var/log w/o a reboot though.]


+1 seriously


I found this the funniest part:

-----

/u/TheReverend403, reddit. [...] the world doesn't like goto. First of all, it can cause all kinds of memory leaks.

Your program is simply an attacker's target.

As soon as a distro maintainer sees that goto, it'll be rejected.

-----

Clearly, TheReverend403 hasn't read any systemd source code.

systemd$ rgrep goto . | wc -l

2690
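For anyone wondering why C code ends up with that many gotos: a common idiom is a single cleanup label at the end of a function, which tends to prevent leaks rather than cause them. A minimal sketch follows; the function name read_first_line and the 256-byte buffer are made up for illustration, and none of this is taken from the systemd or SystemXVI source.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read the first line of a file into a heap buffer.
 * On success returns 0 and stores a malloc'd string in *out (caller frees).
 * On any failure returns -1 and releases everything it acquired. */
int read_first_line(const char *path, char **out)
{
    int ret = -1;
    FILE *fp = NULL;
    char *buf = NULL;

    fp = fopen(path, "r");
    if (fp == NULL)
        goto cleanup;

    buf = malloc(256);
    if (buf == NULL)
        goto cleanup;

    if (fgets(buf, 256, fp) == NULL)
        goto cleanup;

    buf[strcspn(buf, "\n")] = '\0';  /* strip the trailing newline, if any */
    *out = buf;
    buf = NULL;                      /* ownership has moved to the caller */
    ret = 0;

cleanup:
    free(buf);                       /* free(NULL) is a no-op */
    if (fp != NULL)
        fclose(fp);
    return ret;
}
```

Every early exit goes through the same label, so there is exactly one place where the file and the buffer are released.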


Nothing written in C, or at least entirely in C, is a viable replacement for systemd.

One big problem with systemd is the internals: all the stinky "char *" string processing which is repeated all over the place and the badly greenspunned OO programming system that it contains and so forth.

The kinds of things that systemd does are best written in a high level language (of course, with suitable bindings for the system calls required).

Old SysVInit gets away with being written in C because it's small and simple and just handles the /etc/inittab, which has a simple format. Anything complicated is farmed out to startup, shutdown and runlevel scripts.

The way to proceed was to add an extension language to init so that actions can be written in that language to do more kinds of things.

Perhaps the executable which runs as PID 1 can be a very tiny "init kernel" written in C, which intrinsically does some of the things that only it can (like reap processes parented to PID 1), and farms out everything else to a larger daemon. This would be for the sake of fault isolation; we don't want PID 1 to crash. The risk of that is high if you make it a large and complicated program, with a large and complicated run-time infrastructure.
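A minimal sketch of what such an "init kernel" might boil down to, assuming a POSIX system. The names spawn_manager and reap_children are hypothetical, and this is not how SystemXVI, systemd or sysvinit are actually structured. Also, a real PID 1 would loop forever and handle signals; this version returns once no children remain so that it can be tried out as an ordinary process.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Start the larger daemon that does the complicated work.
 * argv[0] is the path of the program to run (hypothetical; pick your own).
 * Returns the child's pid in the parent, or -1 if fork() failed. */
pid_t spawn_manager(char *const argv[])
{
    pid_t pid = fork();
    if (pid == 0) {
        execv(argv[0], argv);
        _exit(127);  /* exec failed; only the child process exits */
    }
    return pid;
}

/* Collect exit statuses of children (including orphans re-parented to us
 * when running as PID 1) until none are left.
 * Returns the number reaped, or -1 on an unexpected waitpid() error. */
int reap_children(void)
{
    int reaped = 0;

    for (;;) {
        pid_t pid = waitpid(-1, NULL, 0);
        if (pid > 0) {
            reaped++;
            continue;
        }
        if (errno == EINTR)
            continue;        /* interrupted by a signal; try again */
        if (errno == ECHILD)
            return reaped;   /* nothing left to wait for */
        return -1;
    }
}
```

All the parsing, dependency logic and IPC would live in whatever spawn_manager starts, so a crash there costs you a daemon and not the whole machine.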


I get the feeling that the wheel is being re-invented here again.


It really isn't. SystemXVI is relatively novel as far as init system designs go. It's a more modular and pluggable Solaris SMF-like design (no XML configuration, don't panic), designed for portability at that.


Point taken, it would be nice if someone could start a comparison table between the different init systems, perhaps on Wikipedia? https://en.m.wikipedia.org/wiki/Init



We've also reinvented a car to go with our reinvented wheel, but we've also made it so the car is useless unless you use our wheels, and by virtue of having no other function, the wheel is also pretty useless unless you use it with our car.

But don't fret. Our car will provide you with its own desktop built into the dashboard. Our car can communicate with other cars, but only those of the same model, and even if our car doesn't have support for something you need yet, don't worry, as we'll build it directly into your car soon!


I can think of at least five times in the 20th century alone that the wheel was successfully and usefully reinvented.


>systemd is tightly integrated in a way that makes pluggable replacements difficult.

systemd "aims to unify pointless differences between distributions", so nothing needs replacing



