
Lots of split brains. Serious bugs making it through the alpha and beta channels into stable (and our boxes auto-updating only to become useless). Fleet units dying purely due to problems with fleetd/systemd. A particularly painful one was an Akka deployment on top of CoreOS where a sidekick unit would fail to start because fleet hadn't actually copied the unit file to the remote host. Only happened with sidekicks but due to how we ran our networking, it effectively killed the application. Almost every redeploy required manually getting fleet to copy the unit over.


Just to add on: I've had fleet misreport unit status and btrfs report a lack of disk space for no apparent reason. There's also the inability to restart individual failed units that are part of a global unit.

Also there was that one time they changed how cloud-config was parsed, so if "#cloud-config" wasn't on the very first line without preceding spaces, initialisation would fail. That was when I switched the reboot strategy to manual.
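
For what it's worth, a pre-flight check along these lines would catch that kind of header problem before a box picks up the file. It's only a sketch: the function name is made up, and tolerating trailing whitespace on the header line is my assumption, not something taken from the actual CoreOS parser.

```python
def has_valid_cloud_config_header(text: str) -> bool:
    """Return True only if the very first line is '#cloud-config'.

    Hypothetical helper: leading spaces or a leading blank line are rejected,
    matching the failure described above. Trailing whitespace (including a
    stray carriage return) is tolerated, which is an assumption.
    """
    first_line = text.split("\n", 1)[0]
    return first_line.rstrip() == "#cloud-config"


if __name__ == "__main__":
    good = "#cloud-config\ncoreos:\n  update:\n    reboot-strategy: off\n"
    indented = "  #cloud-config\ncoreos:\n"
    blank_first = "\n#cloud-config\ncoreos:\n"
    for name, doc in [("good", good), ("indented", indented), ("blank_first", blank_first)]:
        print(name, has_valid_cloud_config_header(doc))
```

Running something like this in CI against the cloud-config you ship would at least turn a failed initialisation into a failed build.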


Btrfs is no longer the default for CoreOS for this reason. Overlayfs doesn't have this issue.


Oh man, yes. I'd blocked all my scarring memories of btrfs biting me in the ass.


Matches up pretty well with my experience, too. I do not trust fleet as far as I can throw it.


Yeah, the whole project was something of a disaster. Eventually things stabilized a bit but every few weeks etcd or fleetd would throw a curveball and I'd lose a day of time chasing down the problem.



