We're 16 years away from the overflow. In other words, some of the systems being put in production right now are going to be around, untouched, unchanged, when it happens.
Given this article, it seems we're still doing it wrong, and that means... 2038 will be "fun" (either the remediation, or the consequences of the lack thereof).
At least most of us in the industry now have a good retirement plan: Fixing the legacy systems in 16 years...
> We're 16 years away from the overflow. In other words, some of the systems being put in production right now are going to be around, untouched, unchanged, when it happens.
Yes. It's getting close. Windows XP, released in 2001, still had a significant presence in 2017, 16 years after introduction.[1] 42% of companies still had some XP systems running back then. 11% of machines were still running it.
TFA is about how GNU/Linux systems being deployed right now are not Y2038-safe.
Windows applications have been safe from this issue by default since VC8.0 (VS 2005), and macOS applications since 10.7.
And to be fair Linux itself has used 64b time_t on 64b machines from the start (and since 5.6 — last year — has also migrated i386 to 64b time_t[0], though a few issues will remain forever).
There are many who don’t think like this. Microsoft has dedicated teams that ensure backwards compatibility for legacy applications, and even then people stay on XP. If you have an inherited Ubuntu system that has been happily running a proprietary binary for years, there’s even less guarantee (and certainly no marketing organization making that guarantee) that an upgrade won’t break things. And so you fall far behind LTS.
Ironic that so many people are hoping to retire before they get asked to deal with this mess since so much medical equipment is built on ancient and poorly maintained microcontroller environments.
> In other words, some of the systems being put in production right now are going to be around, untouched, unchanged, when it happens.
Most of these systems being put in production right now will be using 64-bit distributions, which have always used 64-bit time_t. Most of the rest will be using embedded distributions, which often use something other than glibc. So only a fraction of these systems "being put in production right now [which] are going to be around" will be affected.
I don't know of any embedded systems still using glibc.
Dietlibc, uclibc (used by openwrt and the other consumer router replacement firmwares), musl libc, and bionic (since Android is technically an embedded Linux, so should be mentioned as well) dominate the embedded Linux space.
I do embedded development and I use Glibc on all of our products. It's as simple as checking the box in Buildroot, so why not? The version of Glibc I am using has special accelerated memcpy and string functions for my CPU (e6500 PPC, Altivec).
I'm no expert, but at least STM32's default is newlib or newlib-nano, which it inherits from the ARM GCC toolchain[1] from what I gather. I just checked and here it's defined as the following by default:
    ldd --version
    ldd (Debian GLIBC 2.28-10) 2.28
    Copyright (C) 2018 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions. There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
    Written by Roland McGrath and Ulrich Drepper.
And of course, on actual microcontrollers newlib-nano is what you're likely using with a GNU toolchain.
I have no clue how widely implemented it is, but as far as NTP is concerned we're currently in era 0 which spans from 1900 to 2036. When we reach 2036, we just move to era 1 and have another 136 years until we reach era 2. NTP itself should be resilient to this kind of issue by design.
We've seen time and time again that it doesn't matter what the spec says, if it works now, noncompliance will be there. Heck we're stuck with TCP & UDP because routers and firewalls will break anything else. If it works long enough, ossification makes it the new reality.
How many NTP implementations don't consider era or contain bugs if era != 0?
There will almost certainly be failures in devices without battery-backed clocks (eg, Raspberry Pi) that boot in 1970 and rely on NTP to get the correct time, because NTP can’t tell them they are in Era 1 not Era 0.
To fix this you need a persistent low water mark on the time, compiled into the NTP program and/or stored in the filesystem (eg the timestamp on the NTP drift file). Then NTP timestamps can be interpreted as spanning the 136 years after the low water mark, using modular arithmetic.
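Roughly, in C (just a sketch; the function, the 64-bit NTP second count, and the constant handling are my own illustration, not any real NTP implementation):

    #include <stdint.h>

    /* NTP's seconds field is unsigned 32-bit and wraps every 2^32 s (~136 years).
       Given a trusted lower bound (build date, drift-file mtime, ...) expressed as
       64-bit seconds since the NTP prime epoch (1900), pick the unique 64-bit value
       that matches the received 32-bit seconds and is not older than that bound. */
    static uint64_t ntp_widen(uint32_t wire_seconds, uint64_t low_water_mark)
    {
        uint64_t era_base = low_water_mark & ~(uint64_t)0xffffffff; /* era containing the bound */
        uint64_t t = era_base | wire_seconds;
        if (t < low_water_mark)
            t += (uint64_t)1 << 32; /* the wire value already rolled into the next era */
        return t;
    }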
> The NTP protocol will synchronize correctly, regardless of era, as long as the system clock is set initially within 68 years (a half-era) of the correct time
Which means 2036 will appear to work just fine, as 1970 is only 66 years away. Now jump forward to 2038, and this would not be the case.
Chances are, based on how poorly leap seconds and TLS 1.3 (and, I predict, http3/udp) are handled in actual reality by deployed systems, that a variety of software and hardware will fail inside infrastructure worldwide when NTP ticks over to era 1.
I've implemented the 1588 protocol a couple of times, and I have less optimism.
This "era" thing seems like the exact type of thing that someone would ignore when implementing the protocol, or if it is supported, probably doesn't get tested much. It's like leap seconds. Yeah, the leap second info is conveyed in PTP, but most of the implementations I've seen simply ignore it and just jam the clock when the time jumps.
Infinitely many IIRC. Era isn't sent in the protocol, it's determined implicitly - if you know the time correctly to within a few dozen years you know the right era
A device made in 2021 could assume that the current time is at least 2021. Barring the invention of time travel and (a lot more risky) bugs assuming that NTP time stamps can be ordered by comparing them as 64-bit unsigned integers, that would make it OK for up to 2157 or so.
Working with that, conceptually, isn’t difficult. You can use any datetime as the epoch by subtracting the chosen epoch’s NTP timestamp from the one you received, with wraparound.
Making a datetime library use that epoch for formatting dates and times isn’t hard, either, but could be a lot of work.
Of course there are ways to make it work. I'm just assuming the vast majority of devs are not aware of this issue and mostly everyone's just solving their time problems by throwing NTP at it. Lots of things will break.
Absolutely. The only way we can be sure that this will be fixed by 2036 is by introducing a new NTP version that doesn't have this problem. Anything running the current version (which could be seen on the wire) would be considered broken.
NTPv4 introduces a 128-bit date format: 64 bits for the second and 64 bits for the fractional-second. The most-significant 32-bits of this format is the Era Number which resolves rollover ambiguity in most cases. According to Mills, "The 64-bit value for the fraction is enough to resolve the amount of time it takes a photon to pass an electron at the speed of light. The 64-bit second value is enough to provide unambiguous time representation until the universe goes dim."
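Laid out as a struct, that date format looks roughly like this (struct and field names are mine, purely for illustration; the layout follows RFC 5905):

    #include <stdint.h>

    struct ntp_date {
        int32_t  era;        /* signed era number: era 0 runs 1900..2036, era 1 the next 136 years */
        uint32_t era_offset; /* seconds since the start of the era */
        uint64_t fraction;   /* fractional second, ~5e-20 s resolution */
    };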
Luckily remediating this is relatively easy: Use some indicator for "certified 2036 proof" NTP clients that's visible on the network (a different port, an extension, ...), then observe the network to find legacy traffic.
Won't catch absolutely everything (e.g. local containers), but will probably get pretty close. Especially if modern NTP client versions have some way to resolve the ambiguity, e.g. a hardcoded "it's after 2020".
> A device made in 2021 could assume that the current time is at least 2021.
As long as the device knows it was made in 2021. But the device can't know that by magic; the manufacturer has to store that information in it somehow. Do all device manufacturers do that?
Probably a bunch of devices will just jump to 1970 tbh.
Though if you do it properly you add some fixed known min. time (like manufacturing time, or time of the last system software/OS update), and if the NTP time is noticeably below the min. time you increase the epoch by 1.
E.g. in your case the device could know its revision had a min. time of, idk, 2015. Then if it receives 1973 it knows it's 3 years into the next epoch.
But as anyone can guess there will be devices which didn't implement that at all and will end up in 1970. Or which require setting the time by hand and will end up in 1970 because they truncate to 32 bits at some point or similar.
Though some embedded devices might happen to avoid it. Like due to special reasons they might use a non-unix epoch.
RFC5905 uses 32 bits for era number and 32 bits for era offset in the 128 bit timestamp type. Though I don't think the 128 bit timestamp is used much...I can't find support for it in either ntpd or chrony.
They and everybody else should avoid using timestamp representations for any dates that far in the future, because it is a nightmare to translate timestamps when the timezone database is updated. Which you would have to do if, e.g., the EU finally manages to get rid of summer time and you care that dates past that change stay the same.
wait, why? unix timestamps are safe from this kind of mess as they are UTC seconds since the epoch, all the tz-aware conversion happens when you need to display localized dates but the timestamp you save should be as neutral and tz-unaware as possible. I've seen way more bugs from people assuming the system clock to be UTC while it was localized. Like missing and overwritten data on daylight saving changes.
You lose context. Say you have 2100-01-01T10:00:00 in PST. That's 4102509600, so you store it.
A few decades later, timezones are changed. PST should add 30 minutes, all others stay the same. You only have 4102509600; what should you convert it into? You lost the time zone information, so you can't make the correct decision.
Timezone changes can be quite abrupt too; this one was announced about 3 months before it took effect. Developers scramble to update software.
if you need context, e.g. location, you can still save it in another field and you will be able to correctly display localized time at any point given the universal timestamp and the current location. Or you can save properly localized and tz-aware time, I just find timestamps safer from pitfalls.
> if you need context, e.g. location, you can still save it in another field
No. Say I have a database with a bunch of timestamps for future events. I calculate the time of those events with my current idea of local time (because my customer or because legal requirements require me to do something at a specific local time), convert to GMT, and store as a timestamp in my database.
In 2024, California decides to no longer observe Daylight Savings Time. All of those current timestamps no longer happen at the proper time in the newer version of America/Los_Angeles.
So there's a big pitfall here, in that the relation between localized time and UTC timestamp changes.
Sure, you and others like you make pretty valid points. I believe we are talking about different use cases, hence the confusion. I was thinking of my most common use case of archiving events, usually experimental data, with an unambiguous time reference.
If you need to schedule future events it gets obviously a lot more complicated, you don't need a timestamp you need local time and you need it to be robust to future changes in timezones, DSTs, politics.
Basically, it's important to recognize what you're actually measuring/capturing and store something maximally equivalent to that. Conversions may render figuring things out later impossible or very difficult.
If you care about a past moment in time as reported by GPS or a computer clock-- use a UTC timestamp.
If you care about a future time delineated by a precise interval from now-- use a UTC timestamp.
If you care about a future time expected to be delineated in a specific time zone, use a timestamp in local time and note the zone.
If you care about a future time in a customer's time zone no matter where they may be, store the time in an unspecified local time and the customer for whom the time applies. Etc.
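As a rough sketch of the difference between the first two cases and the last two (field names and the split are mine, not any standard schema):

    #include <stdint.h>

    /* A past instant, or a future one defined as "exactly N seconds from now":
       safe to store as a plain UTC timestamp. */
    struct utc_instant {
        int64_t unix_seconds; /* seconds since 1970-01-01T00:00:00Z */
    };

    /* A future civil time: keep the wall-clock fields plus the zone (or customer),
       and only resolve to UTC at the last moment, with whatever tzdata is current then. */
    struct civil_future {
        int32_t year;                     /* e.g. 2100 */
        uint8_t month, day, hour, minute; /* local wall-clock values as given */
        char    zone[64];                 /* IANA zone, e.g. "America/New_York" */
    };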
You seem to be contradicting what you wrote in the sibling comment, no? Timezone information is influenced by political means, whereas localization (lat/long) information is not. UTC timestamp + localization seems more correct to me.
The semantics of "time X at point Y" are different from "time X in time zone Z". Sometimes the former may indeed be what you want, but it's uncommon for the user to be providing precise location information for a point in time far in the future. If you're told "time X, Eastern standard time", then the semantically correct thing to do is to not guess a point in space, but instead preserve the time zone as provided.
This line of thinking is how we end up with nonsensical UX like having to select "America/Los Angeles" to get US Pacific time. There are many use cases for datetimes and timezones, not all of them need a location and in some cases it's incorrect or misleading to include a location. You could certainly use a UTC timestamp and coordinates in your application but that doesn't work for many other use cases.
Your example is valid and the replies are ill-informed. There is a difference between a "date" and a "time interval". Timestamps are good for storing intervals, but if you need to plan something on a _date_, you need to store the date. This means you care not about a specific number of seconds having passed since now, but about what the calendar (and the clock) says when the event has to happen.
Example: nobody cares if a concert planned on November 19, 2024 at 19:00 in Honolulu starts in 12345600 seconds or 12349200 seconds since now. But everyone cares that everyone's calendars and watches are in sync by that time and show specifically that date and that time, regardless of how many times people switched DST or timezones in the years in-between.
Just store the time stamp and time zone separately and update the time stamp to reflect any changes to a time zone's offset (if that actually happens)...
It takes a bit of housekeeping, but isn't especially difficult.
> Just store the time stamp and time zone separately and update the time stamp to reflect any changes to a time zone's offset
Thus reinventing zoned datetimes badly
> if that actually happens
It happens literally all the time, sometimes with very little heads up, e.g. the decision to apply DST or not can be taken mere weeks before it actually occurs[0], and something as impactful as an entire calendar day disappearing can happen with 6 months notice[1].
Thanks for this - I've learned something new today. I did not realise timezone updates were so frequent.
I do find that having the timestamp at hand makes math easier in a lot of the use cases I've come across. I wonder if there is an easy hybrid solution where you can capture timezone info, timestamp info and have it adjust without a full DB update, even if timezone offsets change. Maybe a mapping table of some kind, with versioned timezone codes.
As far as I'm aware, you should only store the location next to the timestamp and let the IANA[1] do the maintenance of the tz database. Please avoid the housekeeping on your side.
It actually is especially difficult at scale for non-trivial applications. Large databases won't be able to atomically change every instance which may lead to very frustrating bugs where collisions cannot be allowed. For example, on a calendar app. I agree with dtech, if you're storing in the future don't use a timestamp unless the input value is specifically intended to be timezone unaware, which is typically only useful for short term things like control systems or alarms. For longterm human uses a datetime+timezone field or an ISO datetime+timezone string is safer.
For many purposes the correct specification of a future time is in terms of local time at a particular location on that date. This is true for financial contracts (including trillions of dollars in options markets -- 10am in New York means NY time) but also of work schedules (scheduled to arrive at 9am? That is on the workplace clock, not UTC), local transit schedules, and many other things.
That means when daylight savings time rules change or timezone lines are moved the UTC time of these events change.
Unfortunately I don't believe anyone has standardized a format for storing times which are specified local time at a specified location on a specified date. So it is all roll-your-own or use UTC or zoned datetime and be burned when things change.
The problem is that the contract normally doesn't specify epoch seconds. So if the translation of UTC since epoch to local time changes, the local time specified in the contract does not, so your UTC since epoch timestamp has to.
I don't think so. Imagine you have date saved somewhere at 10am 1 July 2031. If the EU abolishes DST in between, do you want the hour changed to 9am or keep it at 10am? I'd say both could be correct.
You convert it to the local time at that location at that point in history. That's something that belongs to time localization, timestamp should be universal and immutable against those changes.
So I'd say you are arguing against using timestamp representations for any dates far in the future since they can change depending on what happens with timezones and DST, but timestamps cannot.
My point is that either you save a properly localized tz-aware time in unambiguous standard format (iso 8601?) or you save a universal timestamp (and location if needed) and defer all the localization process to the moment you need to display localized time.
From my experience the latter is safer because people tend to save datetime strings in a format that's neither standard nor unambiguous.
I don't think you're answering the question here though. It's completely plausible there are situations where the constant is local time, not UTC (or TAI, if you want to get really obnoxious). Let's say I made an appointment to get my hair done in Berlin 10 years in advance, at 14:00 local time 29 Dec 2031, for whatever reason. Let's say someone goes crazy and moves Berlin to +1:30 this decade. I would still expect to show up at 14:00 local, but that's not the same universal time it was 10 years ago. How do you represent that in your DB where everything is in UTC?
Frankly, time doesn't work that way...
If it's currently CET and I make an appointment for sometime in summer 2022 (CEST), then I don't expect the hair salon to make the appointment in CET.
Honestly I'm not sure; in ten years the Berlin timezone might not exist anymore. In your example you need a special time representation that says "this time in local time, whatever local time will mean in the future". I'm not sure we have that in current date time representation standards, do we? You could still save tz-unaware local time and a location, and figure out in the future what local time means at that point of time and space.
I’ve worked on systems where data has a 100-126 year retention objective. We had the “fun” of dealing with this looking backward as far as 1968!
When we reimplemented the system from an ancient mainframe, we stored the value as a Unix timestamp and a standard representation of local time. It was derived from an old format going backwards and recorded in the new format going forward.
Having both is critical for our successors to figure out wtf is going on. UTC 1/1/70 is an anchor point, and the local representation is canonical at a point in time. I don’t recall how 1968-1970 was handled. Not only do you have to plan for “Berlin time” going away, but for daylight (or double daylight) times changing. Having both entries allows you to reconcile, so the poor bastard figuring out what to purge in 2090 has a fighting chance!
A string, ISO 8601, https://en.wikipedia.org/wiki/ISO_8601
Preferably RFC3339 (which is a specific profile of ISO 8601) but, as noted in a sibling comment, it isn't always appropriate for future dates.
Finance/insurance generally sticks to dates instead of timestamps for most of its data, especially since the effective date might be different from the timestamp date (e.g. a weekend or night transaction having the value date for interest calculations set to the next business day; but you also might have backdated transactions caused by various corrections). Timestamps and timezones are used for logs and auditing, but for settling money it's generally considered that the thing that matters is at the granularity of a whole date.
Of course there are things like HFT where precise time matters a lot, but there you don't schedule things years in advance, all the long-term things like mortgage schedules and insurance terms simply ignore time and timezones.
There are also applications which might reasonably have to process dates 16 years in the future right now. I suspect most of finance, for instance, is on 64-bit systems at this point, or we'd be hearing a little bit more about this.
Perhaps. But things like Log4j might help us: while in the previous century (the Y2K problem) making a potential problem largely go away was a massive undertaking, in 10 years such an endeavor may be a simple task.
There are a lot of people, even in this very thread, who will say "don't break backward compatibility!" right up until 03:14:07 UTC on 19 January 2038. At SOME point we'll have to let saner people prevail, and actually break backward compatibility. I'm just hoping it happens, you know, sooner rather than later.
I'm exactly of this opinion, it's better for us (programmers/admins/etc) to deal with some level of breakage now (gigantic as it may be), than for actual users to deal with breakage with their future dates.
Bingo. We have the option of recompiling with an older version of glibc while we work out the bugs. Our users won't have the option of running our code in 2037, as the rollover approaches.
> If _TIME_BITS is undefined, the bit size of time_t is architecture dependent. Currently it defaults to 64 bits on most architectures. Although it defaults to 32 bits on some traditional architectures (i686, ARM), this is planned to change and applications should not rely on this.
So it sounds like glibc plans to change the default, just like musl has done.
I think the best way of doing this is a 4 step approach:
1. Old situation, time_t is 32-bit
2. Migration starts, time_t is 32-bit by default, -D_TIME_BITS=64 to get 64-bits
3. Migration continues, time_t is 64-bit by default, -D_TIME_BITS=32 to get 32-bits
4. New situation, time_t is 64-bit
And with a long time between these steps. So glibc is at step 2 while musl skipped step 2 and is at step 3. It might make sense to stay at step 3 until 2037 or so for very restrictive embedded platforms etc.
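A quick way to check which step a given 32-bit toolchain is actually at (a sketch; assumes glibc, which also requires -D_FILE_OFFSET_BITS=64 whenever -D_TIME_BITS=64 is set):

    /* time_bits_check.c
       gcc -m32 time_bits_check.c                                        -> 4 bytes today on i686
       gcc -m32 -D_FILE_OFFSET_BITS=64 -D_TIME_BITS=64 time_bits_check.c -> 8 bytes on glibc >= 2.34 */
    #include <stdio.h>
    #include <time.h>

    int main(void)
    {
        printf("time_t is %zu bytes (%zu bits)\n", sizeof(time_t), 8 * sizeof(time_t));
        return 0;
    }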
It may be an idea to have a step between 2 and 3 with NO default, to force everyone who compiles their code against your header files to make their choice explicit.
That would be untenable - I wouldn't be able to cleanly `gcc -o blah blah.c` without making an explicit decision, and by extension wouldn't be able to (continue to) compile existing code either.
Rebuilds on every codebase everywhere ever would promptly blow up.
IMHO the revolts would come from two camps, a) bureaucracies for whom a build system change would normally take months, and b) individual devs who would be all like "compilation flags?? in MY defaults?! that's LESS likely than you think!".
Worst case scenario, someone forks glibc, removes the offending requirement, proffers commercial support for their fork... and ends up making bank.
Painful, yes, but isn't that a win? If it's breaking at build time, that means someone's actually /building/ their app. And the fix is easy enough (as long as they don't set it to 32 bit...).
The bigger issue is all the systems that won't be rebuilt (per several sibling comments).
There is no "no default", your distro will ship with one or the other. Very few people compile and ship glibc, upstream defaults only matter in the way they affect distro defaults.
The macros aren't set when building the libc, they are set when programs using the libc are built. So every program built on a distribution either uses the default or sets one of the macros.
> [H]ow does glibc maintain ABI compatibility? Alias attributes?
Symbol versioning[1,2]. Kind of like aliases, but more awkward. In theory it’s not specific to glibc, in practice hardly anyone else bothers being so careful about their ABI.
Note that dynamically linked musl, which Alpine uses (I think..?), doesn’t understand symbol versioning, though I expect this will only last until the first ABI break in musl.
I know that glibc uses symbol versioning to remain ABI-compatible within versions. What I meant was ABI compatibility between 32-bit and 64-bit time_t with the same glibc library. Looks like it uses something like aliases for that (but self-implemented using asm):
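In sketch form, the general mechanism looks something like this (names and version nodes are made up; this is not glibc's actual source):

    /* demo.c: export the same function under two ABI version nodes.
       Built roughly as: gcc -shared -fPIC demo.c -Wl,--version-script=demo.map
       where demo.map declares the (made-up) DEMO_1.0 and DEMO_2.0 nodes. */
    #include <stdint.h>
    #include <time.h>

    /* Old entry point, kept so binaries linked against DEMO_1.0 keep working. */
    int demo_time32(int32_t *out) { *out = (int32_t)time(NULL); return 0; }
    __asm__(".symver demo_time32, demo_time@DEMO_1.0");

    /* New entry point; the double @@ makes it the default for newly linked binaries. */
    int demo_time64(int64_t *out) { *out = (int64_t)time(NULL); return 0; }
    __asm__(".symver demo_time64, demo_time@@DEMO_2.0");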
musl has lasted almost one-third as long as glibc, with significantly less than one-third the number of ABI breaks. as far as I know, the only musl ABI break has been time64; despite glibc's supposedly superior version handling, musl has completed time64 support with headers only, whereas glibc is still in progress.
I think that would be a bad idea in this specific case.
Why? Because for the majority of codebases, the time_t change won't require any work.
You'd be forcing a compiler error, and it would effectively say "Add -D_TIME_BITS=64. If you're doing something really weird with time_t then you may have to update your code too, but in 95% of the cases, you can just add that flag."
I think the compiler error might be warranted for something like "You are using a function that is 95% of the time insecure or wrong, please add -DGAPING_SECURITY_HOLE if you know what you're doing", but if the error really is just "add a flag, you do not need to think about your code probably", the library authors themselves might as well default it for you.
On transition from step 2 to step 3 there will be a great mess: some libraries in the OS distro are built with sizeof(time_t)=4 assumption and other libraries + application are built with sizeof(time_t)=8 assumption. ABI will break at random boundaries, and not just functions: structures will have incompatible layout etc.
If step 2 is omitted, then similar great mess will happen on transition from step 1 to step 3.
We need a better plan, like modifying compilers (for i686 and other 32-bit targets) to emit '32-bit time_t code' and '64-bit time_t code' simultaneously, and then resolve to proper functions later, at the link time.
Why would you -D_TIME_BITS to something other than the default? It's a hard thing to do safely, since you're choosing to be ABI incompatible (in a way that's not caught at compile or even invocation time) with the system's default.
And then, all you get is a binary that will potentially handle times past 2038 safely, when the whole rest of the system won't.
The thing is, not too much can be expected to break from a 64 bit time_t. Only stuff that directly shoves time_t's over the network or to disk -- a bad practice -- will.
Note anything that builds on x86-64 or other 64 bit architectures has already figured out how to tolerate 64 bit time_t's.
The major reason it's not safe to build with a 64 bit time_t is because you don't know whether other libraries are. If the C library cuts over, when people/distributions move to the new library version, they know that's not a concern.
Defaults change in tooling all the time, requiring code changes for distributions bundling newer tooling/libraries. This one would require less change than most.
Of course, it's possible to survive a 64 bit time_t and still not be 2038-safe. But at least you can be correct if you have a C library and other libraries that will tolerate 64 bit time_t's.
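For the code that does persist or transmit times, the usual fix is to serialize an explicit fixed-width integer rather than a raw time_t (a sketch; the helpers are mine):

    #include <stdint.h>
    #include <time.h>

    /* Write the time as a little-endian int64 so the on-disk/on-wire format is the
       same whether the producing system's time_t is 32 or 64 bits. */
    static void put_time_le64(uint8_t out[8], time_t t)
    {
        uint64_t wide = (uint64_t)(int64_t)t;
        for (int i = 0; i < 8; i++)
            out[i] = (uint8_t)(wide >> (8 * i));
    }

    static int64_t get_time_le64(const uint8_t in[8])
    {
        uint64_t wide = 0;
        for (int i = 0; i < 8; i++)
            wide |= (uint64_t)in[i] << (8 * i);
        return (int64_t)wide; /* caller decides whether it still fits its time_t */
    }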
So the fix is to list all libraries in popular distros that break with time_t being 64 bits, file issues for all of them, and track progress somehow. Messing with the defaults is only useful if the ensuing breakage is promptly fixed.
> So the fix is to list all libraries in popular distros that break with time_t being 64 bits
All libraries in popular distros build on x86-64, which in turn means they use a 64 bit time_t there.
It's time for 32 bit architectures to join 64 bit architectures in having a 64 bit time_t.
Anything that ships today in distributions will ship in embedded systems for 5+ years. And then a lot of those embedded systems (too many) will last a long time. 2026 is getting pretty uncomfortably close to 2038.
In practice, all the code in distributions has built in environments with 64 bit time_t. Some end-user legacy code may not be, but it probably isn't much: most things just don't care about a field getting wider behind the scenes.
The big pain, IMO, at this point is the big cutover where all libraries need to move to the new ABI.
This still doesn't prove things as 2038-safe, but it at least means things reasonably can be 2038-safe on 32 bit if they choose... while today it's impossible for practical purposes.
Unfortunately you probably daily rely on multiple 32-bit linux machines, most of the time not realizing you do. I agree this should be mentioned in the title though.
I was feeling the same, about the only thing I have using 32bit userspace these days are a few rPIs, little bit early to be panicking they may suffer from the Y2038 bug, most companies waited to about 6 months before the Y2K bug was going to hit before they started caring.
> I was feeling the same, about the only thing I have using 32bit userspace these days are a few rPIs, little bit early to be panicking they may suffer from the Y2038 bug
Quite the opposite if you're talking about rPIs: the small/embedded space is where things can live for a long time. They're the primary place to worrying about. Especially if you're talking about industrial processes.
I think glibc is right to not change the defaults out from under the feet of users. "This is the worst possible way to do this. Consider libraries: what if a dependency is built with -D_TIME_BITS=64, and another dependency is built without, and they need to exchange struct timespec or similar with each other?" - this argument makes no sense to me. If you need new features and new values, update your flags rather than breaking any program that depends on your CLI API.
> If you need new features and new values update your flags rather than breaking any program that depends on your CLI API
The thing is, 2038 gets closer with each day that goes by. Especially since real programs often have to use future time values (and often further in the future than one would think).
Stuff has to change at some point. A glibc major version which changes the structures and typedefs for all 32 bit code could handle it.
But as it stands now, someone who wants to make a 32 bit program y2038-compliant will have a hard time doing it: they can't safely hand time values to any library that is possibly compiled differently.
This punts it to whom, distro maintainers? They have to try and rebuild everything with the correct define? And the lone end-user who does gcc -o myfile myfile.c gets code that's broken against the system libraries? Or they patch glibc itself and ship a glibc that's ABI incompatible with everyone else. Meh.
It is only changed from under their feet if it is done without communication. Most important is to communicate a plan. Tell the users that from date X, all new releases will have the default changed.
> Sure. And the ABI needs to break sometime before 2038.
That really highlights the benefit of the BSD model, where the OS, libc and all the base packages are shipped as a complete system. OpenBSD switched to 64 bit time_t on all supported platforms back in 2014. We all talked about Y2038 back then, and most agreed that something should be done as quickly as possible. Then the world just sort of forgot. I mean it is solved, but if you're a new developer, you might not know that you need to do something special.
The GNU and Linux world is really good at not breaking ABI compatibility and mostly that's great, but in this case it's a problem, and pushing it much further is going to be a problem. It's also going to be weird if the default never changes; then we just have a C library that just keeps producing defective programs unless you add certain flags.
There are a large number of 32bit devices still being designed and built. Not for servers, desktops or laptops, but embedded devices, controllers and so on. These are actually the worst devices to have the issue in, because they will last longer than 2038. If you ship a 32bit embedded device today, there's a reasonable chance that that device will be in service 20 years from now.
Isn't glibc, specifically, very careful about ABI compatibility?
You can still run Linux programs linked against decades-old versions of glibc on current systems as they have kept ABI compatibility with the use of symbol versioning, without changing their SONAME ("libc.so.6").
Sure, that does not extend to being able to run programs linked against new glibc symbols on old systems, but if you consider that ABI-breaking then the Linux kernel would also be ABI-breaking, as you cannot run binaries relying on new kernel syscalls (etc.) on older Linux kernels.
The problem doesn’t start in 16 years, it’s happening now. Try to do a date/time calculation past the overflow today. This can cause major issues for insurance companies right now.
> Consider libraries: what if a dependency is built with -D_TIME_BITS=64, and another dependency is built without, and they need to exchange struct timespec or similar with each other?
What happens in the reverse case on Alpine (or anything using the other approach)? If I build a new program but link against a dependency that predates the switch (say, I upgraded my workstation to a new Alpine release but didn't `make clean`), will I get the same breakage?
> If I build a new program but link against a dependency that predates the switch (say, I upgraded my workstation to a new Alpine release but didn't `make clean`), will I get the same breakage?
If you jump between incompatible versions of a C library without rebuilding, you can expect nothing to work.
> If you jump between incompatible versions of a C library without rebuilding, you can expect nothing to work.
Is there a reason musl introduced this in a non-major version? Your point here is perfectly valid, but to me the next obvious question is "what qualifies as an incompatible version?", and after doing some Googling I'm surprised it's not very clear to me. I would have assumed that going from 1.1.X to 1.2.0 was compatible, i.e. two programs compiled against 1.1.0 and 1.2.0 will work fine together, but clearly that's not actually the case.
musl created new symbols for all time_t related functions. This means that an 32-bit time_t application can link against any version, because the 32-bit time_t symbols did not change. The problem is when code passes around time_t to non-musl code - you can't mix 32-bit and 64-bit time_t then, because those are different types.
Right, I understand that, but this just gets back to my question of "what qualifies as an incompatible version?". Clearly 1.1.X and 1.2.0 are compatible in the sense that they still expose the same ABI for 32-bit time_t, and thus programs compiled against 1.1.X will still work with 1.2.0. But as the musl devs identified, time_t is commonly used in lots of other places besides libc, so in practice this change requires recompiling all code against 1.2.0 to ensure it is using the right size of time_t, else you might have programs attempt to communicate with each other using different sizes.
My question is then is this an "incompatible" release or not? From the version number alone I would have assumed 1.2.0 doesn't require a full recompile of everything, but the commenter I responded to suggested it is 'incompatible' and I don't understand why that is the case with feature releases. Do you need to recompile the world for every musl update and ensure everything is compiled against exactly the same version? Or just the feature version has to match?
The tricky thing with library versioning that you get into is when types change.
Let's say my app uses libc and libfoo, both from the underlying operating system / distribution.
libc has a 32 bit time_t, and libfoo on my system also has a 32 bit time_t. I upgrade to a new operating system where both libc and libfoo have 64 bit time_t. libc was kind enough to include symbol versioning which makes time(NULL) return a 32 bit number to my binary.
Now, libfoo either has to be patched to handle symbol versioning and have a new version inside so that my app can get the old 32 bit time_t, or it will have a subtly broken ABI.
The thing is-- who does this? Libfoo doesn't know when each distro will make the change: it won't correspond to a specific upstream version. Does each distro have to recognize the issue and introduce symbol versions? Etc.
Right, I get that point, I'm simply asking why musl introduced this change in 1.2.0 - why wasn't this 2.0.0? The fact that the 1.1.0 and 1.2.0 types are incompatible in a significant way seems counter-intuitive to me, and even after looking into it I don't see much in the way of describing how you're supposed to handle musl upgrades. Is the expectation that you recompile everything for every new feature release of musl? I couldn't find that spelled out anywhere but it seems like that is the case, or else stuff will be broken via changes like this one.
I don't know, building reliable software is a lot of boring work. Most software the industry develops is not going to space or safety critical, it's okay if they have bugs like this one.
If I were to write this post, I would take the time to: Check _why_ glibc maintainers have made this decision, and ask them why it is reasonable for this to be the case right now.
And they dropped the approach because few to none were going to rewrite existing code to add support for non-standard extensions to fix issues 40 years out. Instead they moved to 64b time_t in 10.7, I think?
And it’s not a new call, dozens of functions touch time_t, plus a few syscalls.
64b code does not necessarily imply 64b time_t though. If your time_t is an alias for int, it's 32b on anything short of (S)ILP64, which is anything but common. I'm not sure there's been any other than Cray's UNICOS and HAL's Solaris port.
This is exactly what glibc did. If you want you can use the 64-bit types and symbols directly. The _TIME_BITS=64 macro redefines the standard C & POSIX time things to point to the 64-bit variants, so you can just recompile code to use them.
For open source software, it’s a simple recompile. Most OSS compiles are 64-bit these days, where time_t has always been 64-bit. In the case of compiling a new 32-bit application, -D_TIME_BITS=64 apparently needs to be a compile time flag.
For binary software, Windows has had a Y2038 compliant proprietary API since Windows NT in the 1990s; most Windows applications use this API so Y2038 is generally not an issue.
The issue only affects a subset of 32-bit binary-only applications where we don’t have the source code any more. Hopefully, any and all applications like that will be decommissioned within the next couple of years.
I think you misread the root comment, it suggests a new function call that no one will use. Apple made that mistake, then they just switched the size and dealt with the fallout.
Looks like I lost the context. In terms of the context:
The issue is, besides having to rewrite code, it’s not just one function. It’s time_64(), but now we need gmtime_64(), strftime_64(), stat_64(), and so on for any and all functions which use timestamps.
The thinking in Linux land is that we won’t have 32-bit applications come 2038 where this matters, because everything will be 64-bit by then.
What’s the alternative to rewriting? Just recompiling?
Assuming the code on the receiving end stores the value in 32 bits (and not a time_t, which can magically change meaning, but a 32 bit integer) then it’s still doomed without a rewrite?
I mean even with time_t use you can’t just recompile and make it work because there could be subsequent assumptions that the size of a struct containing a time_t will be a specific size or allocations will be too small or misaligned and so on.
But the problem isn't really changing the OS to return a 64bit value where it used to return 32, the problem is all the applications that assumed it would be 32.
By "pulling it off" you mean, they changed it and planes didn't crash around them?
That still seems like software under the “control” of the OS maintainers. But most apps running on an OS are never seen by the people maintaining the OS.
That’s why it’s such a tricky move to break compat with those apps, because you can’t know what they are doing and how.
Lots of code would break because they assume they can do signed math with time_t, though. It's a less invasive change to make it wider: largely, it's just a recompile, except for code that persists a time_t directly or sends it directly over the network (and both of these should be considered harmful for other reasons).
The other problem is that more unsafe code in libraries, etc, will happily cooperate with 2038-safe unsigned time_t code, but will start to do bad things shortly before 2038.
A) will be fine until 2038. I assume we have a lot of things that mash time into an int or long. But such code is no worse off by virtue of the C library types being fixed.
B) -- manual calculation of size of structs instead of sizeof() -- yah, maybe it'll happen. I don't see much code this bad. If it's ever compiled on a different word length it's already been fixed.
C) Perhaps. For the most part alignment improves when you have a wider time_t, but you could have people counting the number of 32 bit fields and then needing 16 byte alignment for SSE later. Again, for the most part this penalty has been paid by code compiling on amd64, etc.
Honestly, for network code it makes more sense when the data is received to add the time to the current era. A 64 bit timestamp is an extra 4 bytes of overhead. However, the biggest issue with a network protocol is you just can't force everyone to update everything.
Now if it's some in-house protocol, sure, you can just update everything instead of making the timestamp a value relative to the current era.
For code running locally 64 bits is less of an issue, just mainly a problem of ABI breaks.
Honestly, one thing I think people overlook is file formats... A lot of them have 32 bit time fields. Also unlike a network packet, the file could actually be from a previous 32 bit era. So those timestamps are ambiguous after 2038.
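A reader of such a format can at least apply the same low-water-mark trick as with NTP above: assume the file cannot predate some known date (say, the format's release) and pick the 32-bit era that matches (a sketch, nothing standard about it):

    #include <stdint.h>

    /* Widen a stored unsigned 32-bit Unix timestamp, given a bound the file
       cannot possibly be older than (e.g. when the format was introduced). */
    static int64_t widen_file_time(uint32_t stored, int64_t not_before)
    {
        int64_t t = (not_before & ~(int64_t)0xffffffff) | stored;
        if (t < not_before)
            t += (int64_t)1 << 32;
        return t;
    }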
Half measures just create even more headaches down the road. Migrating to 64 bit time_t basically solves the problem once and for all. If you're going to make a change, make it the last change you'll ever need.
I'm also in favour of adopting IPv6 ASAP, but so far that has been a much harder sell.
Optimistically assume we as a species manage to survive to the point where distance, or relativistic speed differences, cause sufficiently frequent change in the observation of time passing that a single number, of any size, is no longer sufficient.
time_t is sufficient within bounds. It is expedient and quite correct in many computer science use cases. It can be extended with small additions for many other use cases.
However those bounds are a set of assumptions and simplifications that shouldn't be forgotten. I agree that the problem would be solved until the next paradigm shift in our understanding of time and the universe, and maybe forever if it turns out that the rules are cruel or we're too stupid to reach a more complex situation. I just wouldn't say once and for all, there's far too much uncertainty there.
Ruler of great size, less useful when measured aspects are a pile of disconnected threads rather than a canvas that is mostly shared and mostly distorted the same way.
Distance won't be a problem. 2^64 seconds takes us past the point where the expansion of the universe is such that anything you are not gravitationally bound to is outside your cosmological horizon.
You'll be in a much bigger universe, but it will be empty except for your local galaxy group.
The tradeoff there is that you would be unable to use time_t to express times before 1 Jan 1970 (iinm). That may or may not be important depending on use case.
Yes. We're talking about growing time_t from int32_t to int64_t, instead of uint32_t. If you change it to uint32_t behind the scenes, some code will silently fail while compiling OK, because it was not expecting unsigned math.
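A toy example of the kind of code that keeps compiling but silently changes behaviour if time_t quietly becomes unsigned (illustrative only, not from any real project):

    #include <inttypes.h>
    #include <stdio.h>

    int main(void)
    {
        /* Pretend these came from time(): the deadline is 1000 seconds in the future. */
        int32_t  now_s = 1000, deadline_s = 2000; /* signed, like today's 32-bit time_t */
        uint32_t now_u = 1000, deadline_u = 2000; /* what a silent switch to unsigned gives */

        int32_t  ds = now_s - deadline_s; /* -1000: "not expired", as intended */
        uint32_t du = now_u - deadline_u; /* wraps to 4294966296: looks long expired */

        printf("signed:   now - deadline = %" PRId32 "\n", ds);
        printf("unsigned: now - deadline = %" PRIu32 "\n", du);
        return 0;
    }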
> This can be observed with the large file support extension, which is needed to handle files larger than 2GiB, you must always build your code with -D_FILE_OFFSET_BITS=64. And, similarly, if you’re on a 32-bit system, and you do not build your app with -D_TIME_BITS=64, it will not be built using an ABI that is Y2038-compliant.
_TIME_BITS=64 is not working for me on an Ubuntu 18 system based on glibc 2.27 (three plus years old), and I see nothing in the header files that switches time_t.
This must be something new?
_FILE_OFFSET_BITS is old, on the other hand.
Edit: I see in the git log that this has 2021 all over it:
Anyway, it's too early to see if this is "bad". The unknown quantity is the behavior of distro people. Distro people have the ability to override this default, such that all the packages have 64 bit time_t, and the toolchain is configured to build that. I have some faith in distro people.
From my experience, the libc is not even the most scary step… It's the first, necessary step, but not sufficient. Converting a code base that has serialized data structures with 32 bits timestamps that are not even time_t (which is a good choice since you want serialization stability), now that makes for complex upgrade paths.
I know this is a joke but it highlights an issue with the 2038 bug - people think it'll be a problem in 2038. It's very badly named. It should be called something like "the future date problem", because it affects any date that gets converted to a timestamp that represents a time after 19 Jan 2038. I wrote code for calculating the payments on 40 year mortgages back in the early 2000s and I had to consider this problem then.
If you're on a GNU/Linux system that is not actually relevant unless you've also opted into glibc's 64b time_t, and recompiled everything locally: by default, glibc uses 32b time_t even on 64b machines.
> Currently it defaults to 64 bits on most architectures. Although it defaults to 32 bits on some traditional architectures (i686, ARM), this is planned to change and applications should not rely on this.
I test for these things when building open source software.
• 64-bit compiles and applications have a 64-bit time_t in Linux. If your distro and binaries are 64-bit, there isn’t a problem.
• 32-bit compiles and applications still have a 32-bit time_t in mainstream Linux libraries, so that old binaries still run. The timestamp is a rolling one; once Y2038 is hit, the timestamp will be a negative one, but one which is updated and is off by 136 years. The workaround is code like this (this is real production code in my Lua fork, Lunacy[1]):
    time_t t;
    int64_t tt;
    if (lua_isnoneornil(L, 1)) /* called without args? */
      t = time(NULL); /* get current time */
    else {
      // Lunacy only supports getting the current time
      lua_pushnil(L);
      return 1;
    }
    if (t < -1) {
      /* 32-bit time_t has wrapped negative after Y2038: shift into the next era */
      tt = (int64_t)t + 4294967296ULL;
    } else {
      tt = (int64_t)t;
    }
    if (t == (time_t)(-1))
      lua_pushnil(L);
    else
      lua_pushnumber(L, (lua_Number)tt);
    return 1;
This gives us accurate timestamps until 2106. If post 2106 compatibility is needed, we can add 2 ^ 32 again for timestamps with a positive value once the Y2038 rollover happens.
• Legacy 32-bit Windows applications using the Posix compatibility layer are not Y2038 compliant. Once Y2038 hits us, the timestamp will always return -1 (in Windows XP, the behavior was to return a number off by 136 years, but this changed in Windows 10). The workaround is to use Windows’ proprietary API for post-Y2038 timestamps. Again, from Lunacy which has a Win32 port:
    /* Convert Windows "filetime" into a Lua number */
    uint64_t t;
    FILETIME win_time = { 0, 0 };
    GetSystemTimeAsFileTime(&win_time);
    t = win_time.dwHighDateTime & 0xffffffff;
    t <<= 32;
    t |= (win_time.dwLowDateTime & 0xffffffff);
    t /= 10000000;      /* 100-nanosecond ticks -> seconds */
    t -= 11644473600LL; /* shift epoch: 1601-01-01 (FILETIME) -> 1970-01-01 (Unix) */
    lua_pushnumber(L, (lua_Number)t);
    return 1;
Now, last time I researched this, there wasn’t an “int64_t time_64bit()” style system call in the Linux API so that newly compiled 32-bit binaries could be Y2038 compliant without breaking the ABI, by using “time_64bit()” instead of “time()”. This was based on some digging around Stack Overflow just last year, and simple Google searches are still not returning pages saying, in big bold letters, “-D_TIME_BITS=64 when building 32-bit apps”.
[3] To ruin a classic sadistic interview question for sys admin roles, Linux these days returns both the modification time and the mostly useless “status change” timestamp. Facebook once decided to not move forward because I said that file timestamp was “modification time” and not “status change”; if Facebook is still asking that question, their knowledge is out of date.
But then Y2K was presented as the end of the world, with supposedly nothing being ready, and the world at large was going to experience catastrophic supply chain failures, people would be dying in hospitals, etc.
Then Y2K turned out to be a nothing burger (I'm not saying some systems didn't misbehave left and right but in the grand scheme of things, it was a non event).
I'm pretty sure it's going to be the same with Y2038.
Y2K was not a disaster precisely because there was panic and concern, and programmers spent much effort fixing it, and it was fixed due to prioritization of resources & effort.
"it's going to the same with Y2038" is like saying we can fold our umbrella in a rainstorm because we haven't gotten wet yet.
When Y2k happened, it also seemed a little ridiculous to laypeople because they thought, "Surely we don't depend on computers that much," and there was still an institutional memory of doing important business on paper. Nobody is under that illusion now.
The solution here is to simply not use dynamic linking for anything outside glibc and openssl. If your program is statically linked, or if each program ships all the dynamic libraries it links against, this compatibility problem disappears.
I'm assuming that you're also building all your dependent libraries as part of your build process, as is common with languages like Rust. Closed source libraries are somewhat out of scope when talking about Linux distributions, though you can always ask the vendor to provide you a 64-bit compatible library.
Then it's not a distinction between dynamic linking and static linking. It's a distinction of "only link against things you yourself build and ship".
It's nothing to do with closed or open source. I don't want to build a very big world, and ship a huge amount of binary code that can be found in distributions, to make a 2038-safe app.
> though you can always ask the vendor to provide you a 64-bit compatible library
This is not always true, as the vendor may have disappeared and can no longer provide new versions of the library for any reason. Though, if you knowingly have such a black box in your code and willingly keep running with it until 2038 then you kinda deserve what is coming.
But then you have to audit every single application and its libraries for latent Y2038 problems. It doesn't really save you anything except making it more difficult to track down problem applications. They'll continue pretending nothing is wrong until the drop dead date.
Even if upgraded, it is so flawed that it is practically universally replaced anyway.
There are two severe issues which must be addressed for it to be saved. First, the type should be float, not int. The former has an intuitive precision, while the latter does not. Not to mention the sign issues. It is very rare to see a timestamp int which is not used as a kind of fixed point float somewhere in practice, but doing so manually gives rise to endless bugs. A float, on the other hand, can always be in seconds, which is much more intuitive, and means the fix in downstream apps if precision changes is much easier to get right. For these reasons, upgrading to float64 would be substantially better than upgrading to int64, while being better for most use cases, almost always as fast, and requiring less thought downstream.
However, I think we can do even better. Int64 is also too small, as the common nanosecond clocks would only give a century of range, which requires adjusting pretty much every algorithm, even if it only ever deals with a few times. And of course, if I don't care about precision, I want the floating point to deal with it for me.
It almost always makes sense to use a time precision exceeding the most common clock with which it will be used, and with a range covering the most common use cases. Most regular CPUs give nanosecond precision, but then float64 would only give a few years of range exactly, which is a bit too short for many applications which want exact precision and use the standard 1970 start. The long awaited float128, however, would be sufficient for billions of years at pico precision, while maintaining low memory and compute costs. Notably cheaper than int32 was originally. float128 simply makes for a fantastic timestamp format for almost every application. It's extremely convenient compared to the clunky std::chrono, works in C and interop, and is very simple to use.
The only downside is that compiler support, while getting better, remains bad.
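For what it's worth, this is roughly what using it looks like today with GCC's __float128 and libquadmath (a sketch; needs -lquadmath and a target that provides __float128 at all, which is exactly the weak point):

    #include <quadmath.h>
    #include <stdio.h>

    int main(void)
    {
        /* Seconds since 1970 with sub-nanosecond detail: ~33 significant decimal
           digits, so nanoseconds survive comfortably at today's magnitudes. */
        __float128 t = 1.7e9Q + 123.456789012e-9Q;

        char buf[64];
        quadmath_snprintf(buf, sizeof buf, "%.15Qf", t);
        printf("timestamp = %s s\n", buf);
        return 0;
    }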
Timestamp ints constantly get cast back and forth between seconds, microseconds, and nanoseconds. It's an absolute mess everywhere. These are fixed point floats.
Everything sensor related, countless logging and file uses, and anywhere wait_for or timeouts are used are typical use cases, and indirectly so is anywhere people need to do simple things like basic arithmetic on timestamps.
The code will either contain some precision change or an unfixed bug related to overflow of one or more kinds.
So int64 is half, and I was off by a factor of three doing it in my head... I was wrong, no doubt about that. But the order of magnitude is the problem: it should be millennia, or millions of years, not a few centuries.
Wonderful, now we'll have NaN issues, and precision issues where 0.3 can't be truly represented. Did I mention floating point trap, and hard/soft float ABI compatibility? Linux apps like calling gettimeofday() frequently, and now it will be slower, esp. when the processor doesn't do floating point natively.
float128 is even worse - it doesn't exist on some ABIs or exist in some other ways, e.g. old Power/PowerPCs used an IBM format. Do we limit them to their own ABI (extra burden on distros), or use double in which case apps expecting 128bits will have quite a surprise.
How are you going to represent 0.3 using a non fixed point integer? And if you do use fixed point, you suddenly have other just as common numbers you can't represent. You really think NaN is a less intuitive result than what int64(1)/int64(0) gives? You really think the latter won't result in a bug as well?
gettimeofday is deprecated as of 2008, so even using it is a bug. Using it more than a few times per second is almost certainly a bug. The function also uses an ambiguous representation, and as the docs don't specify, I would have to check the code to see if it relies on microseconds being less than 10^6, and even if I check this for standard glibc, I would not trust other implementations to do the same. However, as we can assume that the pseudo integer type used does not need to have multiple encodings for basic numbers, it's easy to see that the range is just under 52 bits. gettimeofday is so old that they didn't even expect that emulated int64 was available.
That is, if they used a double instead, it would perfectly cover the same range, require much less code, while simultaneously being easier to get right, support all standard operations you would expect, behave more intuitively in many cases, and be faster on most platforms, though as you say, perhaps slightly slower on soft float platforms. Though I wouldn't be so sure: almost every project notices float performance, almost no project notices gettimeofday performance.
The ABI would break by changing the type regardless, numeric types should never be used without explicitly specifying precision, and no, the burden is primarily on compilers. With float128 as a core language feature, as it should be (noting that while it should be IEEE... compliant, it may be emulated), the burden on the distros would be small, and likely net beneficial, as timestamps are one of those things that cause very rare and hard to replicate bugs absolutely everywhere.
>How are you going to represent 0.3 using a non fixed point integer?
I won't try to. The alternative to your proposal is fixed point integer.
> if you do use fixed point, you suddenly have other just as common numbers you can't represent. You really think NaN is a less intuitive result than what int64(1)/int64(0) gives?
I think it's less of a problem than float. Most programmers (myself included) don't understand all the nuances of floating point.
>gettimeofday is deprecated as of 2008
The reason I mentioned it is because I recall it turned out to be a problem when porting/emulating Linux apps under Windows, since the equivalent Windows call used to be way more expensive. Apparently it was a significant problem since the call was in wide use. I don't recall any widespread effort to remove gettimeofday (or equivalents), so I suspect it's still is in use?
>With float128 as a core language feature as it should be
Unfortunately it isn't a core language feature - e.g. it can have 80bit precision on some x86s/software combinations. Now, that's worth fixing regardless.
It's not standards compliant for C++17 to use the 80bit precision version, if I remember correctly. So it is in the language; the language is just badly supported.
In practice, floating point timestamps have a constant precision throughout their life. That is, the exponent has the same value for a really long time, and you don't get any benefit from the "floating" aspect.
As another poster points out, overflow/NaN/infinity is the one nice thing you get. But you pick up all the idiosyncrasies of floats, too.
As you point out, float64 is worse than int64_t at attaining a given level of precision for a reasonable amount of time.
You do point out that floats "seamlessly upgrade" to increased precision, and this is kind of nice. But if we're picking a 128 bit type, there should be no upgrading needed ever.
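To put a number on the float64 part of that: the spacing between adjacent doubles near today's Unix time is already about a quarter of a microsecond, and it only gets coarser as the values grow (a quick check, link with -lm):

    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        double now = 1.7e9;                           /* roughly late 2023 in Unix seconds */
        double step = nextafter(now, INFINITY) - now; /* gap to the next representable double */
        printf("resolution near %.0f s: %g s (~%.0f ns)\n", now, step, step * 1e9);
        return 0;
    }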
Because in every situation I have ever run into, a floating point timestamp behaves more intuitively, and when I started converting my programs to use floating point timestamps, almost every function that used timestamps lost dozens of hard to understand special cases that accounted for int overflow, unsignedness, etc. Problems every single one of which had bitten me in the ass at least once before I wrote the code managing the special case, and of which nearly every single one had at least one bug in it. See the revisions to std::midpoint for great examples of how hard this is to do right. However, because float128 overrepresents all the older basic type variants, it becomes trivial. If the code could possibly have been made to work for any of them, it will automatically always work for float128.
It's got more to do with how intuitively the timestamp behaves, and that you can use seconds as the unit everywhere and no longer need to keep track of whether this wait function took seconds or milliseconds. Using infinity to indicate that a timeout function should block indefinitely is also much more intuitive than using 0.
Nothing wrong with fixed point. It's only a problem because people end up hand-rolling their arithmetic (and botching it) instead of using library functions that do it right.
Fixed point is great, but floating point behaves more like how people appear to expect timestamps to work. That said, I, like so many others, have written my own damned fixed point function for various reasons, and like everyone else I got it wrong several times on the way. If only there was a simple template class that could take care of it for you, but well, such a simple template class would really just be a basic numeric type anyway, so why not include it in the specification, ensuring that a high performance, extremely well tested variant was available everywhere from the start.
Imagine how many millions of hours of coder time could have been saved if the C specification included fixedpoint<20,12> (or however you wanna write it), and thus had created a single standard implementation for all of them. Having not just basic arithmetic, but decent sin, cos, sqrt, atan2 etc., all part of cmath. The difference now would be minor with soft float being ubiquitous, but back when it was faster... Damn.
Fixed point could work, and they are certainly better than ints, but floating point behaves more like people expect timestamps to work, and the specific fixed-point I need never seems to be available.