Tuesday, August 3, 2021

Leap Seconds: Causing Bugs Even When They Don't Happen

Leap seconds are controversial things. Since the Earth does not rotate at a steady rate, over time the Earth could get ahead of or behind “atomic time”. Whatever solution you propose for this, someone is going to be unhappy.

I take no position on what the best thing to do is here, except that one day I would like to do the math on the “great leap second gyroscopes” that we could mount near the poles to steady the Earth’s rotation, so we can stop talking about this. We may occasionally have to desaturate these gyroscopes with huge rockets also.

Anyhow, some new minor leap second drama is coming up, and for once we can’t blame astronomers, geologists or the International Earth Rotation and Reference Systems Service (IERS). Imagine if they ever went on strike, by the way!

Many thanks to Russ Garrett for proofreading, constructive remarks & additional material on negative leap seconds!

Leap seconds & GPS/Galileo/BeiDou

Navigation satellites are fundamentally very precise clocks. We use these clocks to determine our position, and incidentally they also tell us the time very accurately.

Because the time signal they transmit is used for calculations, these three navigation systems broadcast a continuous monotonic clock signal that is not influenced by leap seconds. Leap seconds are for humans to deal with.

However, to be helpful, the Chinese, US and European navigation satellites do transmit when leap seconds happen, and what the offset is between “their” time and UTC. For GPS and Galileo this offset is currently 18 whole seconds. For BeiDou it is 4.

Using this offset, you can use the navigation timescale to very accurately get the current UTC time.

Details

When you receive Global Navigation Satellite System (GNSS) signals, you might think your device is constantly decoding the messages it receives. This turns out to mostly not be true - under real life conditions, many signals arrive garbled. However, if sufficient data has been received previously, the precise timing of the arrival of these signals is sufficient to accurately determine your position.

But from this, you can see that leap second changes can’t just be transmitted when they happen. The receiver might very well only be receiving garbled messages at that point.

Instead, leap seconds are pre-announced months in advance. In this way a receiver is guaranteed to have picked up when a leap second is going to happen, and can apply the new offset to UTC at precisely the right time.

For Galileo (spec), GPS (spec) and BeiDou (spec), the leap second message consists of the following parts:

  • \( WN_{LSF}\): Truncated navigation week number of leap second in the future
  • \( DN\): Day number, where day 1 is Sunday (for GPS and Galileo). For BeiDou 0 is Sunday.
  • \( \Delta_{LS}\): Offset between navigation time and UTC in seconds BEFORE leap second
  • \( \Delta_{LSF}\): Offset between navigation time and UTC in seconds AFTER leap second

These three navigation systems all broadcast time by telling you their week number, plus the number of seconds that have passed within that week. By convention, leap seconds always happen at the end of the day.
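As a rough sketch of how the week-plus-seconds representation and the UTC offset fit together, here is a conversion for GPS. The function name `gps_to_utc` and the default offset of 18 seconds are illustrative choices, not anything a receiver vendor ships; the GPS epoch of 6 January 1980 is the standard one.

```python
from datetime import datetime, timedelta, timezone

# Week 0, second 0 of the GPS timescale, written as a UTC timestamp.
GPS_EPOCH = datetime(1980, 1, 6, tzinfo=timezone.utc)


def gps_to_utc(week, seconds_of_week, leap_offset=18):
    """Convert a full GPS week number and seconds-of-week to UTC.

    leap_offset is the broadcast GPS-minus-UTC offset in whole seconds.
    A single fixed offset is only correct for instants after the most
    recent leap second; older timestamps need the offset valid back then.
    """
    gps_time = GPS_EPOCH + timedelta(weeks=week, seconds=seconds_of_week)
    return gps_time - timedelta(seconds=leap_offset)


print(gps_to_utc(2185, 0))  # 2021-11-20 23:59:42+00:00
```

Python's `datetime` knows nothing about leap seconds, which is exactly what we want here: the navigation timescale is a plain count of seconds, and the offset is applied once at the end.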

Now, “on Earth”, we would be transmitting the entire week number. But in orbit, space in communications comes at a premium. And since leap seconds are never known more than (say) 52 weeks in advance, the leap second message only transmits the last 8 bits of the week number in which the leap second will happen. This means that once every 256 weeks, we hit the same 8 bit week number. Similarly, the day only gets 3 bits in GPS and Galileo (since there are only 7 days, this is fine). Space is tight.

The last leap second happened on the 31st of December 2016. This was GPS week number 1929 and Galileo week number 905. GPS is ahead by 1024 weeks. The last 8 bits of both these week numbers are 137 (decimal).
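The arithmetic above is easy to check. The helper name `truncate_week` is made up for this illustration; the week numbers are the ones quoted in the text.

```python
def truncate_week(week, bits=8):
    """Keep only the lowest `bits` bits of a week number,
    which is all the leap second message has room for."""
    return week & ((1 << bits) - 1)


gps_week = 1929      # week of the 31 December 2016 leap second, GPS numbering
galileo_week = 905   # the same week in Galileo numbering

print(gps_week - galileo_week)      # 1024
print(truncate_week(gps_week))      # 137
print(truncate_week(galileo_week))  # 137
```

Because the 1024 week difference between the two systems is itself a multiple of 256, the truncated GPS and Galileo week numbers always agree.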

GPS and Galileo are currently broadcasting the following:

Mon, 02 Aug 2021 18:38:35 GPS 12@0: 153534 frame 4 wnLSF 137 dn 7 t0t 319488 wn0t 121 dtLS 18 dtLSF 18
Mon, 02 Aug 2021 18:48:20 gal inav wtype 6 for 2,19,1 dtLS 18 wnLSF 137 dn 7 dtLSF 18

These are the same contents, even if formatted differently. The truncated week number is still at 137, which corresponds to the last leap second on the 31st of December 2016 in week numbers 905 (Galileo) and 1929 (GPS).
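To make the "same contents" claim concrete, here is a small sketch that pulls the four leap second fields out of both log lines and compares them. The regular expression is written against the two lines shown above only; it is an assumption about this particular log format, not a general parser for it.

```python
import re

# Field name followed by an integer; dtLSF is listed before dtLS so the
# longer name is tried first.
FIELD = re.compile(r"\b(wnLSF|dn|dtLSF|dtLS)\s+(-?\d+)", re.IGNORECASE)


def leap_fields(line):
    """Return the leap second fields found in one log line as a dict."""
    return {name.lower(): int(value) for name, value in FIELD.findall(line)}


gps_line = ("Mon, 02 Aug 2021 18:38:35 GPS 12@0: 153534 frame 4 "
            "wnLSF 137 dn 7 t0t 319488 wn0t 121 dtLS 18 dtLSF 18")
gal_line = ("Mon, 02 Aug 2021 18:48:20 gal inav wtype 6 for 2,19,1 "
            "dtLS 18 wnLSF 137 dn 7 dtLSF 18")

print(leap_fields(gps_line) == leap_fields(gal_line))  # True
```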

It however also corresponds to Galileo and GPS week numbers 1161 and 2185 respectively, which also have 137 as their 8 least significant bits. This is the week that runs from Sat, 20 Nov 2021 23:59:42 to Sat, 27 Nov 2021 23:59:42 (UTC).
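The ambiguity can be sketched in a few lines. Given the current full week number and the truncated 8 bit value, there is always one matching week at or before "now" and another one 256 weeks later. The function name `candidate_weeks` is hypothetical, and the "current" week numbers 2169 (GPS) and 1145 (Galileo) are my own calculation for 2 August 2021, not something taken from the broadcast.

```python
def candidate_weeks(current_week, wn_lsf, bits=8):
    """Return the two full week numbers nearest to current_week whose
    lowest `bits` bits equal wn_lsf: (latest one <= now, next one after)."""
    period = 1 << bits
    weeks_since_match = (current_week - wn_lsf) % period
    previous = current_week - weeks_since_match
    return previous, previous + period


print(candidate_weeks(2169, 137))  # GPS:     (1929, 2185)
print(candidate_weeks(1145, 137))  # Galileo: (905, 1161)
```

A receiver that only looks at the week number has no way to tell these two candidates apart.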

Now, a receiver could therefore decide to apply a leap second in that week. But if it were astute, it would note that \( \Delta_{LS} = \Delta_{LSF}\). In other words, this is not actually a leap second. There is no change in offset.

Because there is no actual flag in GPS or Galileo that says “there IS no leap second”, this is how it gets encoded that no actual leap second is imminent. Neither the GPS nor the Galileo specification is very explicit about this - there is only a remark that some things are only valid when \( \Delta_{LS} \neq \Delta_{LSF}\).
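In code, the check a receiver needs is tiny. This is a minimal sketch of one way to read the two offsets, with a made-up function name and a sanity check that is my own addition; it is not taken from any of the specifications.

```python
def pending_leap_second(dt_ls, dt_lsf):
    """Return the announced change in the UTC offset, in seconds.

     0 -> nothing is pending, so wnLSF and dn should be ignored
    +1 -> a positive leap second is announced
    -1 -> a negative leap second is announced (never seen so far)
    """
    change = dt_lsf - dt_ls
    if change not in (-1, 0, 1):
        # Assumption: anything else is treated as corrupt data.
        raise ValueError(f"implausible leap second change: {change}")
    return change


print(pending_leap_second(18, 18))  # 0
```

Only when the result is non-zero does it make sense to resolve the truncated week number and day number into an actual moment in time.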

For completeness, BeiDou currently transmits more or less the same:

Mon, 02 Aug 2021 19:52:55 BeiDou 12: 157974, FraID 5 dTLS 4 dTLSF 4 wnLSF 61 dn 6

This matches BeiDou week 573, which is indeed the week of the last leap second from 2016. Note that the BeiDou day number is 6, whereas it is 7 for GPS and Galileo.
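The two day number conventions differ by exactly one, which is easy to get wrong. A small sketch, using the conventions listed earlier (Sunday is 1 for GPS and Galileo, 0 for BeiDou); the function name is hypothetical.

```python
DAY_NAMES = ["Sunday", "Monday", "Tuesday", "Wednesday",
             "Thursday", "Friday", "Saturday"]


def beidou_dn_to_gps_dn(dn):
    """Map a BeiDou day number (0 = Sunday) to the GPS/Galileo
    convention (1 = Sunday)."""
    if not 0 <= dn <= 6:
        raise ValueError(f"BeiDou day number out of range: {dn}")
    return dn + 1


beidou_dn = 6
print(DAY_NAMES[beidou_dn], beidou_dn_to_gps_dn(beidou_dn))  # Saturday 7
```

Both numbers point at Saturday, which is the day 31 December 2016 fell on.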

Bugs

Since leap seconds are pretty rare, they often cause problems when they happen. Large scale Linux deployments have for years been discovering painful bugs related to leap seconds. After a whole decade of strife, in 2016 most of these bugs had been resolved.

We are now encountering the reverse situation in GNSS land. Leap seconds have become so rare that for the first time since 2003 we will have had a 256 week period without leap seconds.

Back in 2003 this happened as well, and Motorola Oncore VP receivers got mighty confused.

Raw Motorola Oncore VP output. Source

In 2015, four out of four BeiDou receivers messed up the day number (dn) difference between BeiDou and GPS/Galileo.

We are now seeing the first bugs trickle in on “the great leap second absence of 2021”. The first we heard was that certain U-Blox receivers accidentally report that a leap second is imminent.

Recently we heard of a complicated bug in the very widely used gpsd package with the ominous title “GPSD time will jump back 1024 weeks at after week=2180 (23-October-2021)”. This bug happened because the author assumed leap seconds would be more frequent.

Between now and November 2021 we’ll be sure to hear about other leap second related bugs. When we do, we should remember that GPS and Galileo have been broadcasting the exact same data for years now. Nothing is changing in the information that is coming down from space.

What is changing is that any code that does not check if \( \Delta_{LS} = \Delta_{LSF}\) is going to trigger some kind of leap second behaviour, even though it shouldn’t.

Negative leap seconds

Up to now, all leap seconds have been positive, and they reflect that the rotation of the Earth has been slowing down. Lately however, things have shown signs of speeding up. This might lead to the need for an unprecedented negative leap second.

Some people, especially non-programmers, assume this will all be fine. Meanwhile, some more battle hardened infrastructure developers have been trying to call attention to the pressing need to start testing negative leap seconds. The assumption is that anything that hasn’t happened before will break spectacularly.

On this entirely non-fishy looking URL https://565851109.xyz/ we can read that based on IERS Bulletin A Vol. No. 30, and making some very large, probably unjustified assumptions, at the end of June, 2029, there will be a negative leap second.

Exciting times!

What about GLONASS?

GLONASS is the Soviet GNSS, which survived the fall of communism & is still with us, and even seeing upgrades. GLONASS has a different way of thinking about time. Specifically, GLONASS time is Moscow wall clock time (modulo changes in DST back when they happened). This means that whenever GLONASS broadcasts a time, this time is already leap second adjusted.

For most of the year this means GLONASS control and GLONASS receivers have a rather easy time. But because this leap second adjusted time is also used for orbital calculations, rumor has it that GLONASS can suffer quite some glitches during leap seconds.



from Hacker News https://ift.tt/2VsDK8g
