Re: [LEAPSECS] Introduction of long term scheduling

From: Zefram <>
Date: Sun, 7 Jan 2007 16:19:28 +0000

M. Warner Losh wrote:
> But ntp_gettime returns a timespec
>for the time, as well as a time_state for the current time status,
>which includes TIME_INS and TIME_DEL for positive and negative leap
>second 'warning' for end of the day so you know there will be a leap
>today, and TIME_WAIT for the actual positive leap second itself
>(there's nothing for a negative leapsecond, obviously).

Actually the interface is more complicated than that. TIME_WAIT indicates
that a leap second recently occurred, and continues to be returned
until the command bit that initiated the leap second has been cleared.
The status during a leap second is meant to be TIME_OOP. TIME_OK is the
normal state. If the clock is not properly synchronised then TIME_ERROR
(a.k.a. "TIME_ERR" or "TIME_BAD") is returned instead of the leap state:
the leap second engine still operates in this situation (at least it
does on Linux), but you don't get to see the state variable.

Mills's paper "A Kernel Model for Precision Timekeeping"
<> says that the
time_t should repeat the last second of the day. That would give you
these behaviours:

        negative leap    no leap        positive leap
        398 TIME_DEL     398 TIME_OK    398 TIME_INS
        400 TIME_WAIT    399 TIME_OK    399 TIME_INS
        401 TIME_WAIT    400 TIME_OK    399 TIME_OOP
        402 TIME_WAIT    401 TIME_OK    400 TIME_WAIT

Actually, though, the paper doesn't require the state change to be
atomic with the change of time_t. It's allowed to be slightly delayed.
(It is in fact delayed a few milliseconds on Linux.) So what is actually
seen is this:

        negative leap      no leap          positive leap
        398.5 TIME_DEL     398.5 TIME_OK    398.5 TIME_INS
        399.0 TIME_DEL     399.0 TIME_OK    399.0 TIME_INS
        400.5 TIME_WAIT    399.5 TIME_OK    399.5 TIME_INS
        401.0 TIME_WAIT    400.0 TIME_OK    400.0 TIME_INS
        401.5 TIME_WAIT    400.5 TIME_OK    399.5 TIME_OOP
        402.0 TIME_WAIT    401.0 TIME_OK    400.0 TIME_OOP
        402.5 TIME_WAIT    401.5 TIME_OK    400.5 TIME_WAIT
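
To make the two sets of readings concrete, here is a toy model in Python
that reproduces them. It is only a sketch: the function name observe(),
the `linear` argument (the reading of a clock with no leap second, i.e.
the middle reading in each row) and the assumption that the time_t step
and the state change take effect together, `delay` seconds late, are mine.
It is not how any kernel is actually written.

```python
# State names as used by ntp_gettime(); kept as strings for readability.
TIME_OK, TIME_INS, TIME_DEL, TIME_OOP, TIME_WAIT = (
    "TIME_OK", "TIME_INS", "TIME_DEL", "TIME_OOP", "TIME_WAIT")


def observe(linear, leap, midnight=400.0, delay=0.0):
    """Return the (time_t, state) pair seen at `linear` seconds.

    linear   -- reading of a leap-free clock (the "no leap" case)
    leap     -- +1 for an inserted second, -1 for a deleted one, 0 for none
    midnight -- time_t value of the UTC midnight in question (toy numbering)
    delay    -- assumed lateness of the step and state change, in seconds
    """
    if leap == 0:
        return (linear, TIME_OK)
    if leap > 0:
        if linear < midnight + delay:
            return (linear, TIME_INS)      # clock not yet stepped back
        if linear < midnight + 1 + delay:
            return (linear - 1, TIME_OOP)  # one second is being repeated
        return (linear - 1, TIME_WAIT)
    # Negative leap: the last second of the day is skipped.
    if linear < midnight - 1 + delay:
        return (linear, TIME_DEL)
    return (linear + 1, TIME_WAIT)


if __name__ == "__main__":
    for half_seconds in range(797, 804):
        linear = half_seconds / 2
        row = [observe(linear, leap, delay=0.005) for leap in (-1, 0, +1)]
        print("   ".join("%.1f %s" % pair for pair in row))
```

With delay=0 the same function gives the nominal behaviour from the paper;
with a delay of a few milliseconds it gives the readings actually seen.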

There is enough information in there to fully decode it, but it means
looking at more states than would be required in the nominal version.
For example, if you see TIME_INS with a time that appears to be just
after midnight, then you're actually inside a positive leap second.
The second that time_t repeats is neither the second before midnight
[399, 400] nor the second after midnight [400, 401], but something
between those, encompassing midnight, approximately [399.005, 400.005].

Interestingly, once you've got all the decode logic required to handle
this, it's also possible to handle a system where time_t repeats the
first second of the next day. This system wouldn't use the TIME_OOP
state at all, using TIME_INS to indicate the leap second. It's just
the extreme end of where the repeated second could be placed.
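
The decode logic described above can be sketched in Python as follows.
This is an illustration under my own assumptions, not code from any NTP
implementation: the function name decode(), the `midnight` parameter (the
time_t value of the UTC midnight nearest the reading, 400 in the toy
numbering used here) and the choice of output are all hypothetical. The
output is the time_t that the nominal model would have shown, plus a flag
saying whether the reading falls inside the inserted second.

```python
def decode(t, state, midnight):
    """Map an observed (time_t, state) pair to (nominal_time_t, in_leap).

    nominal_time_t -- what a clock that repeats the last second of the day
                      exactly at midnight would read
    in_leap        -- True if the instant lies inside the inserted second
    """
    if state == "TIME_ERROR":
        # The leap state is hidden while the clock is unsynchronised.
        raise ValueError("cannot decode: leap state not visible")
    if state == "TIME_INS" and t >= midnight:
        # Apparently past midnight but not yet stepped back:
        # this is really the inserted second.
        return (t - 1, True)
    if state == "TIME_OOP":
        # Before midnight we are in the repeated second; at or after
        # midnight the state change to TIME_WAIT is merely late.
        return (t, t < midnight)
    if state == "TIME_DEL" and t >= midnight - 1:
        # The skipped second is showing because the step is late.
        return (t + 1, False)
    # TIME_OK, TIME_WAIT, and the unambiguous TIME_INS/TIME_DEL readings.
    return (t, False)


if __name__ == "__main__":
    for reading in [(399.5, "TIME_INS"), (400.0, "TIME_INS"),
                    (399.5, "TIME_OOP"), (400.0, "TIME_OOP"),
                    (400.5, "TIME_WAIT")]:
        print(reading, "->", decode(*reading, midnight=400))
```

The TIME_INS branch is the one that also covers a system repeating the
first second of the next day: there the whole inserted second is reported
as TIME_INS with an apparent time after midnight, and TIME_OOP never
appears.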

I think this is a horrible mess and all the leap second adjustments should
happen in user space. (Filesystems that use UTC-based timestamps would
require the kernel to be able to translate too, but this shouldn't affect
the clock or the APIs.) NTP has problems because the leap second handling
was grafted on and it has no memory of leap seconds. Synchronisation
and kernel APIs should use a plain linear count of TAI seconds, because
they're not concerned with time-of-day per se. Anything that does care
about time-of-day can do the conversion itself, just like anything that
cares about day-of-week or day-of-year. UTC is a calendar.
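
As a rough illustration of doing the conversion in user space, here is a
Python sketch that turns a linear TAI count into a UTC label using a
table of leap seconds. The names LEAP_TABLE and tai_to_utc() are
hypothetical, and so is the convention for the count: I assume it equals
the POSIX count for the same instant plus the TAI-UTC offset then in
force. The table holds only the two offsets relevant around the most
recent leap second (32 s from 1999-01-01, 33 s from 2006-01-01); a real
one would be longer and would need updating.

```python
import bisect
import calendar
import time

# (UTC date on which the offset takes effect, TAI-UTC in seconds)
LEAP_TABLE = [
    ((1999, 1, 1), 32),
    ((2006, 1, 1), 33),
]

# Linear count at which each offset begins (assumed convention: see above).
_STARTS = [calendar.timegm((y, m, d, 0, 0, 0)) + off
           for (y, m, d), off in LEAP_TABLE]


def tai_to_utc(tai):
    """Return a 'YYYY-MM-DD HH:MM:SS' UTC label for an integer TAI count."""
    i = bisect.bisect_right(_STARTS, tai) - 1
    if i < 0:
        raise ValueError("count is earlier than the first table entry")
    offset = LEAP_TABLE[i][1]
    # The last count before a larger offset begins is the inserted second.
    if (i + 1 < len(_STARTS) and tai == _STARTS[i + 1] - 1
            and LEAP_TABLE[i + 1][1] > offset):
        t = time.gmtime(tai - offset - 1)
        return "%04d-%02d-%02d %02d:%02d:60" % (
            t.tm_year, t.tm_mon, t.tm_mday, t.tm_hour, t.tm_min)
    return time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(tai - offset))


if __name__ == "__main__":
    base = calendar.timegm((2006, 1, 1, 0, 0, 0)) + 33
    for tai in (base - 2, base - 1, base):
        print(tai, tai_to_utc(tai))
```

The count itself never repeats or skips; only the label computed from it
knows about 23:59:60, in the same way that only a calendar routine knows
about 29 February.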

Received on Sun Jan 07 2007 - 08:20:26 PST
