No-Frills GNU/Linux:
Timekeeping,
Timestamping,
Timelogging
Setting the Clock with NTP
Today's technology demands precise timekeeping. The Global Positioning System (GPS), for instance, has to allow for the fact that the rate at which time passes depends on such subtle physical factors as the strength of the local gravitational field. The complex physics has a simple upshot: when exactly one day, of 86400 (that's 24 times 60 times 60) seconds, passes on the surface of Earth, a little more than 86400 seconds, in fact an extra 38 or so millionths of a second, passes on a GPS satellite. Small though the discrepancy between the two timeflows is, GPS needs to include it in its computations if it is to give us accurate navigation coordinates. Happily, today's chronometry rises to such challenges. It is expected that the Bureau international des poids et mesures (BIPM) "TAI" timescale, based on the joint output from atomic-clock timing centres in around 30 countries, would take not a day, but rather (to one significant figure) 400 years before it erred by as much as 38 millionths of a second.
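As a rough check on those two figures, each can be restated as a fractional rate offset (seconds of error per second elapsed). The following is only a sketch: the function name fractional_rate is invented for illustration, and the 365.25-day year used for the 400-year figure is a simplifying assumption of mine.

```bash
#!/bin/bash
# Fractional rate offset implied by an error of ERR seconds
# accumulated over DAYS days (86400-second days assumed).
fractional_rate() {
    awk -v err="$1" -v days="$2" \
        'BEGIN { printf "%.1e\n", err / (days * 86400) }'
}

# GPS satellite clock against ground clock: 38 microseconds in one day.
fractional_rate 38e-6 1        # prints 4.4e-10

# TAI: 38 microseconds in about 400 years (146100 days, assumed).
fractional_rate 38e-6 146100   # prints 3.0e-15
```

On these assumptions, the relativistic offset is about five orders of magnitude larger than the expected error of TAI.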
Although the cheap quartz oscillators that track time in personal computers can drift by several seconds in a single day, GNU/Linux has ways of keeping our own chronometry reasonably exact. As I'll now explain, simple procedures are enough to keep us - at any rate in the hour or so immediately following a clock correction - within a few hundreds of milliseconds of BIPM.
Basic GNU/Linux chronometry starts with our synchronizing our software clock, and optionally also our hardware clock, with some "Stratum 2" time server under Network Time Protocol (NTP).
A "Stratum 1" server is a device linked directly, rather than by TCP/IP network, to a reference clock - in other words, to a clock itself directly regulated by GPS, by a WWV shortwave transmitter, or by some other closely BIPM-compliant system. A "Stratum 2" server is a device synchronized under NTP with a Stratum 1 server.
We, as ordinary members of the public, are not encouraged to point our workstations to Stratum 1 servers. We are, on the other hand, welcome to use those Stratum 2 servers whose administrators have set up a policy of open access. (How severely does accuracy degrade once NTP enters the chain? Whereas a Stratum 1 server can be expected to stay within 1 millisecond of the BIPM standard, a Stratum 2 server can be expected to deviate by on the order of 10 to 100 milliseconds.)
Computer scientist David L. Mills of the University of Delaware maintains a list of Stratum 2 NTP servers, including many with open access, at http://www.eecis.udel.edu/~mills/ntp/clock2a.html. He supplies e-mail addresses of administrators, remarking that contact should be made upon establishment of "regular operations" with any open-access server. In practice, "regular" here means that we should e-mail the person maintaining a Stratum 2 open-access NTP server if we are proposing to connect on a recurring schedule rather than occasionally.
My own use of NTP shows what can be done with the crudest of setups,
namely PPP dialup on the current, or "Woody", stable branch of
Debian, specifically Debian 3.0r1 for Intel x86.
I used a similarly crude setup in 2001 and 2002 under
Mandrake 7.2. In 2000 or the late 1990s,
I invoked the now obsolescent rdate
under the RFC 868 protocol, a predecessor of today's NTP.
Under my present, Debian, setup, I consider my working day incomplete unless I somehow find a minute to launch my basic maintenance script, written in bash, and scrutinize its effects. As is standard good practice, I (of course!) use my script to check the version numbers of all my installed packages, downloading necessary updates from a central security server. But more to the point, in that same script I interrogate a certain NTP server, physically housed on my local university campus. For enhanced readability, I'll break the pertinent script line into several lines here, and will also respect the confidentiality of my chosen server by altering its name to the meaningless "roohar.goozarcollege.ca":
date;
ntpdate roohar.goozarcollege.ca;
date;
hwclock --systohc --utc --debug;
hwclock --show
When the long script line executes, it first synchronizes my system clock (the "software clock", which runs only when the workstation itself does) with roohar.goozarcollege.ca, then synchronizes my real time clock (the "hardware clock", which runs as long as the CMOS battery functions) with the software clock.
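The same sequence can be wrapped in a shell function, which makes it easy to preview the commands before letting them touch either clock. This is a sketch rather than my production script: the function name sync_clocks and the DRY_RUN variable are invented for illustration, and a real run needs root privileges and a live network connection.

```bash
#!/bin/bash
# Print (DRY_RUN=1) or run the clock-setting sequence for one NTP server.
sync_clocks() {
    local server=$1
    local run=
    [ "${DRY_RUN:-0}" = 1 ] && run=echo
    $run date
    $run ntpdate "$server" || return 1   # step the software clock
    $run date
    $run hwclock --systohc --utc         # copy software clock to hardware clock
    $run hwclock --show
}

# Show what would be executed, without touching either clock.
DRY_RUN=1 sync_clocks roohar.goozarcollege.ca
```

Unlike the one-line version, this sketch stops if ntpdate fails, so that an uncorrected software clock is never copied into the hardware clock.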
It is the software clock
that gets consulted by processes curious about the time of day, such
as the shell-prompt command date.
The hardware clock, on the other hand,
gets consulted only at boot time, when the software
clock is started up.
By keeping software and hardware
clocks separate, GNU/Linux allows us,
should we so choose, to correct our software clock, and therefore to
change the time presented to processes such as date, without
perturbing the underlying hardware.
A cautious system administrator might not like
my habit of perturbing the hardware clock once
a day through hwclock --systohc.
That mildly dirty habit can allegedly cause
trouble if clock corrections get large. You, for your part,
might find it prudent to
run hwclock --systohc
more often than I do,
or alternatively not to run it at all.
Further, you might find it prudent to use the command
ntpdate -B roohar.goozarcollege.ca
in place of my plain
ntpdate roohar.goozarcollege.ca.
The -B switch
makes the system clock slew smoothly
and slowly to the
correct time instead of jumping. I myself
do like the jump,
since it lets me get a visceral appreciation of clock drift
as I eyeball the second hand of an xclock
on my desktop while my script executes. A
five-second jump is not unusual in my setup. So far so good:
in several years of operation, I've yet to get any
complaints from processes confused by system-clock
discontinuities.
My daily clock-maintenance routine -
appropriate, as I say, for the free-standing PPP dialup workstation -
marks pretty well the lowest level
in sophistication.
A LAN might, more ambitiously, incorporate a
time server foo.yourdomain.com, synchronized
every six or eight hours with
roohar.goozarcollege.ca
by way of a cron job invoking ntpdate.
The other machines in the LAN could keep their
clocks reasonably precise by running
ntpdate
nightly, under cron, to
interrogate foo.
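The two cron jobs might look something like the lines generated below. This is a sketch under several assumptions of my own: the helper name ntp_cron_line is invented, the lines use the system-crontab layout (the one with a user field, as in /etc/cron.d), ntpdate is taken to live in /usr/sbin, and the 03:00 nightly hour is arbitrary. The -s switch sends ntpdate's output to syslog, which suits unattended runs.

```bash
#!/bin/bash
# Emit one system-crontab line that runs ntpdate against SERVER
# at minute 0 of the hours given by HOURS (a cron hour field).
ntp_cron_line() {
    local hours=$1 server=$2
    printf '0 %s * * * root /usr/sbin/ntpdate -s %s\n' "$hours" "$server"
}

# On the LAN time server foo: every six hours, against the outside server.
ntp_cron_line '*/6' roohar.goozarcollege.ca

# On each of the other LAN machines: nightly at 03:00, against foo.
ntp_cron_line 3 foo.yourdomain.com
```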
What if all the machines on your LAN need to have their software clocks
achieving centisecond or millisecond tolerances at all instants
of the day? In that case, according to the gurus,
you may reach for ntpd in place of ntpdate.
Further, if you are running a free-standing
workstation with PPP dialup,
you could consider using
Richard Curnow's chrony in place of ntpdate. Among the various
appealing features of chrony - available in Debian Woody
as one of the "Xtra" packages,
but not (yet?) running as production-grade machinery
on my own workstation - is a provision for
automatically interrogating an NTP
server whenever you connect to your ISP.
Timestamping
Once we have our software clock maintained in some appropriate way by
NTP, we are ready to set up a rigorous system of timestamping. People
around the world write timestamps in
different ways, with "02/03/04",
for instance, meaning "February 3, 2004" in the United States, but "2
March 2004" in the United Kingdom. The Geneva-based International
Organization for Standardization, or ISO, brought order out of this
chaos with its standard ISO 8601:1988, recently revised as ISO
8601:2000. (That "ISO", by the way, is not strictly an acronym, but a
mnemonic for the classical Greek "isos", "equal".) Although the
standards are not readily available for free, many Web authors present
their essence. The work of this able community of authors is in turn
summed up by a well-maintained hyperlink bibliography on a Directory
Mozilla (DMOZ) page. To find that page,
you can point your browser either at
DMOZ itself,
http://dmoz.org, or
at the DMOZ-derived search area under
the "Directory" tab on
http://www.google.com. You
can then input some such search string as ISO 8601.
If things have not changed at DMOZ since 20030613T003120Z,
when I last did the experiment, you'll find the ISO 8601 page
in the Google presentation of DMOZ on the category path
Science > Reference > Standards > Individual Standards > ISO 8601.
And there we have it - the very sentence you just read uses one of the ISO 8601-kosher timestamp formats! Since, as I say, the DMOZ-linked authors present the essence of the system, it is enough for me to present here just the essence of the essence.
Loosely speaking, the "Z" in this particular format is a reminder that
we are taking the standard time of the zero, or Greenwich, meridian,
as in the archaic astronomy-defined "Greenwich Mean Time". In strict
accuracy, the "Z" means that we are using atomic-clock-defined
Coordinated Universal Time (UTC),
a system based on the BIPM TAI. It is UTC, with its
leap seconds inserted or deleted against the unvarying
beat of TAI,
so as to allow for slight year-to-year irregularities in Earth's
rotation, that underlies common civil timekeeping. The Eastern
Standard Time of winter civil life in Toronto, Boston, or New York,
for example, is defined as the system that lags UTC by exactly five
hours - as in the familiar e-mail header
line Date: Mon, 17 Mar 2003 15:48:32 -0500, for
a mail transmitted at 15:48:32 EST, or 20:48:32
UTC. It is likewise UTC that my little ntpdate-invoking script uses,
thanks to the --utc switch in the clause that initiates an update of
the hardware clock.
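The conversion from an e-mail header to UTC need not be done in one's head. The sketch below leans on the -d option of GNU date, which is not present in every implementation of date; the function name header_to_utc is invented for illustration.

```bash
#!/bin/bash
# Convert the value of an e-mail Date: header to a compact
# ISO 8601 timestamp in UTC. Requires GNU date (for -d).
header_to_utc() {
    date -u -d "$1" "+%Y%m%dT%H%M%SZ"
}

header_to_utc "Mon, 17 Mar 2003 15:48:32 -0500"   # prints 20030317T204832Z
```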
This is the point at which to remark on a minor scandal in contemporary chronometry. As UTC is currently defined, an incurable lack of specificity infects the timestamping of future events. We literally do not know what second we are referring to when we ask what will be happening at, say, 20531225T121314Z! That's because we cannot predict upcoming irregularities in Earth's rotation, and so cannot predict when leap seconds will get interpolated or deleted, against the invariant TAI, by the authorities in upcoming decades. Metrologists have grappled with such issues for some years. Their next big discussion of UTC, under the aegis of the International Astronomical Union Commission 31 (Time) at the IAU General Assembly in Australia, is set for July of 2003. Maybe some core UTC concepts will change in the months or years following that discussion, maybe not.
The rest of the ISO 8601
format is almost self-evident. We write the year
first, using all four digits, followed by a two-digit month and a
two-digit day, in order to ensure that the ASCII sort order of
timestamps matches the chronological order of the instants the
timestamps denote.
The T, separating the two day digits from the two
hours digits, is the one part of the ISO prescription that could be
called purely cosmetic.
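The sort-order property is easy to check at the prompt. In the sketch below, three of the timestamps are taken from backup filenames on my own workstation, and the 2002 value is invented to show that the ordering holds across a year boundary; the function name sort_timestamps is likewise invented.

```bash
#!/bin/bash
# Sort compact ISO 8601 timestamps, one per line, from standard input.
# LC_ALL=C forces a plain byte-by-byte comparison, so the sorted
# order is also the chronological order.
sort_timestamps() {
    LC_ALL=C sort
}

printf '%s\n' 20030220T150959Z 20030207T155409Z \
              20021231T235959Z 20030207T153646Z | sort_timestamps
```

The output runs from 20021231T235959Z at the top to 20030220T150959Z at the bottom, with no date arithmetic needed.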
Among the further notations made available by ISO 8601 is a not-quite-so-compressed year-month-day-hour-minute-second format with hyphens and colons:
2003-06-13T00:31:20Z
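Converting from the compressed format to the hyphen-and-colon format is a matter of slicing the string at fixed positions. The following is a minimal bash sketch, with to_extended an invented name; it assumes its argument really is a well-formed CCYYMMDDThhmmssZ timestamp and does no validation.

```bash
#!/bin/bash
# Rewrite a compact timestamp (CCYYMMDDThhmmssZ) in the extended
# format (CCYY-MM-DDThh:mm:ssZ), using bash substring expansion.
to_extended() {
    local ts=$1
    echo "${ts:0:4}-${ts:4:2}-${ts:6:2}T${ts:9:2}:${ts:11:2}:${ts:13:2}Z"
}

to_extended 20030613T003120Z   # prints 2003-06-13T00:31:20Z
```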
Moreover, ISO 8601 supplies formats useful in special branches of commerce, where we have to refer explicitly to the seventeenth week of a year or to the two hundred fifth day of a year.
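GNU date can produce the week-based and day-of-year forms as well. In the sketch below the function names are invented, and the -d option is once again specific to GNU date. The %G, %V and %u conversions give the ISO week-numbering year, the week number, and the weekday (1 for Monday through 7 for Sunday); %j gives the day of the year.

```bash
#!/bin/bash
# ISO 8601 week date (year, week, weekday) for a calendar date.
iso_week_date() {
    date -u -d "$1" "+%G-W%V-%u"
}

# ISO 8601 ordinal date (year, day of year) for a calendar date.
iso_ordinal_date() {
    date -u -d "$1" "+%Y-%j"
}

iso_week_date 2003-06-13      # prints 2003-W24-5
iso_ordinal_date 2003-06-13   # prints 2003-164
```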
When I want to produce a short-form timestamp at the command-line
prompt, I invoke /bin/date with a
tiny shell script, which I call utc:
#! /bin/bash
TIMESTAMP=`date -u "+%Y%m%dT%H%M%SZ"`
echo "${TIMESTAMP}"
When I wish to produce a more verbose display, I invoke /bin/date with
a different script, which I call utcv:
#! /bin/bash
TIMESTAMP=`date -u "+%Y%m%dT%H%M%SZ"`
echo "Universal Coordinated Time (= UTC = EST+5 = EDT+4): ${TIMESTAMP}"
In practice, I seldom use the utc script. But I do
find myself using utcv many times a day, while composing e-mails
within vi under mutt. Here the convenient syntax (with vi in command,
as opposed to insert, mode) is :r !utcv. That puts a verbose timestamp,
such as
Universal Coordinated Time (= UTC = EST+5 = EDT+4): 20030318T043915Z
into my file, in a line of its own right below whatever line my cursor happens to be on.
Although I don't often cause my shell to display a plain-vanilla UTC
timestamp at the command line, I do make constant use of a backing-up
shell script that copies foo.bar to foo.bar____BAKCCYYMMDDThhmmssZ,
for the appropriate CC, YY, MM, DD, hh, mm, and ss. (That
CCYYMMDDThhmmssZ, incidentally, is the ISO-approved schema for
talking about timestamps in general, as when we find ourselves writing
the documentation for some timestamping software.) I call my script
b. Cosmetics aside, the script consists of the single line
cp "$1" "${1}____BAK$(date -u "+%Y%m%dT%H%M%SZ")"
A typical invocation is b .bashrc.
The script leaves a clean audit trail, letting anyone see on what
dates I modified a file. For example, here is a part of the result of
listing my numerous backup copies of .bashrc:
.bashrc____BAK20030207T153646Z
.bashrc____BAK20030207T155409Z
.bashrc____BAK20030220T150959Z
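For anyone who would rather keep such a tool as a shell function than as a separate script, here is a slightly more defensive sketch. It is not the script I actually run: the existence check and the -p switch (which asks cp to preserve the original's permissions and modification time) are additions made for illustration.

```bash
#!/bin/bash
# Copy FILE to FILE____BAK<UTC timestamp>, refusing to proceed
# if FILE is missing or is not a regular file.
b() {
    local src=$1 stamp
    [ -f "$src" ] || { echo "b: no such file: $src" >&2; return 1; }
    stamp=$(date -u "+%Y%m%dT%H%M%SZ")
    cp -p -- "$src" "${src}____BAK${stamp}"
}
```

Placed in .bashrc, the function is invoked exactly as before, for instance as b .bashrc.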
Timelogging
It is natural to combine timestamping with timelogging. I started timelogging with pen and paper as a young university student in the 1970s. Since 1997, however, I've settled on a lean, mean GNU/Linux formalism. My timelogs reside in three files.
(a) My log of days is a day-by-day review of an entire week, with various categories of activity for each day. My own particular categories (but no two people will think alike when they devise categories) are the following:
- CHURC: Catholic spiritual activities outside the formal confession and Mass obligations
- MATPH: studies in mathematics or physics
- AS-LO: low-grade work in astronomy (for instance, acquiring raw data at the telescope)
- AS-HI: high-grade work in astronomy (for instance, reading some theoretical astrophysics)
- ASTVO: a certain bundle of volunteer-work projects for a certain group of astrophysicists who need my help
- ESTBK: a certain volunteer-work project for a certain community organization
- MAINT: maintenance (for instance, filing loose papers into Manila folders)
- $LING: vocational, as distinct from scientific, training, for the most part at the interface between publishing and computer engineering
- $$$$$: activities related to my part-time business
- TOTL+: the grand total, as the sum of all the foregoing categories, plus little scraps of working time too trivial to fit into any of those categories
My file is organized as a piece of very plain ASCII. Here's how the file looks for one particular, embarrassingly unproductive, week in the late northern-hemisphere winter of 2003:
2003A_WK09 (2003-03-02 TO 2003-03-08)
    CHURC | MATPH AS-L0 AS-HI | ASTVO ESTBK | MAINT $LING $$$$$ || TOTL+
=======================================================================
SUN 00h00 | | | || 03h00
MON 00h17 | 01h08 | 00h40 | 00h13 || 03h36
TUE 00h16 | 02h25 02h36 | | || 05h20
WED 01h18 | 01h01 | | 00h20 || 06h19
THU 00h21 | 02h23 00h39 | | 00h20 || 07h30
FRI 00h22 | 00h10 | | 02h09 || 06h55
SAT 00h24 | 00h35 | | 02h02 || 07h48
=======================================================================
TOT 02h58 | 06h57 04h00 00h00 | 00h40 00h00 | 04h31 00h00 00h33 || 40h28
I find it useful to fill in each day's row early in the morning of the
following day, and at the end of the week to run a small Perl script
which adds up the seven day rows to produce a weekly-totals row. The
script is a little too long to reproduce here. You can get it, if
you're interested, from the "Technical" section of my site,
http://www.metascientia.com.
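The arithmetic at the heart of such a totalling script is simple enough to sketch in a few lines of bash. What follows is not my Perl script, only an illustration of the same idea, with sum_durations an invented name. The 10# prefix forces base-10 arithmetic, so that values such as 08 and 09 are not misread as malformed octal numbers.

```bash
#!/bin/bash
# Add up durations written as HHhMM (for instance 01h18)
# and print the total in the same notation.
sum_durations() {
    local total=0 d h m
    for d in "$@"; do
        h=${d%%h*}    # the part before the "h"
        m=${d##*h}    # the part after the "h"
        total=$(( total + 10#$h * 60 + 10#$m ))
    done
    printf '%02dh%02d\n' $(( total / 60 )) $(( total % 60 ))
}

# The seven CHURC entries, SUN through SAT, of week 2003A_WK09:
sum_durations 00h00 00h17 00h16 01h18 00h21 00h22 00h24   # prints 02h58
```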
(b) Whereas I start a new log of days every week, I keep a single big log of weeks. That file can be thought of as a sequential autobiography. Here's an extract:
2003A_WK07 (2003-02-16 TO 2003-02-22)
CHURC | MATPH AS-L0 AS-HI | ASTVO ESTBK | MAINT $LING $$$$$ || TOTL+
03h41 | 00h00 02h20 00h00 | 00h00 00h00 | 00h04 04h18 19h04 || 41h46
======================================================================
2003A_WK08 (2003-02-23 TO 2003-03-01)
CHURC | MATPH AS-L0 AS-HI | ASTVO ESTBK | MAINT $LING $$$$$ || TOTL+
02h03 | 07h54 09h06 00h31 | 00h00 00h00 | 01h23 01h08 15h32 || 40h03
======================================================================
2003A_WK09 (2003-03-02 TO 2003-03-08)
CHURC | MATPH AS-L0 AS-HI | ASTVO ESTBK | MAINT $LING $$$$$ || TOTL+
02h58 | 06h57 04h00 00h00 | 00h40 00h00 | 04h31 00h00 00h33 || 40h28
(c) To help me monitor my efforts on many projects over many years, I
keep another large single file - a log of projects. In my formalism,
every project is associated with a unique UTC timestamp,
in most cases denoting the
instant that work on the project started. If, for instance, the
project is commercial, then the magic instant is
liable to be the instant
at which my client started exploring, whether by phone or by e-mail,
the possibility of my undertaking that particular project. For each
project, I track the year, month, and day of fresh activity,
indicating for each day the amount of time invested, the cumulative
time investment, and the nature of the work done on that day. Here are
excerpts, with some xxxxxxxx overwriting to maintain confidentiality,
for two simple projects (a piece of ultimately unsuccessful
journalism on Iraq, and a tiny reading project on the
rudiments of radio astronomy):
WRITING__20030226T230203Z____iraq_meeting
20030226: 03h30 -> 0003h30 # attended mtg, did rough notes at xxxx
20030227: 04h37 -> 0008h07 # wrote, polished; submitted to xxxxxxxx

STUDY____20030228T223000Z____radio_astronomy
20030228: 00h31 -> 0000h31 # started Rohlfs-Wilson
20030301: 04h05 -> 0004h36 # read {f.graham.smith} Pelican
20030306: 00h39 -> 0005h15 # ditto
20030310: 00h02 -> 0005h17 # made bare beginning on French book
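I keep the cumulative column up to date by hand, but the running total could equally be computed. The sketch below, with next_entry an invented name, takes the previous cumulative figure and the current day's time and prints the middle portion of a new log line.

```bash
#!/bin/bash
# Given the previous cumulative total (HHHHhMM) and today's time
# (HHhMM), print "today -> new cumulative total".
next_entry() {
    local prev=$1 today=$2 total
    # 10# forces base 10, so that 08 and 09 are not rejected as octal.
    total=$(( 10#${prev%%h*} * 60 + 10#${prev##*h}
            + 10#${today%%h*} * 60 + 10#${today##*h} ))
    printf '%s -> %04dh%02d\n' "$today" $(( total / 60 )) $(( total % 60 ))
}

next_entry 0003h30 04h37   # prints 04h37 -> 0008h07
```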
As I say, the formalism is lean and mean. No software tools are needed, apart from the Perl script which generates weekly totals. I'd be willing to bet that my formalism makes timelogging as efficient as elaborate software does, since what I lose in sophistication I gain in ease of maintenance.
It goes almost (but not quite) without saying that I have fancy
aliases in my ever-so-frequently revised .bashrc script, to pull up
any one of these logs on an xterm in a split second. So, for instance,
to revise the log of projects, which tracks invested time on a
project-by-project basis, I type just the three letters inv. That's
a three-letter alias for an invocation of vi on a file rather deeply
buried in my byzantine workstation. My .bashrc implements the alias
with a single line, which I break up into several lines here for
readability:
alias inv='vi
/home/verbum/ANNN____maintenance/
RNNN____journals_etc/QNNN____diaries/
ZNNN____multiyear_analyses_etc/invested_time.txt'
Byzantine? "Anal-retentive" might be an apter characterization of my system for the rational nesting of directories. (Everything, said Einstein, is to be made as simple as possible, and no simpler.) But rational directory nesting involves management of something like space, rather than of time, and is therefore a suitable topic for another essay.