| Author |
Message |
Terje Mathisen
Guest
|
Posted:
Wed Jul 13, 2005 12:27 am Post subject:
Re: stalling the TSC? |
|
|
Nick Maclaren wrote:
| Quote: | In article <daudul$dlh$1@osl016lin.hda.hydro.com>,
Terje Mathisen <terje.mathisen@hda.hydro.com> writes:
|> Nick Maclaren wrote:
|> > In article <p73d5ptpcip.fsf@verdi.suse.de>,
|> > Andi Kleen <freitag@alancoxonachip.com> wrote:
|> > Thanks. I will look at it and see if anyone is learning anything
|> > from experience, even if not history :-)
|
|> I believe I wrote a post about this, it still has several stupid warts,
|> some of which could have made it much more useful if avoided. :-(
A brief look at it indicates that it has more warts than Oliver
Cromwell, and addresses only a small part of the problem. It may
be better than what was there before, but it is still ghastly.
Consider the following 'minor' issues:
Real-time accuracy, including resynchronisation after coming
out of S1 and S2 (sleep?) states. Like, none.
|
My personal favourite (NOT!) was the fact that you cannot latch the
counter to read a consistent 64-bit value on a 32-bit CPU, and there is
no guarantee that it will be usable in 64-bit mode even on a 64-bit cpu. :-(
Terje
--
- <Terje.Mathisen@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching" |
|
|
 |
Rob Warnock
Guest
|
Posted:
Wed Jul 13, 2005 8:15 am Post subject:
Re: stalling the TSC? |
|
|
Nick Maclaren <nmm1@cus.cam.ac.uk> wrote:
+---------------
| I like femptoseconds - clearly a short interval with no content.
+---------------
Myself, I like units of Planck time[1], approximately 5.391e-44 seconds,
or roughly 1.855e+28 Planck ticks per femptosecond. ;-} ;-}
Since "the estimated age of the Universe (4.3e17 s) is 8.0e60 Planck
times"[1] (just under 203 bits), an absolute universal clock that kept
Planck time could easily fit in a 256-bit counter, thus putting paid to
all of those pesky Y[0-9]* bugs (and especially Y2038!) once and for all!!
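For anyone who wants to check the arithmetic, here is a rough sketch in C. It takes the two quoted figures as given inputs; `planck_ticks` and `bits_needed` are names made up for this illustration.

```c
/* Figures quoted in the post, taken here as given inputs. */
#define PLANCK_TIME_S  5.391e-44   /* seconds per Planck tick */
#define UNIVERSE_AGE_S 4.3e17      /* estimated age of the Universe, seconds */

/* Number of Planck ticks in an interval of `seconds`. */
static double planck_ticks(double seconds)
{
    return seconds / PLANCK_TIME_S;
}

/* Smallest counter width, in bits, whose range 2^bits exceeds `ticks`. */
static int bits_needed(double ticks)
{
    int bits = 0;
    double range = 1.0;          /* holds 2^bits, exact in a double */
    while (range <= ticks) {
        range *= 2.0;
        bits++;
    }
    return bits;
}
```

Calling `bits_needed(planck_ticks(UNIVERSE_AGE_S))` gives the counter width needed for the age of the Universe, and `planck_ticks(1e-15)` gives the ticks in one femtosecond.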
-Rob
[1] See: <http://en.wikipedia.org/wiki/Planck_time>
"The Planck time is the natural unit of time, denoted by t_sub_P.
It is considered the smallest possible measurement of time. ...
The Planck time is the time it would take a photon travelling at
the speed of light to cross a distance equal to the Planck length."
Also: <http://en.wikipedia.org/wiki/Physical_constants>
-----
Rob Warnock <rpw3@rpw3.org>
627 26th Avenue <URL:http://rpw3.org/>
San Mateo, CA 94403 (650)572-2607 |
|
|
 |
Nick Maclaren
Guest
|
Posted:
Wed Jul 13, 2005 1:43 pm Post subject:
Re: stalling the TSC? |
|
|
In article <3v2dnaVBdJ4YAUnfRVn-hQ@speakeasy.net>,
Rob Warnock <rpw3@rpw3.org> wrote:
| Quote: |
Since "the estimated age of the Universe (4.3e17 s) is 8.0e60 Planck
times"[1] (just under 203 bits), an absolute universal clock that kept
Planck time could easily fit in a 256-bit counter, thus putting paid to
all of those pesky Y[0-9]* bugs (and especially Y2038!) once and for all!!
|
But not the year 10^28 bug! We really should have learnt that it
is necessary to think ahead ....
Regards,
Nick Maclaren. |
|
|
 |
Rob Warnock
Guest
|
Posted:
Wed Jul 13, 2005 3:16 pm Post subject:
Re: stalling the TSC? |
|
|
Terje Mathisen <terje.mathisen@hda.hydro.com> wrote:
+---------------
| My personal favourite (NOT!) was the fact that you cannot latch the
| counter to read a consistent 64-bit value on a 32-bit CPU...
+---------------
Well, there's a standard workaround for that one that goes back to
the ancient days of clock chips on 8-bit busses, which is to poll
the MSW, the LSW, and the MSW again (in that order) until the two
MSW values are the same, e.g., in sloppy C:
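    #include <stdint.h>

    /* Memory-mapped counter; volatile forces a real load on every access. */
    volatile struct hdwclk {
        uint32_t upper;
        uint32_t lower;
    } *hdwclk_p = (volatile struct hdwclk *) 0x{SOME_HARDWARE_ADDRESS};
    struct hdwclk result;
    uint32_t tmp;

    result.upper = hdwclk_p->upper;     /* prime the pump */
    do {
        tmp = result.upper;             /* MSW from the previous pass */
        result.lower = hdwclk_p->lower;
        result.upper = hdwclk_p->upper;
    } while (result.upper != tmp);      /* retry if the MSW changed */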
This ensures that hdwclk_p->upper didn't get incremented while you
were reading hdwclk_p->lower.
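A minimal sketch of the same loop run against a simulated counter instead of real hardware, so the rollover case can be exercised. Everything named `sim_*` is a hypothetical stand-in: the fake counter simply advances by `step` ticks on every read of either half.

```c
#include <stdint.h>

/* Hypothetical stand-in for the hardware: a 64-bit counter that advances
   by `step` ticks each time either half is read. */
struct sim_clk { uint64_t now; uint64_t step; };

static uint32_t sim_read_upper(struct sim_clk *c)
{
    c->now += c->step;
    return (uint32_t)(c->now >> 32);
}

static uint32_t sim_read_lower(struct sim_clk *c)
{
    c->now += c->step;
    return (uint32_t)c->now;
}

/* MSW, LSW, MSW; retry until both MSW reads agree. */
static uint64_t read_consistent(struct sim_clk *c)
{
    uint32_t upper = sim_read_upper(c);   /* prime the pump */
    uint32_t lower, tmp;
    do {
        tmp = upper;
        lower = sim_read_lower(c);
        upper = sim_read_upper(c);
    } while (upper != tmp);
    return ((uint64_t)upper << 32) | lower;
}

/* For comparison: one read of each half with no check. */
static uint64_t read_naive(struct sim_clk *c)
{
    uint32_t upper = sim_read_upper(c);
    uint32_t lower = sim_read_lower(c);
    return ((uint64_t)upper << 32) | lower;
}
```

Starting the fake counter two ticks below a rollover of the low word, `read_naive` pairs the old MSW with the new LSW and so returns a value earlier than the starting count, while `read_consistent` takes one extra pass and returns a value the counter really held.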
+---------------
| and there is no guarantee that it will be usable in 64-bit mode
| even on a 64-bit cpu. :-(
+---------------
Now *that's* a serious problem!!
-Rob
-----
Rob Warnock <rpw3@rpw3.org>
627 26th Avenue <URL:http://rpw3.org/>
San Mateo, CA 94403 (650)572-2607 |
|
|
 |
Andi Kleen
Guest
|
Posted:
Wed Jul 13, 2005 4:15 pm Post subject:
Re: stalling the TSC? |
|
|
Terje Mathisen <terje.mathisen@hda.hydro.com> writes:
| Quote: |
My personal favourite (NOT!) was the fact that you cannot latch the
counter to read a consistent 64-bit value on a 32-bit CPU,
|
There are no CPUs with HPET-capable chipsets that don't support some
form of 64-bit access (be it MMX or SSE or long mode).
| Quote: | and there is
no guarantee that it will be usable in 64-bit mode even on a 64-bit cpu. :-(
|
Yes, that's a real issue (many of the HPET implementations are 32-bit
only). You just have to live with wrapping timers.
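One common way to live with a wrapping 32-bit timer, sketched here in C, is to take differences in unsigned arithmetic and accumulate them into a wider software count. `ticks_elapsed`, `wide_clk` and `wide_update` are illustrative names, not part of any HPET interface, and the scheme assumes the counter is sampled at least once per wrap period.

```c
#include <stdint.h>

/* Ticks elapsed between two samples of a free-running 32-bit counter.
   Unsigned subtraction is modulo 2^32, so the result is correct across a
   single wrap, provided fewer than 2^32 ticks separate the samples. */
static uint32_t ticks_elapsed(uint32_t earlier, uint32_t later)
{
    return later - earlier;
}

/* Software extension to 64 bits: running total plus the last raw sample. */
struct wide_clk { uint64_t total; uint32_t last; };

/* Fold a new raw sample into the 64-bit total and return the total. */
static uint64_t wide_update(struct wide_clk *w, uint32_t sample)
{
    w->total += ticks_elapsed(w->last, sample);
    w->last = sample;
    return w->total;
}
```

For example, samples of `0xFFFFFFF0` and then `0x00000010` are 0x20 ticks apart even though the raw value went down.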
-Andi |
|
|
 |
Paul Repacholi
Guest
|
Posted:
Thu Jul 14, 2005 10:00 pm Post subject:
Re: stalling the TSC? |
|
|
rpw3@rpw3.org (Rob Warnock) writes:
| Quote: | Nick Maclaren <nmm1@cus.cam.ac.uk> wrote:
+---------------
| I like femptoseconds - clearly a short interval with no content.
+---------------
Myself, I like units of Planck time[1], approximately 5.391e-44 seconds,
or roughly 1.855e+28 Planck ticks per femptosecond. ;-} ;-}
Since "the estimated age of the Universe (4.3e17 s) is 8.0e60 Planck
times"[1] (just under 203 bits), an absolute universal clock that
kept Planck time could easily fit in a 256-bit counter, thus putting
paid to all of those pesky Y[0-9]* bugs (and especially Y2038!) once
and for all!!
|
I put that idea up to Dave Mills for a future version of NTP. He
didn't swear at me, just.
And just ONE set of date/time routines. Hah!
--
Paul Repacholi 1 Crescent Rd.,
+61 (08) 9257-1001 Kalamunda.
West Australia 6076
comp.os.vms,- The Older, Grumpier Slashdot
Raw, Cooked or Well-done, it's all half baked.
EPIC, The Architecture of the future, always has been, always will be. |
|
|
 |
Peter Dickerson
Guest
|
Posted:
Thu Jul 14, 2005 11:11 pm Post subject:
Re: stalling the TSC? |
|
|
<prep@prep.synonet.com> wrote in message
news:8764vdjpb7.fsf@prep.synonet.com...
| Quote: | rpw3@rpw3.org (Rob Warnock) writes:
Nick Maclaren <nmm1@cus.cam.ac.uk> wrote:
+---------------
| I like femptoseconds - clearly a short interval with no content.
+---------------
Myself, I like units of Planck time[1], approximately 5.391e-44 seconds,
or roughly 1.855e+28 Planck ticks per femptosecond. ;-} ;-}
Since "the estimated age of the Universe (4.3e17 s) is 8.0e60 Planck
times"[1] (just under 203 bits), an absolute universal clock that
kept Planck time could easily fit in a 256-bit counter, thus putting
paid to all of those pesky Y[0-9]* bugs (and especially Y2038!) once
and for all!!
|
There is no evidence that time is quantized in units of the Planck time. The
equivalent Planck mass is much larger than the mass of all known sub-atomic
particles. So just because the Planck time is very small doesn't mean it
will be small enough for all uses.
| Quote: | I put that idea up to Dave Mills for a future version of NTP. He
didn't swear at me, just.
And just ONE set of date/time routines. Hah!
--
Paul Repacholi 1 Crescent Rd.,
+61 (08) 9257-1001 Kalamunda.
West Australia 6076
comp.os.vms,- The Older, Grumpier Slashdot
Raw, Cooked or Well-done, it's all half baked.
EPIC, The Architecture of the future, always has been, always will be.
|
Peter |
|
|
 |
Terje Mathisen
Guest
|
Posted:
Thu Jul 14, 2005 11:57 pm Post subject:
Re: stalling the TSC? |
|
|
Rob Warnock wrote:
| Quote: | Terje Mathisen <terje.mathisen@hda.hydro.com> wrote:
+---------------
| My personal favourite (NOT!) was the fact that you cannot latch the
| counter to read a consistent 64-bit value on a 32-bit CPU...
+---------------
Well, there's a standard workaround for that one that goes back to
the ancient days of clock chips on 8-bit busses, which is to poll
the MSW, the LSW, and the MSW again (in that order) until the two
MSW values are the same, e.g., in sloppy C:
|
Sure, I didn't mention this particular workaround simply because it has
been quoted so often on c.arch that I considered it obvious.
Yes, that works.
Yes, it is the only real way to handle such problems.
No, it is not good, since it leads to non-deterministic counter read
latencies. :-(
| Quote: | +---------------
| and there is no guarantee that it will be usable in 64-bit mode
| even on a 64-bit cpu. :-(
+---------------
Now *that's* a serious problem!!
|
What it means is that even on a 64-bit cpu you might be forced to use
32-bit code to sample the counter, with the corresponding need to verify
that it stayed stable during the two half read operations. :-(
Terje
--
- <Terje.Mathisen@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching" |
|
|
 |
Terje Mathisen
Guest
|
Posted:
Fri Jul 15, 2005 12:26 am Post subject:
Re: stalling the TSC? |
|
|
prep@prep.synonet.com wrote:
| Quote: | rpw3@rpw3.org (Rob Warnock) writes:
Myself, I like units of Planck time[1], approximately 5.391e-44 seconds,
or roughly 1.855e+28 Planck ticks per femptosecond. ;-} ;-}
Since "the estimated age of the Universe (4.3e17 s) is 8.0e60 Planck
times"[1] (just under 203 bits), an absolute universal clock that
kept Planck time could easily fit in a 256-bit counter, thus putting
paid to all of those pesky Y[0-9]* bugs (and especially Y2038!) once
and for all!!
I put that idea up to Dave Mills for a future version of NTP. He
didn't swear at me, just.
|
The initial (and still current) NTP packet format uses 32:32 bit
fixed-point counters, which gives ~ 1/4 nanosecond resolution. It will
be a few more years before _any_ on-wire protocol requires better
resolution than this.
Even Gbit Ethernet needs 8 ns per byte, so getting network
timestamps much better than this will be _hard_.
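To make the 32:32 figure concrete, here is a small sketch in C. It assumes only the layout described above (whole seconds in the top 32 bits, binary fraction in the bottom 32); the function names are invented for the example.

```c
#include <stdint.h>

/* Whole seconds held in a 32:32 fixed-point timestamp. */
static uint32_t ts_seconds(uint64_t ts)
{
    return (uint32_t)(ts >> 32);
}

/* Fractional part converted to nanoseconds, rounded down.
   frac * 10^9 is below 2^62, so the product fits in 64 bits. */
static uint32_t ts_frac_to_ns(uint64_t ts)
{
    uint64_t frac = ts & UINT64_C(0xFFFFFFFF);
    return (uint32_t)((frac * UINT64_C(1000000000)) >> 32);
}

/* Size of one least-significant bit, 2^-32 s, in whole picoseconds. */
static uint32_t ts_lsb_picoseconds(void)
{
    return (uint32_t)(UINT64_C(1000000000000) >> 32);
}
```

One step of the fraction field is about 233 ps, which is where the "~ 1/4 nanosecond" comes from.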
Anyway, the crux is that the network protocol really doesn't need to
transfer a lot of surplus information in every packet. Quadrupling all
four/five timestamps would add 120 bytes of mostly worthless overhead.
If you really want much better than ns resolution, then there's an
obvious way to do it:
1) On the initial request, set the protocol level field to 5 (i.e. more
than the current 4), and fill in the last N (probably 10-16) bits of the
last timestamp (i.e. normally used for the final reception back to the
requestor) with the extra precision bits.
NOTE! N must either be a fixed value, or some other field, like the top
32 bits of the same timestamp must be used to specify the shift amount
to use.
If we allow the shift count to be negative, then we can use this feature
also for the first connection startup sequence, where we might not have
any other method to determine which epoch we're in: Setting it to -12
would still allow microsecond resolution while extending the range from
136 to 557474 years. (In effect this turns the timestamps into a
hybrid/block fp format: Lower resolution for initial connections,
_really_ high res later on.)
2) The server, upon reception of the request, will always fill in the
reference timestamp in the current format.
3) If the protocol level is >= 5 and the reference timestamp is within
2^(30-N) seconds of the incoming originating timestamp, and the
additional timestamp field is filled in, then the highres timestamp
feature can (and will) be used:
3a) All timestamps will be shifted left by N bits, and the
original timestamp will have the extra bits appended.
4) When the packet returns to the client, it verifies that the original
timestamp has been left-shifted and re-assembled; if so, it can use the
four 64-bit timestamps in 32-N:32+N mode.
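A rough sketch in C of one possible reading of step 3a, nothing more: the 64-bit timestamp is shifted left by N and the N extra precision bits go into the vacated low-order positions. `to_highres` and `highres_seconds` are hypothetical helpers, N is assumed to satisfy 0 < N < 32, and the top N bits of the seconds field are simply dropped.

```c
#include <stdint.h>

/* Hypothetical helper: widen a 32:32 timestamp to (32-n):(32+n) by
   shifting left n bits and appending n extra low-order precision bits.
   Assumes 0 < n < 32; the top n bits of the seconds field are lost. */
static uint64_t to_highres(uint64_t ts, unsigned n, uint32_t extra)
{
    uint32_t mask = (UINT32_C(1) << n) - 1u;   /* keep only n bits of extra */
    return (ts << n) | (extra & mask);
}

/* Whole seconds, modulo 2^(32-n), held in a high-resolution timestamp. */
static uint32_t highres_seconds(uint64_t hr, unsigned n)
{
    return (uint32_t)(hr >> (32 + n));
}
```

With N = 12 the seconds field shrinks to 20 bits while the fraction grows to 44 bits, which is why the reference timestamp has to be close to the originating one before this mode can be used.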
| Quote: |
And just ONE set of date/time routines. Hah!
|
If you believe that, then I've got this nice bridge for sale...
Terje
--
- <Terje.Mathisen@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching" |
|
|
 |
Colin Andrew Percival
Guest
|
Posted:
Fri Jul 15, 2005 12:07 pm Post subject:
Re: stalling the TSC? |
|
|
Terje Mathisen <terje.mathisen@hda.hydro.com> wrote:
| Quote: | Rob Warnock wrote:
[...] poll
the MSW, the LSW, and the MSW again (in that order) until the two
MSW values are the same [...]
Sure, I didn't mention this particular workaround simply because it has
been quoted so often on c.arch that I considered it obvious.
Yes, that works.
Yes, it is the only real way to handle such problems.
No, it is not good, since it leads to non-deterministic counter read
latencies. :-(
|
Only if it is possible for reading the counters twice to take more than
2^32 cycles (in which case you've got larger problems to worry about).
Assuming that the least significant 32-bit word doesn't repeat itself
within this process, the sequence (read MSW, read LSW, read MSW, read
LSW) allows you to determine the value of the complete 64-bit counter at
the time of the last read.
Colin Percival |
|
|
 |
Joe Seigh
Guest
|
Posted:
Fri Jul 15, 2005 4:15 pm Post subject:
Re: stalling the TSC? |
|
|
Colin Andrew Percival wrote:
| Quote: | Terje Mathisen <terje.mathisen@hda.hydro.com> wrote:
Rob Warnock wrote:
[...] poll
the MSW, the LSW, and the MSW again (in that order) until the two
MSW values are the same [...]
Sure, I didn't mention this particular workaround simply because it has
been quoted so often on c.arch that I considered it obvious.
Yes, that works.
Yes, it is the only real way to handle such problems.
No, it is not good, since it leads to non-deterministic counter read
latencies. :-(
Only if it is possible for reading the counters twice to take more than
2^32 cycles (in which case you've got larger problems to worry about).
Assuming that the least significant 32-bit word doesn't repeat itself
within this process, the sequence (read MSW, read LSW, read MSW, read
LSW) allows you to determine the value of the complete 64-bit counter at
the time of the last read.
|
One too many reads. You want Lamport's algorithm, which only requires
the read MSW, read LSW, read MSW.
If you have control of the hardware implementation you could do a 63 bit
counter in 64 bits which can be atomically read with 2 32 bit reads. I
posted the logic in the x86 asm newsgroup a while back.
--
Joe Seigh
When you get lemons, you make lemonade.
When you get hardware, you make software. |
|
|
 |
Terje Mathisen
Guest
|
Posted:
Fri Jul 15, 2005 7:04 pm Post subject:
Re: stalling the TSC? |
|
|
Colin Andrew Percival wrote:
| Quote: | Terje Mathisen <terje.mathisen@hda.hydro.com> wrote:
No, it is not good, since it leads to non-deterministic counter read
latencies. :-(
Only if it is possible for reading the counters twice to take more than
2^32 cycles (in which case you've got larger problems to worry about).
Assuming that the least significant 32-bit word doesn't repeat itself
within this process, the sequence (read MSW, read LSW, read MSW, read
LSW) allows you to determine the value of the complete 64-bit counter at
the time of the last read.
|
OK, you _can_ use conditional moves to generate the final count without
using a variable number of cycles and without looping or branching:
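    msw0 = readmsw(); lsw0 = readlsw();
    msw1 = readmsw(); lsw1 = readlsw();
    if (msw1 != msw0) {     /* MSW rolled over between the two pairs */
        msw0 = msw1;        /* so only the second pair is known good */
        lsw0 = lsw1;
    }
    return ((uint64_t)msw0 << 32) | lsw0;   /* widen before shifting */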
becomes
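    mov esi,[COUNTER_BASE_ADDRESS]
    mov edx,[esi+MSW_OFFSET]    ; msw0
    mov eax,[esi+LSW_OFFSET]    ; lsw0
    mov ecx,[esi+MSW_OFFSET]    ; msw1
    mov ebx,[esi+LSW_OFFSET]    ; lsw1
    cmp edx,ecx                 ; did the MSW change?
    cmovne eax,ebx              ; if so, take the second pair
    cmovne edx,ecx              ; result in edx:eax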
Terje
--
- <Terje.Mathisen@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching" |
|
|
 |
Alex Colvin
Guest
|
Posted:
Fri Jul 15, 2005 9:32 pm Post subject:
Re: stalling the TSC? |
|
|
| Quote: | Planck time could easily fit in a 256-bit counter, thus putting paid to
all of those pesky Y[0-9]* bugs (and especially Y2038!) once and for all!!
But not the year 10^28 bug! We really should have learnt that it
is necessary to think ahead ....
|
watch out for proton decay.
--
mac the naïf |
|
|
 |
Alex Colvin
Guest
|
Posted:
Fri Jul 15, 2005 9:34 pm Post subject:
Re: stalling the TSC? |
|
|
| Quote: | Well, there's a standard workaround for that one that goes back to
the ancient days of clock chips on 8-bit busses, which is to poll
the MSW, the LSW, and the MSW again (in that order) until the two
MSW values are the same, e.g., in sloppy C:
No, it is not good, since it leads to non-deterministic counter read
latencies. :-(
|
You should be able to pull it off with three reads, returning a time
that actually occurred sometime during the reads. If the MSW wrapped, the
LSW passed through 0.
If you can be preempted long enough for the LSW to wrap, any LSW will do.
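A sketch in C of the three-read idea, again against a simulated counter. The `sim_*` names are hypothetical stand-ins for the hardware accessors, and the fake counter advances by `step` ticks on every read.

```c
#include <stdint.h>

/* Hypothetical stand-in for the hardware: a 64-bit counter that advances
   by `step` ticks each time either half is read. */
struct sim_clk { uint64_t now; uint64_t step; };

static uint32_t sim_read_upper(struct sim_clk *c)
{
    c->now += c->step;
    return (uint32_t)(c->now >> 32);
}

static uint32_t sim_read_lower(struct sim_clk *c)
{
    c->now += c->step;
    return (uint32_t)c->now;
}

/* MSW, LSW, MSW with no retry. If the MSW changed, the counter must have
   passed through (msw1, 0) somewhere between the first and last read,
   so that value is returned instead of the sampled LSW. */
static uint64_t read_three(struct sim_clk *c)
{
    uint32_t msw0 = sim_read_upper(c);
    uint32_t lsw  = sim_read_lower(c);
    uint32_t msw1 = sim_read_upper(c);
    if (msw1 != msw0)
        lsw = 0;
    return ((uint64_t)msw1 << 32) | lsw;
}
```

The read count is fixed at three, at the price of returning an LSW of exactly zero whenever a rollover lands inside the sequence.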
Watch out for proton decay, though.
--
mac the naïf |
|
|
 |
Terje Mathisen
Guest
|
Posted:
Sat Jul 16, 2005 12:15 am Post subject:
Re: stalling the TSC? |
|
|
Alex Colvin wrote:
| Quote: | Well, there's a standard workaround for that one that goes back to
the ancient days of clock chips on 8-bit busses, which is to poll
the MSW, the LSW, and the MSW again (in that order) until the two
MSW values are the same, e.g., in sloppy C:
No, it is not good, since it leads to non-deterministic counter read
latencies. :-(
You should be able to pull it off with three reads, returning a time
that actually occurred sometime during the reads. If the MSW wrapped, the
LSW passed through 0.
|
This works, but I'd rather use a second LSW value, otherwise you'll get
too many LSW == 0 samples.
| Quote: |
If you can be preempted long enough for the LSW to wrap, any LSW will do.
Watch out for proton decay, though.
|
:-)
Terje
--
- <Terje.Mathisen@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching" |
|
|
 |