John Doe
Guest
Posted: Tue Aug 02, 2005 11:57 pm
Post subject: Idle Thread And Processor Stalls
I am a PhD student in CS. One of the things I am trying to do in my
experiments is measure the percentage of time that the processor stalls
waiting for memory or disk accesses to complete in a program. (I'm using
this info in some algorithms I'm developing in my thesis.) I have some
doubts about the accuracy of a specific metric for measuring this and was
hoping to get some feedback.
I'm running Linux 2.6.12 on a Dell Centrino Laptop (with Pentium M). The
processor has a performance counter event called CLK_CYCLES_UNHALTED, which
allows you to measure the number of cycles that the processor spends in the
unhalted state. Whenever Linux enters the idle thread it automatically halts
the processor (through the hlt instruction) and you can see the effect
through this event.
I am exploiting this event to detect the fraction of time that the processor
is idle for any reason (memory stalls etc.). Specifically, for a program, I
am computing
1 - (unhalted cycles / total cycles)
to get the fraction of time that the CPU is halted. I assume that this is
roughly equal to the fraction of time that the CPU is idle during the
program, since Linux halts the processor in the idle thread.
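For concreteness, the metric above can be written as a small helper. This is only a sketch: the name `halted_fraction` and its arguments are made up for illustration, and the two counts are assumed to have been read from the performance counters over the same interval.

```python
def halted_fraction(unhalted_cycles: int, total_cycles: int) -> float:
    """Return 1 - (unhalted cycles / total cycles).

    Both arguments are hypothetical inputs: cycle counts sampled over the
    same measurement interval.
    """
    if total_cycles <= 0:
        raise ValueError("total_cycles must be positive")
    if not 0 <= unhalted_cycles <= total_cycles:
        raise ValueError("unhalted_cycles must lie between 0 and total_cycles")
    return 1.0 - unhalted_cycles / total_cycles


if __name__ == "__main__":
    # Example with made-up numbers: 25 unhalted cycles out of 100 total.
    print(halted_fraction(25, 100))
```

Note that this only reports time spent after a HLT was actually issued; it says nothing about stalls that occur while the processor remains unhalted.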
To test this metric, I wrote a memory bound program that traverses a large
array. I find that for this program the unhalted cycles is very small
compared to the total cycles, indicating the cpu spends most of its time in
the halt state. When I instead run a more CPU-intensive program, I find
that the unhalted cycles is closer to the total cycles.
But the problem I see is this. When the processor becomes idle there must be
some latency before Linux switches over to the idle thread and issues the
halt instruction.
If the processor were to stall in the middle of the current thread, waiting
for a memory or disk access to complete, how long would it take before the
OS issues the halt instruction? It seems to me that if it takes too long,
then my metric above won't be able to detect very small memory stalls.

Guest
Posted: Wed Aug 03, 2005 12:15 am
Post subject: Re: Idle Thread And Processor Stalls
John Doe> I am a PhD student in CS.
The Pentium M, like most CPUs, does not perform a context switch on a
cache miss to main memory. This is different from the Pentium 4 with
SMT (aka Hyper-Threading) enabled, where the other hardware thread can
keep executing while one thread waits on a cache miss, including an L2
miss out to main memory.
With no context switch, the Linux kernel will not get a chance to HLT
the processor on a main memory stall. Memory stalls are completely
different from disk stalls. A disk stall will often be triggered by
a page fault or system call, and of course the Linux kernel has control
to do a context switch after scheduling the I/O.
Surely the lack of a context switch on a main memory stall came up during
your experiment design, when you reviewed it with your advisor?
JD> To test this metric, I wrote a memory bound program that traverses
JD> a large array. I find that for this program the unhalted cycles is
JD> very small compared to the total cycles, indicating the cpu spends
JD> most of its time in the halt state.
Maybe it gets into that state by some means other than the HLT
instruction. Perhaps the state being measured is not what you think
it is. Or perhaps the "memory bound" program is actually swapping,
and so is disk bound rather than memory bound.
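One rough way to test the swapping hypothesis is to count major page faults (faults that needed disk I/O) across the array traversal. The sketch below uses Python's `resource.getrusage`; the helper names and the stand-in workload `traverse` are hypothetical, and it assumes a Unix system where `ru_majflt` is maintained.

```python
import resource


def major_faults() -> int:
    """Major page faults charged to this process so far."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_majflt


def major_faults_during(work) -> int:
    """Run work() and return how many major faults occurred meanwhile."""
    before = major_faults()
    work()
    return major_faults() - before


def traverse(n_bytes: int = 1 << 20, stride: int = 4096) -> int:
    """Stand-in workload: touch one byte per stride across a buffer."""
    buf = bytearray(n_bytes)
    touched = 0
    for i in range(0, n_bytes, stride):
        buf[i] = 1
        touched += buf[i]
    return touched


if __name__ == "__main__":
    print("major faults during traversal:", major_faults_during(traverse))
```

A count near zero suggests the run is not paging from disk; a large count would point to a disk-bound run, in which case the kernel does get the chance to schedule the idle thread.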
JD> But the problem I see is this. When the processor becomes idle
JD> there must be some latency before Linux switches over to the idle
JD> thread and issues the halt instruction.
This is a pretty subtle point you are making, on a really wrong
foundation.