| Author |
Message |
Terje Mathisen
Guest
|
Posted:
Sat Sep 10, 2005 3:22 pm Post subject:
Re: interrupting for overflow and loop termination |
|
|
andrewspencers@yahoo.com wrote:
| Quote: | My argument is essentially this:
Interrupt-triggered events are better than polling-triggered events,
assuming that the cost of setting up the interrupt is less than the
total cost of all the polling repetitions.
|
The cost of setting up _and_ the cost of actually taking it!
| Quote: | Compare-and-branch inside a loop is a poll of the loop-exiting
condition.
Thus for a high-repetition loop, an interrupt-triggered exit is better
than a compare-and-branch exit.
|
It is _not_ better:
a) If the loop count is high, the loop branch will always be correctly
predicted, except for the loop exit, right?
b) If the loop count is stored in a fixed register, like ECX for an x86
LOOP instruction, or otherwise designated, then the number of branch
misses should be zero instead of one.
c) The loop count register is often used inside the loop anyway, so you
must use alu resources to update it.
d) How do you propose to generate loops unless you have some kind of (in
your case unconditional) branch at the bottom?
e) Modern cpus are very seldom integer alu resource constrained, so even
when you have no other need for the loop count register, the actual
update is effectively free.
OTOH, ints _can_ be a win in other situations, specifically when they
are used to guard against/handle really exceptional situations, where
the extra overhead of a taken interrupts can be amortized over many runs
when nothing happens.
Noting my argument (e), the overhead of inserting INTO opcodes at
relevant sites in the code can be close to zero, since the cpu can treat
it like a "strongly predicted to fall through" branch, i.e. it doesn't
even need to take up any branch predictor buffer space.
Terje
--
- <Terje.Mathisen@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching" |
|
| Back to top |
|
 |
Guest
|
Posted:
Sun Sep 11, 2005 7:10 am Post subject:
Re: interrupting for overflow and loop termination |
|
|
Terje Mathisen wrote:
| Quote: | a) If the loop count is high, the loop branch will always be correctly
predicted, except for the loop exit, right?
Right. |
| Quote: | c) The loop count register is often used inside the loop anyway, so you
must use alu resources to update it.
Yes, but I wasn't arguing against using a loop count register or |
against updating it inside the loop; I was arguing against including
instructions inside the loop which conditionally branch based on the
value in the register. I wanted the processor to watch the register and
trigger an interrupt when it reached some target value, the same way
that the processor watches e.g. the program counter and triggers an
interrupt when it matches the value in a breakpoint register, with zero
impact on program execution speed until the interrupt is triggered.
| Quote: | d) How do you propose to generate loops unless you have some kind of (in
your case unconditional) branch at the bottom?
Yes, I proposed using an unconditional branch at the bottom. |
| Quote: | OTOH, ints _can_ be a win in other situations, specifically when they
are used to guard against/handle really exceptional situations, where
the extra overhead of a taken interrupts can be amortized over many runs
when nothing happens.
For example, the really exceptional situation of exiting a |
high-repetition loop, where the extra overhead of the interrupt can be
amortized over the many repetitions when the interrupt isn't triggered?
I.e. isn't what you wrote here actually an argument in favor of my
proposal?
But actually now I agree that your argument is correct that my proposal
is useless for normal loops, but can be useful for other situations.
But I think you mischaracterized those other situations; ints aren't
useful just because they're handling really exceptional situations, but
because they're handling _a large amount_ of exceptional situations
inside the loop. Specifically, exiting a high-repeition loop is an
exceptional situation, but the cache, branch prediction, and
speculative execution features combine to make a single test-branch (or
possibly even several of them) inside the loop effectively free, so my
interrupt proposal wouldn't gain anything. But if there are enough
test-branches inside the loop (especially if there are a bunch of inner
loops inside the loop), then the speculative execution circuitry and
possibly even the cache can get overloaded, such that test-branch
instructions are no longer free. In _this_ case, my interrupt proposal
would be useful.
And one such case is a loop (possibly a big loop with a lot of inner
loops) with a large number of integer operations each of which must
test for and handle overflows, which is generally the case with bignum
(infinite precision integer) arithmetic, which is the particular case
which prompted me to start this thread in the first place.
Another case would be a linear search for which the target string is
generally very long and there are a lot of partial matches within the
search space.
Another case, I think (but I'm too tired right now to think clearly
enough to be sure), would be a relational join algorithm.
Is my analysis accurate?
| Quote: | Noting my argument (e), the overhead of inserting INTO opcodes at
relevant sites in the code can be close to zero, since the cpu can treat
it like a "strongly predicted to fall through" branch, i.e. it doesn't
even need to take up any branch predictor buffer space.
But I'm not proposing using INTO (a soft interrupt); I'm proposing |
using a hard interrupt, which works the same way that breakpointing
works. |
|
| Back to top |
|
 |
David Hopwood
Guest
|
Posted:
Sun Sep 11, 2005 9:39 pm Post subject:
Re: interrupting for overflow and loop termination |
|
|
andrewspencers@yahoo.com wrote:
| Quote: | But actually now I agree that your argument is correct that my proposal
is useless for normal loops, but can be useful for other situations.
But I think you mischaracterized those other situations; ints aren't
useful just because they're handling really exceptional situations, but
because they're handling _a large amount_ of exceptional situations
inside the loop. Specifically, exiting a high-repetition loop is an
exceptional situation, but the cache, branch prediction, and
speculative execution features combine to make a single test-branch (or
possibly even several of them) inside the loop effectively free, so my
interrupt proposal wouldn't gain anything. But if there are enough
test-branches inside the loop (especially if there are a bunch of inner
loops inside the loop), then the speculative execution circuitry and
possibly even the cache can get overloaded, such that test-branch
instructions are no longer free. In _this_ case, my interrupt proposal
would be useful.
|
If the speculative execution circuitry is "overloaded", then the same is
likely to be true of the interrupt checking under the same circumstances.
Checking for interrupts (which then need to be delivered precisely) is
not free; in terms of complexity it is similar to speculative execution.
That is, you strongly speculate that the interrupt is not taken on each
instruction, and have to recover if it is. The difference is that you need
dedicated hardware resources to check the interrupt conditions on *every*
instruction, rather than using shared resources to do it only when the
condition is relevant.
(Also, although this would be a fairly minor issue: interrupts used for
this purpose add to the state that has to be saved and restored on context
switches.)
--
David Hopwood <david.nospam.hopwood@blueyonder.co.uk> |
|
| Back to top |
|
 |
Guest
|
Posted:
Mon Sep 12, 2005 12:15 am Post subject:
Re: interrupting for overflow and loop termination |
|
|
David Hopwood wrote:
| Quote: | If the speculative execution circuitry is "overloaded", then the same is
likely to be true of the interrupt checking under the same circumstances.
Checking for interrupts (which then need to be delivered precisely) is
not free; in terms of complexity it is similar to speculative execution.
That is, you strongly speculate that the interrupt is not taken on each
instruction, and have to recover if it is. The difference is that you need
dedicated hardware resources to check the interrupt conditions on *every*
instruction, rather than using shared resources to do it only when the
condition is relevant.
Yes, that makes sense. But then in that case, when you wrote in your |
prior message:
| Quote: | OTOH, ints _can_ be a win in other situations, specifically when they
are used to guard against/handle really exceptional situations, where
the extra overhead of a taken interrupts can be amortized over many runs
when nothing happens.
Which such situations did you have in mind? |
|
|
| Back to top |
|
 |
David Hopwood
Guest
|
Posted:
Mon Sep 12, 2005 12:15 am Post subject:
Re: interrupting for overflow and loop termination |
|
|
andrewspencers@yahoo.com wrote:
| Quote: | David Hopwood wrote:
If the speculative execution circuitry is "overloaded", then the same is
likely to be true of the interrupt checking under the same circumstances.
Checking for interrupts (which then need to be delivered precisely) is
not free; in terms of complexity it is similar to speculative execution.
That is, you strongly speculate that the interrupt is not taken on each
instruction, and have to recover if it is. The difference is that you need
dedicated hardware resources to check the interrupt conditions on *every*
instruction, rather than using shared resources to do it only when the
condition is relevant.
Yes, that makes sense. But then in that case, when you wrote in your
prior message:
OTOH, ints _can_ be a win in other situations, specifically when they
are used to guard against/handle really exceptional situations, where
the extra overhead of a taken interrupts can be amortized over many runs
when nothing happens.
Which such situations did you have in mind?
|
I didn't write that; Terje Mathisen did.
--
David Hopwood <david.nospam.hopwood@blueyonder.co.uk> |
|
| Back to top |
|
 |
David Hopwood
Guest
|
Posted:
Mon Sep 12, 2005 12:15 am Post subject:
Re: interrupting for overflow and loop termination |
|
|
David Hopwood wrote:
| Quote: | andrewspencers@yahoo.com wrote:
[...] when you wrote in your prior message:
OTOH, ints _can_ be a win in other situations, specifically when they
are used to guard against/handle really exceptional situations, where
the extra overhead of a taken interrupts can be amortized over many runs
when nothing happens.
Which such situations did you have in mind?
I didn't write that; Terje Mathisen did.
|
But to answer the question, some examples of where this approach can be
effective are using unmapped guard pages to detect stack overflows, exhaustion
of an allocation space, or null pointer accesses in languages that require
null pointer checks.
Note that these are cases in which the relevant operation (allocating a stack
frame or object, indirecting through a pointer) is extremely common, and the
interrupt-based optimization is in "infrastructure" code so that a single
implementation can apply to many programs. Even then it may not pay off: hacks
like this should be reevaluated every so often to see whether they are really
giving a performance improvement that justifies their complexity.
--
David Hopwood <david.nospam.hopwood@blueyonder.co.uk> |
|
| Back to top |
|
 |
glen herrmannsfeldt
Guest
|
Posted:
Mon Sep 12, 2005 8:15 am Post subject:
Re: interrupting for overflow and loop termination |
|
|
andrewspencers@yahoo.com wrote:
(snip)
| Quote: | Yes, but I wasn't arguing against using a loop count register or
against updating it inside the loop; I was arguing against including
instructions inside the loop which conditionally branch based on the
value in the register. I wanted the processor to watch the register and
trigger an interrupt when it reached some target value, the same way
that the processor watches e.g. the program counter and triggers an
interrupt when it matches the value in a breakpoint register, with zero
impact on program execution speed until the interrupt is triggered.
|
Why would it have zero impact?
Note that S/370 PER, which has breakpoints triggered by an
instruction within a range of addresses, clearly states that there
may be a performance penalty. The logic is still there, instruction
or not, and still takes time.
-- glen |
|
| Back to top |
|
 |
glen herrmannsfeldt
Guest
|
Posted:
Mon Sep 12, 2005 8:15 am Post subject:
Re: interrupting for overflow and loop termination |
|
|
Terje Mathisen wrote:
(snip)
| Quote: | It is _not_ better:
a) If the loop count is high, the loop branch will always be correctly
predicted, except for the loop exit, right?
b) If the loop count is stored in a fixed register, like ECX for an x86
LOOP instruction, or otherwise designated, then the number of branch
misses should be zero instead of one.
|
I wonder if any change the prediction based on ECX? It would be nice
to know maybe one loop cycle in advance, depending on prefetch buffer
size and loop size.
-- glen |
|
| Back to top |
|
 |
glen herrmannsfeldt
Guest
|
Posted:
Mon Sep 12, 2005 8:15 am Post subject:
Re: interrupting for overflow and loop termination |
|
|
Nick Maclaren wrote:
(snip)
| Quote: | The point is that sloppy programmers often use signed arithmetic
when they should be using unsigned. On a twos complement machine,
ignoring overflow on signed arithmetic is almost equivalent to
using unsigned arithmetic, so they get away with it. Until someone
switches on overflow trapping ....
|
I remember when I was learning PDP-10 assembler, and considering
translating a S/360 multiple precision arithmetic package, asking
someone how to do unsigned arithmetic. This was the first machine
I knew of with the negative, zero, overflow, carry, flags.
S/360 uses the condition codes, with four possible values aren't
enough, so both signed and unsigned (logical) add and subtract
are used. Also, add logical won't generate an overflow interrupt.
-- glen |
|
| Back to top |
|
 |
Guest
|
Posted:
Mon Sep 12, 2005 8:15 am Post subject:
Re: interrupting for overflow and loop termination |
|
|
David Hopwood wrote:
| Quote: | I didn't write that; Terje Mathisen did.
Sorry about that! |
I would put on my dunce cap, but I seem to have lost it... |
|
| Back to top |
|
 |
glen herrmannsfeldt
Guest
|
Posted:
Mon Sep 12, 2005 8:15 am Post subject:
Re: interrupting for overflow and loop termination |
|
|
David Hopwood wrote:
(snip regarding interrupt loop termination)
| Quote: | If the speculative execution circuitry is "overloaded", then the same is
likely to be true of the interrupt checking under the same circumstances.
Checking for interrupts (which then need to be delivered precisely) is
not free; in terms of complexity it is similar to speculative execution.
That is, you strongly speculate that the interrupt is not taken on each
instruction, and have to recover if it is. The difference is that you need
dedicated hardware resources to check the interrupt conditions on *every*
instruction, rather than using shared resources to do it only when the
condition is relevant.
|
Well, how about an imprecise interrupt, where possibly one or more
additional loop cycles will be executed. At interrupt time the program
has to correct for the additional cycles. That might make it faster,
but then it could also be done with a normal loop instruction.
-- glen |
|
| Back to top |
|
 |
glen herrmannsfeldt
Guest
|
Posted:
Mon Sep 12, 2005 8:15 am Post subject:
Re: interrupting for overflow and loop termination |
|
|
David Hopwood wrote:
| Quote: | But to answer the question, some examples of where this approach can be
effective are using unmapped guard pages to detect stack overflows,
exhaustion of an allocation space, or null pointer accesses in
languages that require null pointer checks.
Note that these are cases in which the relevant operation
(allocating a stack frame or object, indirecting through a pointer)
is extremely common, and the interrupt-based optimization is in
"infrastructure" code so that a single implementation can apply
to many programs. Even then it may not pay off: hacks
like this should be reevaluated every so often to see
whether they are really giving a performance improvement
that justifies their complexity.
|
They should, but a protected mode system must check for invalid
addresses, and a virtual memory system must check for a valid
page table entry.
-- glen |
|
| Back to top |
|
 |
chl
Guest
|
Posted:
Mon Sep 12, 2005 8:15 am Post subject:
Re: interrupting for overflow and loop termination |
|
|
<andrewspencers@yahoo.com> wrote in message news:1126320161.417289.140570@z14g2000cwz.googlegroups.com...
| Quote: | Thus for a high-repetition loop, an interrupt-triggered exit is better
than a compare-and-branch exit.
|
In your scenario, what is going to happen when an "interrupt exited" loop
wants to call another "interrupt exited" loop?
> |
|
| Back to top |
|
 |
Terje Mathisen
Guest
|
Posted:
Mon Sep 12, 2005 8:15 am Post subject:
Re: interrupting for overflow and loop termination |
|
|
andrewspencers@yahoo.com wrote:
| Quote: | Terje Mathisen wrote:
d) How do you propose to generate loops unless you have some kind of (in
your case unconditional) branch at the bottom?
Yes, I proposed using an unconditional branch at the bottom.
|
This costs _exactly_ as much as a conditional jump, assuming both are
correctly predicted to be taken. (No, it isn't true that unconditional
branches are free.)
The exit comparison logic has to be there anyway, either you run it
after/during every single clock cycle, or you just do it once, when the
CMP reg,exit_value opcodes turns up.
| Quote: | Another case would be a linear search for which the target string is
generally very long and there are a lot of partial matches within the
search space.
|
I have written code like that, and it is indeed possible to get into
situations where you have a _lot_ of exit tests. However, since these
are all different, a "silent/parallel/interrupt-based" version of it
wouldn't work at all, unless you could setup every single one of these
tests in a list of stuff to be monitored.
| Quote: | Another case, I think (but I'm too tired right now to think clearly
enough to be sure), would be a relational join algorithm.
Is my analysis accurate?
|
No.
| Quote: |
Noting my argument (e), the overhead of inserting INTO opcodes at
relevant sites in the code can be close to zero, since the cpu can treat
it like a "strongly predicted to fall through" branch, i.e. it doesn't
even need to take up any branch predictor buffer space.
But I'm not proposing using INTO (a soft interrupt); I'm proposing
using a hard interrupt, which works the same way that breakpointing
works.
|
If you had Nick's perfect world where user processes could register to
handle hw exceptions, then your ideas could make a little more sense,
but still only when you have a lot of different code locations that can
share the same logic.
I.e. stack overflow fixups is a good example, loop processing isn't.
Terje
--
- <Terje.Mathisen@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching" |
|
| Back to top |
|
 |
Guest
|
Posted:
Mon Sep 12, 2005 1:50 pm Post subject:
Re: interrupting for overflow and loop termination |
|
|
glen herrmannsfeldt wrote:
| Quote: | David Hopwood wrote:
But to answer the question, some examples of where this approach can be
effective are using unmapped guard pages to detect stack overflows,
exhaustion of an allocation space, or null pointer accesses in
languages that require null pointer checks.
Note that these are cases in which the relevant operation
(allocating a stack frame or object, indirecting through a pointer)
is extremely common, and the interrupt-based optimization is in
"infrastructure" code so that a single implementation can apply
to many programs. Even then it may not pay off: hacks
like this should be reevaluated every so often to see
whether they are really giving a performance improvement
that justifies their complexity.
They should, but a protected mode system must check for invalid
addresses, and a virtual memory system must check for a valid
page table entry.
I assumed that David was referring to a non-protected mode system, in |
which a program is responsible for checking itself.
True, in a protected mode system, with the OS responsible for checking
the program, interrupt-based checking is necessary in order to operate
the virtual memory system, but from the program's perspective, the
virtual memory system doesn't exist, and the associated interrupts are
invisible. In contemporary systems programs don't have arbitrary access
even to their own address spaces, but in principle they could (with
access attempts to unallocated pages simply resulting in the system
allocating them, so that the program sees its own entire address space
as preallocated memory), so that from the system's perspective, those
programs are incapable of making any mistakes. In that case, David's
remarks would apply even for a program running in a protected mode
system; the program would be responsible for checking its own null
pointers, stack overflows, internally-allocated buffer overruns, etc,
either by test-branch code or by some theoretical user-mode interrupt
system, since the OS wouldn't know or care if the program screwed
itself up. |
|
| Back to top |
|
 |
|
|
|
|