Frode Vatvedt Fjeld (Guest)
Posted: Fri Nov 26, 2004 4:06 pm
Post subject: Are cache-lines dirtied by writing the same value?

When programming a garbage collector, I find myself sequentially
scanning large blocks of memory while reading and updating many (if
not all) locations. However, often (roughly 95%) the updated value
will be identical to the old one. My question is whether I should
avoid the memory write instruction or not, taking the
compare-and-branch hit instead.
That is, each location is first read, presumably causing that
cache-line to be loaded. Then, when a value is written back to that
location, I presume the cache-line is marked dirty, eventually causing
some traffic on the memory bus when the cache-line is evicted and
written back to main memory. Can I assume that the cache sub-system is
clever enough to detect if a cached memory write has no real effect
(i.e. if the new value is the same as the old) and then not mark the
cache line dirty (and so not cause memory-bus traffic later on), or
would I be better advised to have my program compare the values and
skip the memory write so as to reduce pressure on the memory bus?
My platform is primarily recent x86 (32-bit) systems, although I'd be
interested also to hear about the range of 32-bit x86 systems from 386
and up. Paging etc. is not a concern here.
Thanks,
--
Frode Vatvedt Fjeld

Terje Mathisen (Guest)
Posted: Fri Nov 26, 2004 5:16 pm
Post subject: Re: Are cache-lines dirtied by writing the same value?

Frode Vatvedt Fjeld wrote:
| Quote: | When programming a garbage collector, I find myself sequentially
scanning large blocks of memory while reading and updating many (if
not all) locations. However, often (roughly 95%) the updated value
will be identical to the old one. My question is whether I should
avoid the memory write instruction or not, taking the
compare-and-branch hit instead.
That is, each location is first read, presumably causing that
cache-line to be loaded. Then, when a value is written back to that
location, I presume the cache-line is marked dirty, eventually causing
some traffic on the memory bus when the cache-line is evicted and
written back to main memory. Can I assume that the cache sub-system is
clever enough to detect if a cached memory write has no real effect
(i.e. if the new value is the same as the old) and then not mark the
cache line dirty (and so not cause memory-bus traffic later on), or
would I be better advised to have my program compare the values and
skip the memory write so as to reduce pressure on the memory bus?
|
Hei Frode!
If most of the values are identical, then by all means do a regular
    if (old_val != new_val)
        *mem = new_val;
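
As a minimal sketch of that test applied over a whole scan, the loop below stores only where the value actually changes and counts the stores it performs. The names `update_word` and `scan_skip_silent`, and the "replace `from` with `to`" update rule, are assumptions made for illustration; they stand in for whatever the collector really computes per word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the collector's per-word update:
   words equal to `from` become `to`, all others are unchanged. */
static uint32_t update_word(uint32_t w, uint32_t from, uint32_t to)
{
    return (w == from) ? to : w;
}

/* Scan `n` words, storing only where the value actually changes.
   Returns the number of stores performed. */
size_t scan_skip_silent(uint32_t *mem, size_t n, uint32_t from, uint32_t to)
{
    size_t stores = 0;
    assert(mem != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        uint32_t old_val = mem[i];
        uint32_t new_val = update_word(old_val, from, to);
        if (old_val != new_val) {   /* skip the store when nothing changes */
            mem[i] = new_val;
            stores++;
        }
    }
    return stores;
}
```

The returned count is only there to make the store ratio easy to check against the "roughly 95% unchanged" estimate; a real scan would not need it.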
However, if the number of updates is significant, and pretty much
unpredictable, i.e. branch mispredictions are significant,
then I'd look into a conditional redirect of the write operation:
    old_val = *mem;
    /* ... lots of stuff here ... */
    target = mem;
    if (old_val == new_val)
        target = &scratch_buffer; // local variable, allocated on stack
    *target = new_val;
The reason for this rewrite is that it makes it possible for the
compiler to use a conditional move to update the target pointer:
    ; edi -> [mem]
    ; eax = new_val to be written back
    ; edx = old_val, i.e. previous value
    lea   esi,[scratch_buffer] ; scratchpad address
    cmp   eax,edx              ; old == new ?
    cmove edi,esi              ; EDI = ESI if equal
    mov   [edi],eax            ; do the writeback
The idea is that a bunch of (not-updated) writes will go to a single
location which will stay in L1 cache, and where write coalescing will
allow most of the writes to never actually cause any memory traffic.
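
The redirect idea can be written out as a complete C loop, sketched here under assumptions: `scan_redirect` is a hypothetical name, and the "replace `from` with `to`" rule stands in for the real update. Whether the compiler actually emits a conditional move for the pointer selection, or keeps the store to the scratch word at all, is up to the compiler, so the generated code should be inspected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Branch-free variant: every iteration performs exactly one store, but
   unchanged values are redirected to a scratch word on the stack. */
void scan_redirect(uint32_t *mem, size_t n, uint32_t from, uint32_t to)
{
    uint32_t scratch = 0;  /* dummy store target, expected to stay in L1 */
    assert(mem != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        uint32_t old_val = mem[i];
        /* Hypothetical update rule: `from` becomes `to`. */
        uint32_t new_val = (old_val == from) ? to : old_val;
        /* Select the store target; a compiler may turn this into cmov. */
        uint32_t *target = (old_val == new_val) ? &scratch : &mem[i];
        *target = new_val;
    }
}
```

The design trades a possibly mispredicted branch for an unconditional store that mostly hits one hot cache line.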
Time the alternatives!
Terje
--
- <Terje.Mathisen@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"

Frode Vatvedt Fjeld (Guest)
Posted: Fri Nov 26, 2004 6:31 pm
Post subject: Re: Are cache-lines dirtied by writing the same value?

Terje Mathisen <terje.mathisen@hda.hydro.com> writes:
| Quote: | If most of the values are identical, then by all means do a regular
if (old_val != new_val)
*mem = new_val;
|
Hei Terje. Ok, if you say so.. Although of course my actual code
looks like this (-:
    (unless (eq old-value new-value)
      (setf (memref location 0) new-value))
| Quote: | [..] The reason for this rewrite is that it makes it possible for
the compiler to use a conditional move to update the target pointer:
; edi -> [mem]
; eax = new_val to be written back
; edx = old_val i.e. previous value
lea esi,[scratch_buffer] ; Scratchpad address
cmp eax,edx ; old == new ?
cmove edi,esi ; ESI->EDI if equal
mov [edi],eax ; DO the writeback
The idea is that a bunch of (not-updated) writes will go to a single
location which will stay in L1 cache, and where write coalescing will
allow most of the writes to never actually cause any memory traffic.
|
This is a good idea, thanks! I was considering cmove, but thought I
remembered that a cmove to a memory location performs the memory
transaction regardless. Your technique here of conditionalizing the
memory location solves this neatly.
I'm in fact writing the compiler as well as the garbage collector
myself, so there's no need to rely on weird C idioms.. ;)
| Quote: | Time the alternatives!
|
Of course. In fact my next TODO is a profiler. Thanks again for your
input.
--
Frode Vatvedt Fjeld

Iain McClatchie (Guest)
Posted: Sat Nov 27, 2004 1:02 am
Post subject: Re: Are cache-lines dirtied by writing the same value?

Google for Dead Store Elimination.
Bottom line: modern CPUs don't do this yet.

Anton Ertl (Guest)
Posted: Mon Nov 29, 2004 7:27 pm
Post subject: Re: Are cache-lines dirtied by writing the same value?

Frode Vatvedt Fjeld <frodef@cs.uit.no> writes:
| Quote: | When programming a garbage collector, I find myself sequentially
scanning large blocks of memory while reading and updating many (if
not all) locations. However, often (roughly 95%) the updated value
will be identical to the old one. My question is whether I should
avoid the memory write instruction or not, taking the
compare-and-branch hit instead.
|
These dead stores will dirty the cache line, and cause it to be
written back later. I would try out both approaches to see what is
faster.
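
One way to try out both approaches is a small timing harness, sketched here under assumptions: all names (`scan_always_store`, `scan_skip_same`, `time_scan`, `VAL_A`, `VAL_B`) are hypothetical, and `clock()` is only a coarse CPU-time measure. Each pass swaps the two marker values, so the same minority of words changes on every pass and the rest are same-value stores in the unconditional variant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define VAL_A 1u  /* hypothetical marker values that get swapped each pass */
#define VAL_B 2u

typedef void (*scan_fn)(uint32_t *mem, size_t n, uint32_t from, uint32_t to);

/* Variant 1: always store, even when the value is unchanged. */
void scan_always_store(uint32_t *mem, size_t n, uint32_t from, uint32_t to)
{
    for (size_t i = 0; i < n; i++)
        mem[i] = (mem[i] == from) ? to : mem[i];
}

/* Variant 2: compare first and skip the store when nothing changes. */
void scan_skip_same(uint32_t *mem, size_t n, uint32_t from, uint32_t to)
{
    for (size_t i = 0; i < n; i++) {
        uint32_t old_val = mem[i];
        uint32_t new_val = (old_val == from) ? to : old_val;
        if (old_val != new_val)
            mem[i] = new_val;
    }
}

/* Run `reps` passes of `scan` over `mem` and return the CPU seconds used.
   Even passes map VAL_A to VAL_B, odd passes map VAL_B back to VAL_A. */
double time_scan(scan_fn scan, uint32_t *mem, size_t n, int reps)
{
    assert(scan != NULL);
    clock_t start = clock();
    for (int r = 0; r < reps; r++) {
        if (r % 2 == 0)
            scan(mem, n, VAL_A, VAL_B);
        else
            scan(mem, n, VAL_B, VAL_A);
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

For a meaningful comparison the buffer should be much larger than the last-level cache, with about one word in twenty set to `VAL_A` to mimic the 95% figure. A compiler is allowed to drop the same-value store in the first variant, so it is worth checking the generated code before trusting the numbers.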
| Quote: | My platform is primarily recent x86 (32-bit) systems, although I'd be
interested also to hear about the range of 32-bit x86 systems from 386
and up. Paging etc. is not a concern here.
|
For CPUs with write-through L1 caches, off-chip L2 caches and fast
clocks (e.g., 486DX2, microSPARC II, 21164PC), performance can
easily become store-bandwidth limited. In these cases, avoid stores
at all costs. Even for CPUs that normally do write-back caching, this
can play a role: we had a K6-2 550 box that ran 2-5 times slower
than another box with a similar processor and a different board
because the BIOS disabled write-back caching for L1 to work around a
bug in the motherboard (an Asus P5A-B).
- anton
--
M. Anton Ertl Some things have to be seen to be believed
anton@mips.complang.tuwien.ac.at Most things have to be believed to be seen
http://www.complang.tuwien.ac.at/anton/home.html