| Author |
Message |
karl malbrain
Guest
|
Posted:
Mon Nov 29, 2004 12:20 am Post subject:
16K pentium level one cache |
|
|
Does anyone know the organization of the Pentium level one cache? I
need to fit a consecutive byte array into it. What is the maximum
size available? Thanks, karl m |
 |
Grumble
Guest
|
Posted:
Mon Nov 29, 2004 2:49 pm Post subject:
Re: 16K pentium level one cache |
|
|
karl malbrain wrote:
| Quote: | Does anyone know the organization of the Pentium level one cache? I
need to fit a consecutive byte array into it. What is the maximum
size available?
|
Pentium classic or Pentium MMX?
CPU-Z might turn up some useful information.
http://www.cpuid.com/cpuz.php
--
Regards, Grumble |
 |
Guest
|
Posted:
Mon Nov 29, 2004 7:05 pm Post subject:
Re: 16K pentium level one cache |
|
|
| Both, but what exactly changed with the MMX update? karl m |
 |
Tim Christensen
Guest
|
Posted:
Mon Nov 29, 2004 7:54 pm Post subject:
Re: 16K pentium level one cache |
|
|
karl_m@acm.org wrote:
| Quote: | Both, but what exactly changed with the MMX update? karl m
|
The MMX update has twice the L1 cache of the vanilla Pentium.
From sandpile.org:
Pentium (P5), Level 1
  Code: 8 KB, 2-Way, 32 Byte/Line, SI,
        2x Fetch Port (supports Split-line Access),
        Snoop Port (for SMC), LRU
  Data: 8 KB, 2-Way, 32 Byte/Line, MESI,
        Non-blocking, Dual-ported, Snoop Port,
        8 Banks, LRU
Pentium MMX (P55), Level 1
  Code: 16 KB, 4-Way, 32 Byte/Line, SI,
        Fetch Port (no Split-line Access Support),
        Snoop Port (for SMC), LRU
  Data: 16 KB, 4-Way, 32 Byte/Line, MESI,
        Non-Blocking, Dual-ported, Snoop Port,
        8 Banks, LRU |
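The figures in the listing are enough to derive the number of sets and the size of one way. The sketch below is not from the thread: the class name `L1Cache` and its fields are made up for illustration, and only the sizes, associativity, and line length are taken from the sandpile listing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class L1Cache:
    """Hypothetical helper describing a set-associative cache."""
    size_bytes: int
    ways: int
    line_bytes: int

    @property
    def sets(self) -> int:
        # Each set holds `ways` lines, so sets = size / (ways * line size).
        return self.size_bytes // (self.ways * self.line_bytes)

    @property
    def way_bytes(self) -> int:
        # Address range covered by one way before the set index wraps.
        return self.size_bytes // self.ways


# Data-cache figures as quoted from sandpile.org in the post above.
P5_DATA = L1Cache(size_bytes=8 * 1024, ways=2, line_bytes=32)
P55_DATA = L1Cache(size_bytes=16 * 1024, ways=4, line_bytes=32)

for name, cache in (("P5", P5_DATA), ("P55", P55_DATA)):
    print(name, cache.sets, "sets,", cache.way_bytes, "bytes per way")
# prints:
# P5 128 sets, 4096 bytes per way
# P55 128 sets, 4096 bytes per way
```

Under these figures both parts have 128 sets of 32-byte lines; the MMX part doubles capacity by doubling the number of ways, not the number of sets.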
 |
Guest
|
Posted:
Mon Nov 29, 2004 9:54 pm Post subject:
Re: 16K pentium level one cache |
|
|
Grumble wrote:
| Quote: | karl_m@acm.org wrote:
Both, but what exactly changed with the MMX update?
You might have snipped just a bit too much context ;-)
|
Sorry. We're having trouble with Advanced Encryption Standard
implementations on the Pentium. We have a 1024 byte table that we need
to access in constant time without bank conflict stalls. What is the
size of the bank?
Thanks, karl m |
 |
Guest
|
Posted:
Mon Nov 29, 2004 10:07 pm Post subject:
Re: 16K pentium level one cache |
|
|
Tim Christensen wrote:
| Quote: | karl_m@acm.org wrote:
Both, but what exactly changed with the MMX update? karl m
The MMX update has twice the L1 cache of the vanilla Pentium.
From sandpile.org:
Pentium MMX (P55)
Level 1
Data
16 KB, 4-Way, 32 Byte/Line, MESI,
Non-Blocking, Dual-ported, Snoop Port,
8 Banks, LRU
|
Does this mean that 2 KB of consecutive address space is available? Do
the ASSOCIATIVITY bits come into play at this level? Thanks, karl m |
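One way to reason about the question above, as a sketch and not a definitive answer: count how many lines of a contiguous array land in each set and compare that with the associativity. The model assumes the set index is simply the line number modulo 128 (which follows from 16 KB / 4 ways / 32-byte lines), and it ignores everything else competing for the cache, such as the stack and other variables. The function names are hypothetical.

```python
from collections import Counter

LINE_BYTES = 32   # line length from the quoted listing
SETS = 128        # 16 KB / (4 ways * 32 bytes); also 8 KB / (2 ways * 32 bytes)


def max_lines_per_set(start: int, length: int) -> int:
    """Largest number of lines that a contiguous array puts into one set."""
    first_line = start // LINE_BYTES
    last_line = (start + length - 1) // LINE_BYTES
    per_set = Counter(line % SETS for line in range(first_line, last_line + 1))
    return max(per_set.values())


def fits(start: int, length: int, ways: int) -> bool:
    """True if no set needs more lines than the cache has ways."""
    return max_lines_per_set(start, length) <= ways


print(fits(0, 2 * 1024, ways=4))       # True: one line per set at most
print(fits(0, 16 * 1024, ways=4))      # True: exactly four lines per set
print(fits(16, 16 * 1024, ways=4))     # False: misalignment adds a 513th line
```

Under this model a line-aligned contiguous array can be as large as the whole data cache before any set overflows; whether that holds in practice depends on what else the program touches.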
 |
Guest
|
Posted:
Mon Nov 29, 2004 10:08 pm Post subject:
Re: 16K pentium level one cache |
|
|
Anton Ertl wrote:
| Quote: | karl_m@acm.org (karl malbrain) writes:
Does anyone know the organization of the Pentium level one cache?
8KB I + 8KB D write-back, IIRC 4-way set-associative.
- anton
|
Do the 2 bits of set associativity come out of the 13 bits of address?
karl m |
 |
Grumble
Guest
|
Posted:
Tue Nov 30, 2004 2:23 pm Post subject:
Re: 16K pentium level one cache |
|
|
karl_m@acm.org wrote:
| Quote: | We're having trouble with [AES] implementations on the Pentium.
We have a 1024 byte table that we need to access in constant time
without bank conflict stalls. What is the size of the bank?
|
Unfortunately, I have never optimized code for the P5.
Perhaps the Intel manual can help?
http://intel.com/design/intarch/manuals/242816.htm
2.1.2 Caches (Pentium)
The on-chip cache subsystem consists of two 8-Kbyte two-way set
associative caches (one instruction and one data) with a cache line
length of 32 bytes. There is a 64-bit wide external data bus interface.
The caches employ a write back mechanism and an LRU replacement
algorithm. The data cache consists of eight banks interleaved on four
byte boundaries. The data cache can be accessed simultaneously from both
pipes, as long as the references are to different banks. The minimum
delay for a cache miss is four clocks.
2.3.3 Caches (Pentium MMX)
The on-chip cache subsystem of Pentium processors with MMX technology
and Pentium II processors consists of two 16 Kbyte four-way set
associative caches with a cache line length of 32 bytes. The caches
employ a write-back mechanism and a pseudo-LRU replacement algorithm.
The data cache consists of eight banks interleaved on four-byte boundaries.
On Pentium processors with MMX technology, the data cache can be
accessed simultaneously from both pipes, as long as the references are
to different cache banks. On the P6-family processors, the data cache
can be accessed simultaneously by a load instruction and a store
instruction, as long as the references are to different cache banks. If
the references are to the same address they bypass the cache and are
executed in the same cycle. The delay for a cache miss on the Pentium
processor with MMX technology is eight internal clock cycles. On Pentium
II processors the minimum delay is ten internal clock cycles.
Have you ever played with VTune?
--
Regards, Grumble |
 |
Anton Ertl
Guest
|
Posted:
Tue Nov 30, 2004 2:32 pm Post subject:
Re: 16K pentium level one cache |
|
|
karl_m@acm.org writes:
| Quote: |
Anton Ertl wrote:
karl_m@acm.org (karl malbrain) writes:
Does anyone know the organization of the Pentium level one cache?
8KB I + 8KB D write-back, IIRC 4-way set-associative.
|
It's probably 2-way set associative.
| Quote: | Do the 2 bits of set associativity come out of the 13 bits of address?
|
Which 13 bits? Each way is indexed with address bits 5-11 (bits 0-4
address the byte within the line).
- anton
--
M. Anton Ertl Some things have to be seen to be believed
anton@mips.complang.tuwien.ac.at Most things have to be believed to be seen
http://www.complang.tuwien.ac.at/anton/home.html |
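The bit split described in this post can be written out as a small sketch. It assumes 32-byte lines and 128 sets, so bits 0-4 are the offset within the line, bits 5-11 are the set index, and everything above bit 11 is the tag. The function name `split_address` is made up for illustration.

```python
OFFSET_BITS = 5   # bits 0-4: byte within a 32-byte line
INDEX_BITS = 7    # bits 5-11: one of 128 sets


def split_address(addr: int) -> tuple[int, int, int]:
    """Return (tag, set_index, line_offset) for a byte address."""
    offset = addr & ((1 << OFFSET_BITS) - 1)
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


print(split_address(0x12345))   # (18, 26, 5)

# Addresses 4 KB apart share a set index and differ only in the tag,
# so they compete for the ways of the same set.
print(split_address(0x1000)[1] == split_address(0x2000)[1])   # True
```

In this model no address bits are spent on choosing a way; which way holds a line is decided by comparing tags within the indexed set.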
 |
Guest
|
Posted:
Tue Nov 30, 2004 9:27 pm Post subject:
Re: 16K pentium level one cache |
|
|
Grumble wrote:
| Quote: | karl_m@acm.org wrote:
We're having trouble with [AES] implementations on the Pentium.
We have a 1024 byte table that we need to access in constant time
without bank conflict stalls. What is the size of the bank?
Unfortunately, I have never optimized code for the P5.
Perhaps the Intel manual can help?
http://intel.com/design/intarch/manuals/242816.htm
2.1.2 Caches (Pentium)
The on-chip cache subsystem consists of two 8-Kbyte two-way set
associative caches (one instruction and one data) with a cache line
length of 32 bytes. There is a 64-bit wide external data bus interface.
The caches employ a write back mechanism and an LRU replacement
algorithm. The data cache consists of eight banks interleaved on four
byte boundaries. The data cache can be accessed simultaneously from both
pipes, as long as the references are to different banks. The minimum
delay for a cache miss is four clocks.
2.3.3 Caches (Pentium MMX)
The on-chip cache subsystem of Pentium processors with MMX technology
and Pentium II processors consists of two 16 Kbyte four-way set
associative caches with a cache line length of 32 bytes. The caches
employ a write-back mechanism and a pseudo-LRU replacement algorithm.
The data cache consists of eight banks interleaved on four-byte boundaries.
On Pentium processors with MMX technology, the data cache can be
accessed simultaneously from both pipes, as long as the references are
to different cache banks. On the P6-family processors, the data cache
can be accessed simultaneously by a load instruction and a store
instruction, as long as the references are to different cache banks. If
the references are to the same address they bypass the cache and are
executed in the same cycle. The delay for a cache miss on the Pentium
processor with MMX technology is eight internal clock cycles. On Pentium
II processors the minimum delay is ten internal clock cycles.
Have you ever played with VTune?
|
Thanks. The magic phrase would appear to be INTERLEAVED ON FOUR BYTE
BOUNDARIES. So, counting from bit ZERO, we have 2 bits of byte address,
3 bits of bank selector, and the remaining bits of the 13-bit (8 KB) or
14-bit (16 KB) cache address, i.e. 13 - 5 or 14 - 5.
What's VTune? karl m |
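The bit layout worked out in this post can be sketched as follows. The sketch models only the "eight banks interleaved on four-byte boundaries" rule quoted from the manual: two accesses are flagged when bits 2-4 of their addresses match. It does not model any other pairing rule of the two pipes, and the names `bank_of`, `same_bank`, and `TABLE_BASE` are hypothetical.

```python
from collections import Counter

BANK_SHIFT = 2   # bits 0-1: byte within a 4-byte bank word
BANK_MASK = 0x7  # bits 2-4: one of 8 banks


def bank_of(addr: int) -> int:
    """Bank number selected by address bits 2-4."""
    return (addr >> BANK_SHIFT) & BANK_MASK


def same_bank(a: int, b: int) -> bool:
    """True if two byte addresses fall in the same bank."""
    return bank_of(a) == bank_of(b)


# A 1024-byte table of 256 four-byte entries, assumed 32-byte aligned.
TABLE_BASE = 0x4000  # hypothetical address, chosen only for the example
entry_addr = lambda i: TABLE_BASE + 4 * i

print(bank_of(entry_addr(0)), bank_of(entry_addr(7)), bank_of(entry_addr(8)))
# prints: 0 7 0

# Entries i and j share a bank exactly when i % 8 == j % 8.
print(same_bank(entry_addr(3), entry_addr(11)))   # True
print(same_bank(entry_addr(3), entry_addr(12)))   # False

# The table spreads evenly: 32 entries per bank.
print(Counter(bank_of(entry_addr(i)) for i in range(256)))
```

With data-dependent table indices, as in a table-driven AES round, whether two paired lookups hit the same bank depends on the index values themselves, which is why the bank layout matters for constant-time access.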
 |
Guest
|
Posted:
Tue Nov 30, 2004 9:29 pm Post subject:
Re: 16K pentium level one cache |
|
|
Anton Ertl wrote:
| Quote: | karl_m@acm.org writes:
Anton Ertl wrote:
karl_m@acm.org (karl malbrain) writes:
Does anyone know the organization of the Pentium level one cache?
8KB I + 8KB D write-back, IIRC 4-way set-associative.
It's probably 2-way set associative.
Do the 2 bits of set associativity come out of the 13 bits of address?
Which 13 bits? Each way is addressed with bits 5-11 (bits 0-4 are for
addressing within the line).
|
I'm interested in bank interleaving. Thanks, karl m |
 |
Sander Vesik
Guest
|
Posted:
Tue Nov 30, 2004 10:49 pm Post subject:
Re: 16K pentium level one cache |
|
|
karl_m@acm.org wrote:
| Quote: |
Anton Ertl wrote:
karl_m@acm.org (karl malbrain) writes:
Does anyone know the organization of the Pentium level one cache?
8KB I + 8KB D write-back, IIRC 4-way set-associative.
- anton
Do the 2 bits of set associativity come out of the 13 bits of address?
|
Huh? What do you mean? The address length depends on the line size and
is effectively transparent to anybody but the chip designer.
--
Sander
+++ Out of cheese error +++ |