David Kanter
Posted: Sat Dec 10, 2005 8:19 am
Post subject: Re: Do you know the FO4 depth for POWER5?
Iain McClatchie wrote:
> David,
> do you have a FO4 table for other processors, so that we can fill in
> the FO4 speed for their processes? I think that would be a pretty
> nifty piece of info, one that I'd certainly like to see public.
That is a good idea, maybe something I should put up at RWT. I'll
start an Excel spreadsheet to track this info. I'm sure it should be
available for all the high-performance MPUs.
DK
Anton Ertl
Posted: Sat Dec 10, 2005 9:15 am
Post subject: Re: Do you know the FO4 depth for POWER5?
"Iain McClatchie" <iain-3@truecircuits.com> writes:
> do you have a FO4 table for other processors, so that we can fill in
> the FO4 speed for their processes? I think that would be a pretty
> nifty piece of info, one that I'd certainly like to see public.
There were a number of papers about pipeline length at the 2002 ISCA.
IIRC (but I am not very confident in my memory here) the 21264 has an
FO4 depth of 20 per pipeline stage and the Willamette one of 16.
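
For filling in such a table, a rough FO4-per-cycle figure can be estimated from a chip's clock rate and gate length. The following is only a minimal sketch in Python and not part of the original post. It assumes the common rule of thumb that one FO4 inverter delay is about 360 ps per micron of gate length; the constant, the function names, and the example chip are all assumptions, and published rules of thumb differ on the constant and on whether drawn or effective gate length is meant.

```python
def fo4_delay_ps(gate_length_um, ps_per_um=360.0):
    """Estimate one FO4 inverter delay in picoseconds.

    ps_per_um is an assumed rule-of-thumb constant, not a measured value.
    """
    return ps_per_um * gate_length_um


def fo4_per_cycle(clock_ghz, gate_length_um, ps_per_um=360.0):
    """Estimate how many FO4 delays fit in one clock cycle.

    The result includes latch and clock-skew overhead, because it is
    derived from the full cycle time.
    """
    cycle_ps = 1000.0 / clock_ghz  # 1 GHz corresponds to a 1000 ps cycle
    return cycle_ps / fo4_delay_ps(gate_length_um, ps_per_um)


if __name__ == "__main__":
    # Hypothetical chip: 2.0 GHz on a 0.13 um process.
    print(round(fo4_per_cycle(2.0, 0.13), 1))  # roughly 10.7 FO4 per cycle
```

A figure computed this way is only as good as the assumed FO4 delay for the process, so vendor-reported FO4 numbers should take precedence where they exist.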
@InProceedings{hartstein&puzak02,
author = {A. Hartstein and Thomas R. Puzak},
title = {The Optimum Pipeline Depth for a Microprocessor},
crossref = {isca02},
pages = {7--13},
annote = {Presents a formula for the performance of a
microprocessor when varying the pipeline length; the
optimum pipeline length can be derived from
                  this. Unfortunately there are two parameters in the
                  formula that depend on the microarchitecture and
                  the workload, and these parameters cannot be
                  determined analytically, only empirically. The paper
                  also presents data from runs of a simulator with
                  different pipeline lengths, and different (but
                  hardly specified) workloads. The results match
                  curves from the formula (after fitting the
                  missing parameters). One interesting result was that
the SPEC workloads had a shorter optimum pipeline
length than the other workloads used in the paper.}
}
@InProceedings{hrishikesh+02,
author = {M. S. Hrishikesh and Norman P. Jouppi and Keith
I. Farkas and Doug Burger and Stephen W. Keckler and
Premkishore Shivakumar},
title = {The Optimal Logic Depth per Pipeline Stage is 6 to 8
FO4 Inverter Delays},
crossref = {isca02},
pages = {14--24},
annote = {This paper takes a low-level simulator of the 21264,
varies the number of pipeline stages, uses this to
run a number of workloads (actually only traces from
them), and reports performance results for
them. With a latch overhead of about 2 FO4
inverters, the optimal pipeline stage length is
                  about 8 FO4 inverters (with workload-dependent
variations). Discusses various issues involved in
quite some depth. In particular, this paper
discusses how to pipeline the instruction window
design (which has been identified as a bottleneck in
earlier papers).}
}
@InProceedings{sprangle&carmean02,
author = {Eric Sprangle and Doug Carmean},
title = {Increasing Processor Performance by Implementing
Deeper Pipelines},
crossref = {isca02},
pages = {25--34},
url = {http://systems.cs.colorado.edu/ISCA2002/FinalPapers/Deep%20Pipes.pdf},
  annote =       {This paper starts with the Willamette (Pentium~4)
pipeline and discusses and evaluates changes to the
pipeline length. In particular, it gives numbers on
how lengthening various latencies would affect IPC;
on a per-cycle basis the ALU latency is most
important, then L1 cache, then L2 cache, then branch
misprediction; however, the total effect of
lengthening the pipeline to double the clock rate
gives the reverse order (because branch
misprediction gains more cycles than the other
latencies). The paper reports 52 pipeline stages
with 1.96 times the original clock rate as optimal
                  for the Pentium~4 microarchitecture, reducing core
                  time by a factor of 1.45 and giving an overall
                  speedup of about 1.29 (including waiting for
memory). Various other topics are discussed, such as
nonlinear effects when introducing bypasses, and
varying cache sizes. Recommended reading.}
}
@Proceedings{isca02,
title = "$29^\textit{th}$ Annual International Symposium on Computer Architecture",
booktitle = "$29^\textit{th}$ Annual International Symposium on Computer Architecture",
year = "2002",
key = "ISCA 29",
}
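
The Hrishikesh et al. annotation above quotes a latch overhead of about 2 FO4 against a stage length of about 8 FO4. A small Python sketch of that arithmetic follows. It is an illustration only, under the simplifying assumption that the clock period is just useful logic plus a fixed per-stage overhead; the function names and the 20 FO4 comparison point are assumptions for the example, not results from the paper.

```python
def clock_period_fo4(useful_logic_fo4, overhead_fo4=2.0):
    """Clock period in FO4: useful logic plus a fixed latch overhead."""
    return useful_logic_fo4 + overhead_fo4


def overhead_fraction(useful_logic_fo4, overhead_fo4=2.0):
    """Fraction of each cycle spent on latch overhead, not useful logic."""
    return overhead_fo4 / clock_period_fo4(useful_logic_fo4, overhead_fo4)


def relative_clock_rate(useful_a_fo4, useful_b_fo4, overhead_fo4=2.0):
    """Clock rate of design B divided by that of design A."""
    period_a = clock_period_fo4(useful_a_fo4, overhead_fo4)
    period_b = clock_period_fo4(useful_b_fo4, overhead_fo4)
    return period_a / period_b


if __name__ == "__main__":
    # 6 FO4 of useful logic plus 2 FO4 overhead gives an 8 FO4 stage.
    print(overhead_fraction(6.0))          # 0.25
    # Going from a 20 FO4 stage (18 useful) to an 8 FO4 stage (6 useful).
    print(relative_clock_rate(18.0, 6.0))  # 2.5
```

This only covers clock rate. Whether the shorter stage is faster overall depends on how much IPC is lost to the longer pipeline, which is what the three papers measure.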
- anton
--
M. Anton Ertl Some things have to be seen to be believed
anton@mips.complang.tuwien.ac.at Most things have to be believed to be seen
http://www.complang.tuwien.ac.at/anton/home.html