Itanium versus Others
CASTalk.com Forum Index CASTalk.com
Discussion of DSP, FPGA, storage and embedded system.
 
Robert Myers
Guest
Posted: Fri Jul 29, 2005 10:00 pm    Post subject: Re: Itanium versus Others

Quote:
From the POV of SPECFP CPU2000 published results, for whatever it is
worth, Itanium at 1.6GHz and Power5 at 1.9GHz are within spitting
distance of one another, as are versions of Itanium 2 with 3, 6, and 9
MB cache, all in single core versions.

Quote:
From that single measure of performance, the statement that the
enormous cache size of Itanium makes a big difference is incorrect, as
is the statement that there is a big discrepancy in performance between
Itanium and Power5. Also, by that one measure, the best of the x86's
is only about three-quarters the best performance overall, which at the
moment is from Power5.

RM

Mark Hahn
Guest
Posted: Sat Jul 30, 2005 12:15 am    Post subject: Re: Itanium versus Others

Quote:
From that single measure of performance, the statement that the
enormous cache size of Itanium makes a big difference is incorrect, as

no. it2's enormous cache does clearly make a huge difference. if you
don't believe it, just recalculate it2's specFP results after having omitted
the obsolete-because-they're-ridiculously-tiny components. think of it as
an outlier-rejection procedure.

I did this late last year, and it2 winds up barely faster than opteron.
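
For anyone who wants to repeat that exercise: the SPEC CPU2000 composite is a geometric mean of per-component ratios, so dropping components just means recomputing the mean over what is left. A minimal sketch, with made-up component names and ratios (these are placeholders, not published SPEC results, and which components count as "ridiculously tiny" is left to the reader):

```python
import math

def geometric_mean(ratios):
    """Geometric mean of positive per-benchmark ratios."""
    ratios = list(ratios)
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

def score_without(ratios_by_name, excluded):
    """Recompute the composite after dropping the named components."""
    kept = [r for name, r in ratios_by_name.items() if name not in excluded]
    return geometric_mean(kept)

# Hypothetical per-component ratios, NOT real SPEC results.
ratios = {
    "bench_a": 1200.0,
    "bench_b": 1500.0,
    "bench_c": 6000.0,  # stand-in for a cache-friendly component
    "bench_d": 1800.0,
}

full = geometric_mean(ratios.values())
trimmed = score_without(ratios, excluded={"bench_c"})
print(f"all components:  {full:.0f}")
print(f"without bench_c: {trimmed:.0f}")
```

Because the mean is geometric, one very large component pulls the composite up less than it would in an arithmetic mean, but it still moves the result noticeably.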

regards, mark hahn.
sharcnet

John Savard
Guest
Posted: Sat Jul 30, 2005 8:15 am    Post subject: Re: Itanium versus Others

On Thu, 28 Jul 2005 21:09:11 -0500, Del Cecchi <cecchinospam@us.ibm.com>
wrote, in part:

Quote:
You might want to recheck your benchmarks if
you think it takes 4 chips of Power5 to keep up with an Itanium.

One POWER5 module already *has* 4 chips inside it, each one dual-core.

John Savard
http://www.quadibloc.com/index.html

John Savard
Guest
Posted: Sat Jul 30, 2005 8:15 am    Post subject: Re: Itanium versus Others

On Thu, 28 Jul 2005 19:14:19 +0200, Zak <jute@zak.invalid> wrote, in
part:

Quote:
Intel wants Itanium
to succeed

Is it possible to design an Itanium 'lite' that doesn't use so much
silicon, and can be priced at a point where it might succeed? Better x86
performance, using the same ALUs as the Itanium instead of a little fast
486 on the side, and instead of just-in-time compilation, would help
too.

John Savard
http://www.quadibloc.com/index.html

Zak
Guest
Posted: Sat Jul 30, 2005 8:15 am    Post subject: Re: Itanium versus Others

John Savard wrote:
Quote:

Is it possible to design an Itanium 'lite' that doesn't use so much
silicon, and can be priced at a point where it might succeed? Better x86
performance, using the same ALUs as the Itanium instead of a little fast
486 on the side, and instead of just-in-time compilation, would help
too.

I think that was what they tried first. The little 486 didn't work, and
they have JIT now because it is faster.

And the performance was just not there... and if Intel could get the
performance more cost-effectively than with the huge cache they would
have done it.

Also if they have a performance plan up their sleeves here I feel they
would not keep it secret, but promise a socket upgrade to whoever buys
Itanium now.

I guess Itanium is non-mainstream as well in Intel's plans because of the
AMD64 coup: x86 gets you 64 bits all right. What does Itanium get you?
Better performance in some niches? Verrry slow performance with old
applications. And oh, you're getting rid of old and crufty x86 - but who
cares about that? At least the assembly language was designed to be
human usable.


Thomas

Robert Myers
Guest
Posted: Sat Jul 30, 2005 4:15 pm    Post subject: Re: Itanium versus Others

Mark Hahn wrote:
Quote:
From that single measure of performance, the statement that the
enormous cache size of Itanium makes a big difference is incorrect, as

no. it2's enormous cache does clearly make a huge difference. if you
don't believe it, just recalculate it2's specFP results after having omitted
the obsolete-because-they're-ridiculously-tiny components. think of it as
an outlier-rejection procedure.

I did this late last year, and it2 winds up barely faster than opteron.


From the figure of merit I cited, the statement I made is correct. The
only benchmark that matters is your own application.


But which are you saying? That, if you remove the "outliers"
(definition: data that don't meet your preconceptions), the importance
of the cache becomes apparent or that Itanium is no faster than Opteron
or both?

RM

Guest
Posted: Sun Jul 31, 2005 12:15 am    Post subject: Re: Itanium versus Others

John Savard wrote:
Quote:
On Thu, 28 Jul 2005 21:09:11 -0500, Del Cecchi <cecchinospam@us.ibm.com>
wrote, in part:

You might want to recheck your benchmarks if
you think it takes 4 chips of Power5 to keep up with an Itanium.

One POWER5 module already *has* 4 chips inside it, each one dual-core.

John Savard
http://www.quadibloc.com/index.html

John,
You have seriously misunderstood Power5.
Power5 chips can be packaged into an MCM, but they don't have to be. Only
the high-end pSeries models, p5-595 and p5-570, use MCMs. Smaller
pSeries and OpenPower machines use cheaper packaging (a DCM that consists
of one Power5 chip and an external L3 cache). I think (not sure) the
same applies to iSeries.

As to performance per chip, dual-core Power5 has better throughput than
single-core Itanium2 in just about all available (throughput)
benchmarks. The factor by which Power5 is faster depends on the
benchmark.
For example, in SpecInt2k_rate_base a single Power5 chip scores 31.6 vs.
17.8 for I2. In SpecFp2k_rate_base, Power5=41.5 vs. I2=31.2.
In bigger systems the disparity grows:
For four sockets:
SpecInt2k_rate_base Power5=159, I2=72.5
SpecFp2k_rate_base Power5=266, I2=104

For commercial workloads Itanium is simply no match for Power5.
For example, in the famous TPC-C OLTP benchmark a p5 570 with 8 Power5 cores
(4 chips, 1 MCM) achieves 429,899.7 transactions per minute. Itanium2
can't achieve such a score even with twice the number of CPU cores: the
best I2 result for 16 cores is 332,265.87 transactions per minute.
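
As a quick check of the per-core arithmetic behind those two TPC-C figures, here is a small sketch that uses only the numbers quoted above; the dictionary layout and function name are just for illustration:

```python
# tpmC scores and core counts as quoted in the post above.
results = {
    "p5-570 (POWER5)": {"tpmc": 429_899.7, "cores": 8},
    "best Itanium 2, 16 cores": {"tpmc": 332_265.87, "cores": 16},
}

def per_core(entry):
    """Transactions per minute divided by the number of cores."""
    return entry["tpmc"] / entry["cores"]

p5 = per_core(results["p5-570 (POWER5)"])
i2 = per_core(results["best Itanium 2, 16 cores"])

print(f"POWER5:    {p5:,.0f} tpmC per core")
print(f"Itanium 2: {i2:,.0f} tpmC per core")
print(f"per-core ratio: {p5 / i2:.2f}")
```

Per-core division ignores differences in memory, storage, and database software between the two configurations, so it is only a first-order comparison.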

Intel should somehow close the gap at the beginning of next year
with their upcoming dual-core Itanium (Montecito), but right now there
is only one king.

Robert Klute
Guest
Posted: Sun Jul 31, 2005 6:06 am    Post subject: Re: Itanium versus Others

On 30 Jul 2005 13:42:09 -0700, already5chosen@yahoo.com wrote:

Quote:

John Savard wrote:
On Thu, 28 Jul 2005 21:09:11 -0500, Del Cecchi <cecchinospam@us.ibm.com>
wrote, in part:

You might want to recheck your benchmarks if
you think it takes 4 chips of Power5 to keep up with an Itanium.

One POWER5 module already *has* 4 chips inside it, each one dual-core.

John Savard
http://www.quadibloc.com/index.html

John,
You have seriously misunderstood Power5.
Power5 chips can be packaged into an MCM, but they don't have to be. Only
the high-end pSeries models, p5-595 and p5-570, use MCMs. Smaller
pSeries and OpenPower machines use cheaper packaging (a DCM that consists
of one Power5 chip and an external L3 cache). I think (not sure) the
same applies to iSeries.

As to performance per chip, dual-core Power5 has better throughput than
single-core Itanium2 in just about all available (throughput)
benchmarks. The factor by which Power5 is faster depends on the
benchmark.
For example, in SpecInt2k_rate_base a single Power5 chip scores 31.6 vs.
17.8 for I2. In SpecFp2k_rate_base, Power5=41.5 vs. I2=31.2.

Sort of a mismatch - the IBM number is 2 cores / 4 threads. The HP
number is 1 core / 1 thread. If you compare to 2 cores / 2 threads the
number is 33.2. Number of threads is important to the performance number
on SPEC_rate.
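
To make the normalization question concrete, the same scores can be divided by chips, cores, or threads. A rough sketch using only the figures quoted above and the core/thread counts stated in this post; the `RateResult` class is a hypothetical helper, not anything from SPEC:

```python
from dataclasses import dataclass

@dataclass
class RateResult:
    label: str
    score: float
    chips: int
    cores: int
    threads: int

    def per(self, unit: str) -> float:
        """Score divided by the number of chips, cores, or threads."""
        return self.score / getattr(self, unit)

# Scores from the quoted post; core and thread counts as stated in this reply.
results = [
    RateResult("POWER5 int_rate_base", 31.6, chips=1, cores=2, threads=4),
    RateResult("Itanium 2 int_rate_base", 17.8, chips=1, cores=1, threads=1),
    RateResult("POWER5 fp_rate_base", 41.5, chips=1, cores=2, threads=4),
    RateResult("Itanium 2 fp_rate_base", 31.2, chips=1, cores=1, threads=1),
]

for r in results:
    print(f"{r.label:<24} per chip {r.per('chips'):6.2f}  "
          f"per core {r.per('cores'):6.2f}  per thread {r.per('threads'):6.2f}")
```

Which divisor is "right" depends on what is being bought: sockets, cores, or software licenses counted per thread.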


Quote:
In bigger systems the disparity grows:
For four sockets:
SpecInt2k_rate_base Power5=159, I2=72.5
SpecFp2k_rate_base Power5=266, I2=104

Again, you are comparing an IBM number that is twice the cores and 4
times the threads to the Itanium number.
Quote:

For commercial workloads Itanium is simply no match for Power5.
For example, in the famous TPC-C OLTP benchmark a p5 570 with 8 Power5 cores
(4 chips, 1 MCM) achieves 429,899.7 transactions per minute. Itanium2
can't achieve such a score even with twice the number of CPU cores: the
best I2 result for 16 cores is 332,265.87 transactions per minute.

If you look at comparable IBM P5 to Itanium numbers, performance is
comparable.

SPEC 1 CPU Performance
                  intBase   int  fpBase    fp
Itanium 2 1.6/9M     1590  1590    2712  2712
P5 595 1.9           1392  1452    2585  2796

Even on the SPEC_rate numbers where IBM can run twice the number of
threads, as a result of SMT, the Int rate is about the same as Itanium.
It is only in fp that having twice the threads delivers an advantage.

SPECRate Performance
                         int_rate_base  int_rate  fp_rate_base  fp_rate
Itanium 2 SD 64CPU                1108      1108           846      928
P5 595 1.9 64CPU SMT on           1063      1147          1684     1752
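
The SPECrate rows above can be turned into P5/Itanium ratios with a few lines; this sketch just restates the table, and the variable names are arbitrary:

```python
# SPECrate figures from the table above (64 CPUs each).
superdome = {"int_rate_base": 1108, "int_rate": 1108,
             "fp_rate_base": 846, "fp_rate": 928}
p5_595 = {"int_rate_base": 1063, "int_rate": 1147,
          "fp_rate_base": 1684, "fp_rate": 1752}

# Ratio > 1 means the P5 595 scored higher on that metric.
ratios = {metric: p5_595[metric] / superdome[metric] for metric in superdome}

for metric, ratio in ratios.items():
    print(f"{metric:>14}: P5 / Itanium = {ratio:.2f}")
```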

Quote:
Intel should somehow close the gap at the beginning of next year
with their upcoming dual-core Itanium (Montecito), but right now there
is only one king.

Once Montecito is out, with dual core per socket and hyperthreading,
your comparisons will at least be valid.

Bill Todd
Guest
Posted: Sun Jul 31, 2005 8:15 am    Post subject: Re: Itanium versus Others

Robert Klute wrote:
Quote:
On 30 Jul 2005 13:42:09 -0700, already5chosen@yahoo.com wrote:


John Savard wrote:

On Thu, 28 Jul 2005 21:09:11 -0500, Del Cecchi <cecchinospam@us.ibm.com>
wrote, in part:

You might want to recheck your benchmarks if
you think it takes 4 chips of Power5 to keep up with an Itanium.

One POWER5 module already *has* 4 chips inside it, each one dual-core.

John Savard
http://www.quadibloc.com/index.html

John,
You have seriously misunderstood Power5.
Power5 chips can be packaged into an MCM, but they don't have to be. Only
the high-end pSeries models, p5-595 and p5-570, use MCMs. Smaller
pSeries and OpenPower machines use cheaper packaging (a DCM that consists
of one Power5 chip and an external L3 cache). I think (not sure) the
same applies to iSeries.

As to performance per chip, dual-core Power5 has better throughput than
single-core Itanium2 in just about all available (throughput)
benchmarks. The factor by which Power5 is faster depends on the
benchmark.
For example, in SpecInt2k_rate_base a single Power5 chip scores 31.6 vs.
17.8 for I2. In SpecFp2k_rate_base, Power5=41.5 vs. I2=31.2.


Sort of a mismatch

Not at all: the metric very specifically stated above was 'performance
per chip'. But I'll be happy to inform you about other metrics as well,
since you seem interested.

Quote:
- the IBM number is 2 cores / 4 threads.

For a mid-range, 1.65 GHz POWER5 chip rather than a top-of-the-line 1.9
GHz chip, one might note. Its big brother (not tested in a 2-core
single-chip configuration) manages a considerably more respectable base
score of 125 for 4 cores (two chips) - or 31.25 per core, almost exactly
the same as the single-core Itanic result (and it's not clear that
dual-threading helps all that much in this benchmark, since the
SPECfp_base scores are so close without use of SMT on the POWER5 chip).

Quote:
The HP number is 1 core / 1 thread.

For a top-speed (1.6 GHz) Itanic. Not the largest L3 cache version, but
considering that it marginally beats an otherwise identical brother
which boasts twice as much cache that doesn't seem like any handicap in
this benchmark.

Of course, that's for a single-processor system. IBM doesn't test
POWER5 SPECfp_rate systems that small, so we don't know how much (if
any) better it would have done there. But we can see how well Itanic
does at the same system size (4 cores) as the smallest top-of-the-line
POWER5 system: 104 for a top-of-the-line 4-core Itanic (from SGI, this
time), or only 26 per core vs. POWER5's 31.25 per core.

Quote:
If you compare to 2 cores / 2 threads the
number is 33.2. Number of threads is important to the performance number
on SPEC_rate.

Leaving aside that last phrase (since I already questioned it above),
I'm afraid that I'm having some difficulty finding any such result for
POWER5 at spec.org: could you provide the test URL?

Quote:



In bigger systems the disparity grows:
For four sockets:
SpecInt2k_rate_base Power5=159, I2=72.5
SpecFp2k_rate_base Power5=266, I2=104


Again, you are comparing an IBM number that is twice the cores and 4
times the threads to the Itanium number.

I think I addressed both aspects of that challenge adequately just above
- though again I'll remind you that the specific context of the comment you
responded to was performance-per-chip (or performance-per-socket), not
performance-per-core, let alone performance-per-thread.

Quote:

For commercial workloads Itanium is simply no match for Power5.
For example, in the famous TPC-C OLTP benchmark a p5 570 with 8 Power5 cores
(4 chips, 1 MCM) achieves 429,899.7 transactions per minute. Itanium2
can't achieve such a score even with twice the number of CPU cores: the
best I2 result for 16 cores is 332,265.87 transactions per minute.


If you look at comparable IBM P5 to Itanium numbers, performance is
comparable.

That is utter horseshit, even if you restrict comparisons on commercial
workloads (the subject to which you would appear to be responding above,
you will note) to equal numbers of cores rather than equal numbers of
sockets/chips: POWER5 out-scores Itanic by per-core factors of 3:1 on
large-system benchmarks like TPC-C and SAP SD 2-tier, and by 2:1 on
others like jbb2000.

Unless, of course, you prefer just to completely ignore commercial
benchmarks - as you seem to be doing by immediately trying to wrench the
conversation back to SPECint/fp below.

Quote:

SPEC 1 CPU Performance
                  intBase   int  fpBase    fp
Itanium 2 1.6/9M     1590  1590    2712  2712
P5 595 1.9           1392  1452    2585  2796

Even on the SPEC_rate numbers where IBM can run twice the number of
threads, as a result of SMT, the Int rate is about the same as Itanium.

There's that claim again - despite the fact that POWER5 with SMT
disabled manages to run both SPECint and SPECfp (not just the rate
versions) at approximately the same speed as Itanic.

Quote:
It is only in fp that having twice the threads delivers an advantage.

Sorry: POWER5 does have an advantage in the benchmark results below,
but it likely has little or nothing to do with running twice as many
threads.

Quote:
SPECRate Performance
                         int_rate_base  int_rate  fp_rate_base  fp_rate
Itanium 2 SD 64CPU                1108      1108           846      928
P5 595 1.9 64CPU SMT on           1063      1147          1684     1752

The problem Itanic faces above is most likely not lack of SMT but lack
of a decently-scaling server system to run on. While POWER5 systems
scale not too far from linearly up to absurd sizes, HP's Itanic systems
take substantial hits (in this case perhaps due to bandwidth
limitations, since bandwidth is what SPECfp_rate tends to stress) as
they increase in size: 4-socket systems provide nothing like 4x the
SPECfp single-socket performance and 16-socket systems provide nothing
like 4x the SPECfp 4-socket system performance (once you get to 16
sockets Superdome scales fairly linearly up to 64 sockets - at least in
SPECfp_rate - but by then the damage has been done).

I'll suggest that the fact that SGI has posted a 64-socket Itanic
SPECfp_rate score of 1596 (perhaps you missed it) tends to substantiate
the above - and it's not because they're running a few percent faster
clock rate or have more on-chip cache. HP better have done a much
better job with their next spin of Integrity/Superdome than they did
with the current one - though since back when the current one was being
designed they already should have understood its limitations (based on
their experience with the previous iteration plus knowledge of what
POWER, Alpha, and even SPARC had to offer) I'm not counting on it.
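
One rough way to put numbers on "scaling not too far from linearly" is to divide a system's score by the socket count times a single-socket score. The sketch below does that with the Itanium SPECfp_rate figures quoted in this thread; it assumes the 31.2 single-socket result is a fair baseline for all three systems, which is only approximately true since they differ in vendor, clock, and cache:

```python
def scaling_efficiency(score, sockets, single_socket_score):
    """Fraction of ideal linear scaling achieved (1.0 = perfectly linear)."""
    return score / (sockets * single_socket_score)

# Single-socket Itanium 2 SPECfp_rate_base quoted earlier in the thread.
# Using it as the baseline for every system is an assumption.
BASE = 31.2

systems = [
    ("4-socket Itanium (SGI)", 4, 104),
    ("64-socket HP Superdome", 64, 846),
    ("64-socket SGI", 64, 1596),
]

efficiencies = {}
for name, sockets, score in systems:
    efficiencies[name] = scaling_efficiency(score, sockets, BASE)
    print(f"{name:<24} {efficiencies[name]:.0%} of linear")
```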

- bill

Bill Todd
Guest
Posted: Sun Jul 31, 2005 8:15 am    Post subject: Re: Itanium versus Others

Bill Todd wrote:

....

Quote:
POWER5 out-scores Itanic by per-core factors of 3:1 on
large-system benchmarks like TPC-C and SAP SD 2-tier, and by 2:1 on
others like jbb2000.

Sorry - that should be 'SAP SD 3-tier' above: each POWER5 core is worth
only about 2 Itanic cores on SAP SD 2-tier.

- bill

Robert Klute
Guest
Posted: Mon Aug 01, 2005 12:15 am    Post subject: Re: Itanium versus Others

On Sun, 31 Jul 2005 00:26:31 -0400, Bill Todd <billtodd@metrocast.net>
wrote:

Quote:
Robert Klute wrote:
On 30 Jul 2005 13:42:09 -0700, already5chosen@yahoo.com wrote:


John Savard wrote:

On Thu, 28 Jul 2005 21:09:11 -0500, Del Cecchi <cecchinospam@us.ibm.com>
wrote, in part:

You might want to recheck your benchmarks if
you think it takes 4 chips of Power5 to keep up with an Itanium.

One POWER5 module already *has* 4 chips inside it, each one dual-core.

John Savard
http://www.quadibloc.com/index.html

John,
You have seriously misunderstood Power5.
Power5 chips can be packaged into an MCM, but they don't have to be. Only
the high-end pSeries models, p5-595 and p5-570, use MCMs. Smaller
pSeries and OpenPower machines use cheaper packaging (a DCM that consists
of one Power5 chip and an external L3 cache). I think (not sure) the
same applies to iSeries.

As to performance per chip, dual-core Power5 has better throughput than
single-core Itanium2 in just about all available (throughput)
benchmarks. The factor by which Power5 is faster depends on the
benchmark.
For example, in SpecInt2k_rate_base a single Power5 chip scores 31.6 vs.
17.8 for I2. In SpecFp2k_rate_base, Power5=41.5 vs. I2=31.2.


Sort of a mismatch

Not at all: the metric very specifically stated above was 'performance
per chip'. But I'll be happy to inform you about other metrics as well,
since you seem interested.

I know the metrics and where to get them. Still, comparing multi-core
chip to single core chip is not exactly 'all things being equal'.

Quote:
- the IBM number is 2 cores / 4 threads.

For a mid-range, 1.65 GHz POWER5 chip rather than a top-of-the-line 1.9
GHz chip, one might note. Its big brother (not tested in a 2-core
single-chip configuration) manages a considerably more respectable base
score of 125 for 4 cores (two chips) - or 31.25 per core, almost exactly
the same as the single-core Itanic result (and it's not clear that
dual-threading helps all that much in this benchmark, since the
SPECfp_base scores are so close without use of SMT on the POWER5 chip).

I noticed that, but since the comparison was a 1.65 P5 vs a 1.6 Itanium,
I didn't see the need to mention it. Also, I did post the 1.9 vs 1.6
stats below.
Quote:

The HP
number is 1 core / 1 thread.

For a top-speed (1.6 GHz) Itanic. Not the largest L3 cache version, but
considering that it marginally beats an otherwise identical brother
which boasts twice as much cache that doesn't seem like any handicap in
this benchmark.

The difference is the box. SPEC is more than the chip, it is bus and
memory speed too. The 3M number is from an rx1620 with a faster front
side bus than the other, older, numbers.

Quote:
Of course, that's for a single-processor system. IBM doesn't test
POWER5 SPECfp_rate systems that small, so we don't know how much (if
any) better it would have done there. But we can see how well Itanic
does at the same system size (4 cores) as the smallest top-of-the-line
POWER5 system: 104 for a top-of-the-line 4-core Itanic (from SGI, this
time), or only 26 per core vs. POWER5's 31.25 per core.

Yes, the only way to get single core numbers from IBM is to look at the
non-rate numbers.

Quote:
If you compare to 2 cores / 2 threads the
number is 33.2. Number of threads is important to the performance number
on SPEC_rate.

Leaving aside that last phrase (since I already questioned it above),
I'm afraid that I'm having some difficulty finding any such result for
POWER5 at spec.org: could you provide the test URL?

2 core / 2 thread is the Itanium 2 number for rx1620 with 2 1.6/3 cpus
http://www.spec.org/cpu2000/results/res2004q4/cpu2000-20041101-03477.asc


Quote:
In bigger systems the disparity grows:
For four sockets:
SpecInt2k_rate_base Power5=159, I2=72.5
SpecFp2k_rate_base Power5=266, I2=104


Again, you are comparing an IBM number that is twice the cores and 4
times the threads to the Itanium number.

I think I addressed both aspects of that challenge adequately just above
- though again I'll remind you that the specific context of the comment you
responded to was performance-per-chip (or performance-per-socket), not
performance-per-core, let alone performance-per-thread.

I would grant some leeway on threads per core, but still will object to
per socket comparisons over comparing cores.


Quote:
For commercial workloads Itanium is simply no match for Power5.
For example, in the famous TPC-C OLTP benchmark a p5 570 with 8 Power5 cores
(4 chips, 1 MCM) achieves 429,899.7 transactions per minute. Itanium2
can't achieve such a score even with twice the number of CPU cores: the
best I2 result for 16 cores is 332,265.87 transactions per minute.


If you look at comparable IBM P5 to Itanium numbers, performance is
comparable.

That is utter horseshit, even if you restrict comparisons on commercial
workloads (the subject to which you would appear to be responding above,
you will note) to equal numbers of cores rather than equal numbers of
sockets/chips: POWER5 out-scores Itanic by per-core factors of 3:1 on
large-system benchmarks like TPC-C and SAP SD 2-tier, and by 2:1 on
others like jbb2000.

Unless, of course, you prefer just to completely ignore commercial
benchmarks - as you seem to be doing by immediately trying to wrench the
conversation back to SPECint/fp below.

I was just restricting my comparison to the comparison at hand - SPEC.
I even posted IBM's best (1.9) vs HP's best (1.6 where available, 1.5
otherwise) below.

Quote:

SPEC 1 CPU Performance
                  intBase   int  fpBase    fp
Itanium 2 1.6/9M     1590  1590    2712  2712
P5 595 1.9           1392  1452    2585  2796

Even on the SPEC_rate numbers where IBM can run twice the number of
threads, as a result of SMT, the Int rate is about the same as Itanium.

There's that claim again - despite the fact that POWER5 with SMT
disabled manages to run both SPECint and SPECfp (not just the rate
versions) at approximately the same speed as Itanic.

Single core is single thread. EVERY vendor runs this with
multi-threading/SMT/hyperthreading/whatever disabled - it actually slows
the performance.


Quote:
It is only in fp that having twice the threads delivers an advantage.

Sorry: POWER5 does have an advantage in the benchmark results below,
but it likely has little or nothing to do with running twice as many
threads.

On SPEC integer workloads, even with IBM enabling SMT and HP running on
the older 1.5GHz based system, a 64 core SD and a 64 core 595 perform
about the same. On FP the P5 performance is twice that of the SD.
Quote:

SPECRate Performance
                         int_rate_base  int_rate  fp_rate_base  fp_rate
Itanium 2 SD 64CPU                1108      1108           846      928
P5 595 1.9 64CPU SMT on           1063      1147          1684     1752

The problem Itanic faces above is most likely not lack of SMT but lack
of a decently-scaling server system to run on. While POWER5 systems
scale not too far from linearly up to absurd sizes, HP's Itanic systems
take substantial hits (in this case perhaps due to bandwidth
limitations, since bandwidth is what SPECfp_rate tends to stress) as
they increase in size: 4-socket systems provide nothing like 4x the
SPECfp single-socket performance and 16-socket systems provide nothing
like 4x the SPECfp 4-socket system performance (once you get to 16
sockets Superdome scales fairly linearly up to 64 sockets - at least in
SPECfp_rate - but by then the damage has been done).

The change from 4 socket to larger systems is a change in architecture
from runway bus to crossbar, with its higher memory latency, and not a
strict scaling problem.

Quote:
I'll suggest that the fact that SGI has posted a 64-socket Itanic
SPECfp_rate score of 1596 (perhaps you missed it) tends to substantiate
the above - and it's not because they're running a few percent faster
clock rate or have more on-chip cache. HP better have done a much
better job with their next spin of Integrity/Superdome than they did
with the current one - though since back when the current one was being
designed they already should have understood its limitations (based on
their experience with the previous iteration plus knowledge of what
POWER, Alpha, and even SPARC had to offer) I'm not counting on it.

I didn't miss it, I was just sticking to a single vendor - HP. Although
I probably should have used the SGI number, since it uses the 1.6/9M
chip and would have been more consistent in the numbers I posted, even
if it represented an architecture change.

Rupert Pigott
Guest
Posted: Mon Aug 01, 2005 5:39 am    Post subject: Re: Itanium versus Others

John Savard wrote:

[SNIP]

Quote:
Since the POWER5 appears 'exotic', it does seem to me that the Itanium
is just about the only 'supercomputer-like' chip easily available; and,
of course, a single core providing the same throughput as eight cores
will do it with considerably less latency as well.

Hardly exotic. Multi-core chips have been with us for a while now, and
it sells significantly more units than Itanic does (if you believe
the market analysts). These days you can buy multi-core chips over the
counter for your PC (AMD X2s and the new Opterons). :)

David Kanter
Guest
Posted: Mon Aug 01, 2005 8:15 am    Post subject: Re: Itanium versus Others

Quote:
Hardly exotic. Multi-core chips have been with us for a while now, and
it sells significantly more units than Itanic does (if you believe
the market analysts). These days you can buy multi-core chips over the
counter for your PC (AMD X2s and the new Opterons). :)

Quote:
That wasn't quite what I meant.

The POWER5 is simply 'exotic' because it is less easy to find over the
counter than the Itanium.

There's an understatement.

Quote:
As for multi-core, what I have against it is simply that it isn't really
better - except for Microsoft licensing policies - than just having
several chips on the same motherboard. Which avoids yield issues (such
as, of course, bedevil the Itanium).

I strongly disagree. The interconnect between two CPUs on the same die
can be driven much faster and wider than an interconnect over PCB.
Similarly, two CPUs can then share the highest level of cache, which
lets you do all sorts of really neat tricks with locks, sharing and
your CC protocol.

Furthermore, I doubt Intel has yield problems with Itanium; I have yet
to see any proof of the sort, and the nature of the chip (mostly cache)
suggests that Itanium yields better than many other competing chips. I
think you meant to refer to binning problems (having to run both CPUs
at the slowest speed of the two), is that correct?

Quote:
If one looks at *latency* *instead* of throughput, because throughput
can be increased without limit by using more of the same kind of chip -
whatever kind of chip - then it only stands to naive reason that if chip
A gets the same throughput as chip B with half the cores... it likely
has half the latency.

What architecture, or design, or chip, relatively easily available (i.e.
without buying a whole SX-6r around it) has the most throughput per core
- or, more specifically, the lowest latency?

Could you define your terms a little more specifically? What are throughput
and latency measured in? TPC-C transactions/sec and the latency to
finish a transaction?

David

John Savard
Guest
Posted: Mon Aug 01, 2005 8:15 am    Post subject: Re: Itanium versus Others

On 31 Jul 2005 17:39:08 -0700, "Rupert Pigott" <darkboong@hotmail.com>
wrote, in part:

Quote:
Hardly exotic. Multi-core chips have been with us for a while now, and
it sells significantly more units than Itanic does (if you believe
the market analysts). These days you can buy multi-core chips over the
counter for your PC (AMD X2s and the new Opterons). :)

That wasn't quite what I meant.

The POWER5 is simply 'exotic' because it is less easy to find over the
counter than the Itanium.

As for multi-core, what I have against it is simply that it isn't really
better - except for Microsoft licensing policies - than just having
several chips on the same motherboard. Which avoids yield issues (such
as, of course, bedevil the Itanium).

If one looks at *latency* *instead* of throughput, because throughput
can be increased without limit by using more of the same kind of chip -
whatever kind of chip - then it only stands to naive reason that if chip
A gets the same throughput as chip B with half the cores... it likely
has half the latency.

What architecture, or design, or chip, relatively easily available (i.e.
without buying a whole SX-6r around it) has the most throughput per core
- or, more specifically, the lowest latency?

Is the Itanium the answer? If so, Intel is right to believe in it.

John Savard
http://www.quadibloc.com/index.html

Bill Todd
Guest
Posted: Mon Aug 01, 2005 4:15 pm    Post subject: Re: Itanium versus Others

John Savard wrote:
Quote:
On 31 Jul 2005 17:39:08 -0700, "Rupert Pigott" <darkboong@hotmail.com>
wrote, in part:

Hardly exotic. Multi-core chips have been with us for a while now, and
it sells significantly more units than Itanic does (if you believe
the market analysts). These days you can buy multi-core chips over the
counter for your PC (AMD X2s and the new Opterons). :)


That wasn't quite what I meant.

The POWER5 is simply 'exotic' because it is less easy to find over the
counter than the Itanium.

As for multi-core, what I have against it is simply that it isn't really
better - except for Microsoft licensing policies - than just having
several chips on the same motherboard.

That is blatantly incorrect.

Besides the performance aspects which David just mentioned, multi-core
chips can significantly reduce associated board costs for a given level
of performance - not only by reducing the parts and trace counts but by
allowing the use of higher-volume (and hence far lower-price) boards
(since board volume over which development costs can be spread tends to
decrease wildly as the socket count increases).

Quote:
Which avoids yield issues (such
as, of course, bedevil the Itanium).

Why? Leaving aside your questionable implication about Itanic yields,
when you have multiple cores on a chip you don't have to throw the whole
thing away if one core has a defect in it.
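
A back-of-the-envelope sketch of that salvage argument, assuming a simple Poisson defect model (yield = exp(-area x defect density)) and ignoring shared logic outside the cores; the area and defect density are invented for illustration and say nothing about real Itanium or POWER5 yields:

```python
import math

def poisson_yield(area_cm2, defects_per_cm2):
    """Probability that a region of the given area has zero defects."""
    return math.exp(-area_cm2 * defects_per_cm2)

# Hypothetical values, chosen only to make the arithmetic visible.
CORE_AREA = 1.0        # cm^2 per core
DEFECT_DENSITY = 0.5   # defects per cm^2

one_core_good = poisson_yield(CORE_AREA, DEFECT_DENSITY)

# Dual-core die: sellable as a full part only if both cores are clean...
both_good = one_core_good ** 2
# ...but still sellable as a reduced part if at least one core is clean.
at_least_one_good = 1 - (1 - one_core_good) ** 2

print(f"single core clean:         {one_core_good:.1%}")
print(f"dual-core, both clean:     {both_good:.1%}")
print(f"dual-core, >= 1 core clean: {at_least_one_good:.1%}")
```

Under these assumptions fewer dual-core dies are fully clean, but more of them are sellable in some form than single-core dies of one core's area.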

Quote:

If one looks at *latency* *instead* of throughput, because throughput
can be increased without limit by using more of the same kind of chip -
whatever kind of chip - then it only stands to naive reason that if chip
A gets the same throughput as chip B with half the cores... it likely
has half the latency.

'Naive' being the very operative term here. Latency and throughput are
largely orthogonal to each other for many workloads, rather than
enjoying the kind of relationship you suggest. There are streaming
workloads, for example, where latency is close to wholly irrelevant, but
throughput is king.
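
Little's law (items in flight = throughput x latency) is one way to see this: at a fixed throughput, latency depends on how much work is in flight, not on the throughput alone. A small sketch with hypothetical numbers:

```python
def mean_latency(in_flight, throughput):
    """Little's law rearranged: average latency = items in flight / throughput."""
    return in_flight / throughput

# Two hypothetical systems with identical throughput (1000 items per second).
THROUGHPUT = 1000.0

serial = mean_latency(in_flight=1, throughput=THROUGHPUT)      # one item at a time
pipelined = mean_latency(in_flight=64, throughput=THROUGHPUT)  # many items overlapped

print(f"serial:    {serial * 1000:.1f} ms per item")
print(f"pipelined: {pipelined * 1000:.1f} ms per item")
```

Both systems finish the same number of items per second, yet each item takes 64 times longer in the second one, which is why equal throughput with half the cores does not by itself imply half the latency.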

- bill

All times are GMT