| Author |
Message |
Oliver S.
Guest
|
Posted:
Fri Sep 23, 2005 4:15 pm Post subject:
What do you think of Sun's Niagara |
|
|
I've recently read an article on Ace's Hardware about Sun's upcoming
Niagara-CPU ([1]). There's a lot of speculation and vague sophistry in
this article, but these are the facts about this CPU:
- Ultra-Sparc compatible architecture
- eight cores, each with 32kB data-cache and 64kB instruction-cache
- 4 or 8MB L2-cache
- each core is a simple in-order core which can execute one
instruction per clock-cycle
- each core swaps to another thread when an instruction stalls,
e.g. by accessing the L2-cache
- integrated ethernet-controller with tcp/ip offload engine to
improve tcp/ip-throughput
- unfortunately no SMP-support in the first version
- FP-performance which isn't worth mentioning (and isn't even
necessary for a server-cpu)
- about
I think that this is the best way for Sun to go. Sun doesn't seem to
be able to catch up with the single-threaded performance of current
x86-cores, but fortunately they're mainly in the server-market, where
single-threaded performance doesn't play a big role; in contrast to
the x86-market, where single-threaded performance counts a lot because
most x86-CPUs sold are desktop-CPUs. Thus, designing CPUs with simple
cores like the Niagara-cores isn't an option for AMD or Intel.
[1] http://www.aceshardware.com/read.jsp?id=65000292 |
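The switch-on-stall behaviour listed above can be sketched with a toy model. This is only an illustration of the description in the post, not Niagara's actual pipeline: the function name and the figures used below (10 instructions of work between stalls, 30-cycle stalls) are assumptions chosen to make the effect visible.

```python
def core_utilization(num_threads, work_cycles, stall_cycles, total_cycles=10_000):
    """Fraction of cycles in which the core issues an instruction.

    Toy model (all parameters are hypothetical): every thread repeats
    `work_cycles` single-cycle instructions followed by one stall of
    `stall_cycles` (for example a cache miss). The core issues at most
    one instruction per cycle and moves to another ready thread at no
    cost whenever the current thread is stalled.
    """
    ready_at = [0] * num_threads              # cycle at which each thread may issue again
    remaining = [work_cycles] * num_threads   # instructions left before the next stall
    issued = 0
    current = 0
    for cycle in range(total_cycles):
        # Prefer the current thread, then try the others in round-robin order.
        for offset in range(num_threads):
            t = (current + offset) % num_threads
            if ready_at[t] <= cycle:
                issued += 1
                remaining[t] -= 1
                if remaining[t] == 0:
                    # The thread stalls and becomes ready again later.
                    ready_at[t] = cycle + 1 + stall_cycles
                    remaining[t] = work_cycles
                current = t
                break
    return issued / total_cycles
```

With these assumed numbers a single thread keeps the core busy 25% of the time, two threads 50%, and four threads fill every cycle, which is the argument for trading single-thread speed for thread count.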
|
|
 |
Joe Seigh
Guest
|
Posted:
Fri Sep 23, 2005 4:15 pm Post subject:
Re: What do you think of Sun's Niagara |
|
|
Oliver S. wrote:
| Quote: | I've recently read an article on Ace's Hardware about Sun's upcoming
Niagara-CPU ([1]). There's a lot of speculation and vague sophistry in
this article, but these are the facts about this CPU:
[...]
I think that this is the best way for Sun to go. Sun doesn't seem to
be able to catch up with the single-threaded performance of current
x86-cores, but fortunately they're mainly in the server-market, where
single-threaded performance doesn't play a big role; in contrast to
the x86-market, where single-threaded performance counts a lot because
most x86-CPUs sold are desktop-CPUs. Thus, designing CPUs with simple
cores like the Niagara-cores isn't an option for AMD or Intel.
[1] http://www.aceshardware.com/read.jsp?id=65000292
|
Their present strategy seems to be a commodity based one, i.e. an
8x4 Niagara based server will be cheaper than 16 dual core Opteron
based blade servers on a cost/performance basis.
If and when Sun goes to SMP support, or anybody for that matter
with massively multicored cpus, they are going to have to depend
on the applications exploiting these things. They're sort of
employing a "Field of Dreams" strategy, "If you build it, they
will come". This is sort of ignoring the fact that software,
not just hardware, has a development cycle. One that is longer
in fact. And software development is a little reactive. They
don't start trying to exploit these things until after they
become available.
Couple that with the fact that software development using distributed
programming on networks of cheap commodity hardware has been around
for a while and Sun's position doesn't look any better.
--
Joe Seigh
When you get lemons, you make lemonade.
When you get hardware, you make software. |
|
|
 |
Oliver S.
Guest
|
Posted:
Fri Sep 23, 2005 9:53 pm Post subject:
Re: What do you think of Sun's Niagara |
|
|
| Quote: | - 4 or 8MB L2-cache
|
I read today that Niagara will have "only" 3MB L2-Cache.
| Quote: | - <other features> ...
|
An additional "feature" is that the cores have no branch-predictor;
but that's quite ok when a core can switch to another thread when a
branch has been taken. And for tight loops, loop-unrolling by the
compiler will become usual again. |
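The loop-unrolling remark can be illustrated with a small sketch. Python is used here only to show the transformation and to count loop-back branches; the benefit under discussion applies to compiled machine code on a core without a branch predictor. The function names and the unroll factor of 4 are assumptions for the example.

```python
def sum_rolled(values):
    """Plain loop: one loop-back branch per element."""
    total = 0
    branches = 0
    i = 0
    while i < len(values):
        total += values[i]
        i += 1
        branches += 1   # one loop iteration, hence one loop-back branch
    return total, branches


def sum_unrolled_by_4(values):
    """Same sum, unrolled by 4: one loop-back branch per four elements."""
    total = 0
    branches = 0
    n = len(values)
    limit = n - n % 4   # largest multiple of 4 not exceeding n
    i = 0
    while i < limit:
        total += values[i] + values[i + 1] + values[i + 2] + values[i + 3]
        i += 4
        branches += 1
    # Remainder loop for the last 0 to 3 elements.
    while i < n:
        total += values[i]
        i += 1
        branches += 1
    return total, branches
```

For ten elements both versions return the same sum, but the unrolled one goes around a loop 4 times instead of 10.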
|
|
 |
Guest
|
Posted:
Sat Sep 24, 2005 12:00 am Post subject:
Re: What do you think of Sun's Niagara |
|
|
| Quote: | I read today that Niagara will have "only" 3MB L2-Cache.
|
So many threads competing for so little cache..... |
|
|
 |
Oliver S.
Guest
|
Posted:
Sat Sep 24, 2005 12:15 am Post subject:
Re: What do you think of Sun's Niagara |
|
|
| Quote: | So many threads competing for so little cache.....
|
That won't be different if you share the cache with the same number
of threads on a uniprocessor machine capable of executing only one
thread.
But I forgot to mention an additional feature of Niagara: As running
such a large number of threads lets the cpu-design become more band-
width dependent and less latency-dependent, Niagara features four
dual-channel DDR2-channels; so "only" having 3MB L2-cache isn't
such a big problem. |
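The bandwidth-versus-latency point can be put in numbers with Little's law (bytes in flight = bandwidth x latency). A minimal sketch follows; the 20 GB/s bandwidth, 100 ns latency and 64-byte line size used in the note below are illustrative assumptions, not Niagara specifications, and the function names are made up for the example.

```python
def lines_in_flight(bandwidth_gb_per_s, latency_ns, line_bytes=64):
    """Cache lines that must be outstanding to keep the memory system busy.

    Little's law: bytes in flight = bandwidth * latency.
    1 GB/s (decimal) equals 1 byte/ns, so GB/s * ns gives bytes directly.
    """
    bytes_in_flight = bandwidth_gb_per_s * latency_ns
    return bytes_in_flight / line_bytes


def misses_per_thread(bandwidth_gb_per_s, latency_ns, threads, line_bytes=64):
    """Outstanding misses each thread must sustain if the load is shared evenly."""
    return lines_in_flight(bandwidth_gb_per_s, latency_ns, line_bytes) / threads
```

Under these assumed figures about 31 lines must be in flight. A single thread would have to overlap all of those misses itself, whereas 32 simple threads that each block on one miss at a time are already enough, which is why such a design leans on bandwidth rather than on low latency.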
|
|
 |
Terje Mathisen
Guest
|
Posted:
Sat Sep 24, 2005 12:15 am Post subject:
Re: What do you think of Sun's Niagara |
|
|
Oliver S. wrote:
| Quote: | So many threads competing for so little cache.....
That won't be different if you share the cache with the same number
of threads on a uniprocessor machine capable of executing only one
thread.
But I forgot to mention an additional feature of Niagara: As running
such a large number of threads lets the cpu-design become more band-
width dependent and less latency-dependent, Niagara features four
dual-channel DDR2-channels; so "only" having 3MB L2-cache isn't
such a big problem.
|
Is it just me or does this sound like a watered-down version of Tera?
Terje
--
- <Terje.Mathisen@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching" |
|
|
 |
John Mashey
Guest
|
Posted:
Sat Sep 24, 2005 12:15 am Post subject:
Re: What do you think of Sun's Niagara |
|
|
Terje Mathisen wrote:
| Quote: | Oliver S. wrote:
So many threads competing for so little cache.....
That won't be different if you share the cache with the same number
of threads on a uniprocessor machine capable of executing only one
thread.
But I forgot to mention an additional feature of Niagara: As running
such a large number of threads lets the cpu-design become more band-
width dependent and less latency-dependent, Niagara features four
dual-channel DDR2-channels; so "only" having 3MB L2-cache isn't
such a big problem.
Is it just me or does this sound like a watered-down version of Tera?
|
I don't think so [and I've been on panels with Burton Smith a couple
of times.]
TERA MTA was using threads primarily to boost performance on big
parallel technical compute problems, depending strongly on compilers to
do the Right Thing. As far as I can tell, the compiler work (Preston
Briggs & co) was first-rate, but this is a tough problem. In general,
multiprocessors (or equivalents) have had far more commercial success
in running throughput problems of unrelated or loosely-coupled tasks,
than in trying to parallelize a single task and make it run fast. The
latter is very important to those who care (big science and
engineering), and it can be very effective on certain kinds of code
(big CFD, FE), but I don't think that's Niagara's target.
It is closer to the Raza XLR that I mentioned before, although that is
really a network processor, but there are quite a few similarities
[eight 4-threaded, relatively simple CPUs]. |
|
|
 |
Dan Koren
Guest
|
Posted:
Sat Sep 24, 2005 6:32 am Post subject:
Re: What do you think of Sun's Niagara |
|
|
"Terje Mathisen" <terje.mathisen@hda.hydro.com> wrote in message
news:dh1pmd$kit$1@osl016lin.hda.hydro.com...
| Quote: | Oliver S. wrote:
So many threads competing for so little cache.....
That won't be different if you share the cache with the same number
of threads on a uniprocessor machine capable of executing only one
thread.
But I forgot to mention an additional feature of Niagara: As running
such a large number of threads lets the cpu-design become more band-
width dependent and less latency-dependent, Niagara features four
dual-channel DDR2-channels; so "only" having 3MB L2-cache isn't
such a big problem.
Is it just me or does this sound like a watered-down version of Tera?
|
If memory serves, the original Tera design was
synchronous.
dk |
|
|
 |
Seongbae Park
Guest
|
Posted:
Sat Sep 24, 2005 8:15 am Post subject:
Re: What do you think of Sun's Niagara |
|
|
Terje Mathisen wrote:
| Quote: | Oliver S. wrote:
So many threads competing for so little cache.....
That won't be different if you share the cache with the same number
of threads on a uniprocessor machine capable of executing only one
thread.
But I forgot to mention an additional feature of Niagara: As running
such a large number of threads lets the cpu-design become more band-
width dependent and less latency-dependent, Niagara features four
dual-channel DDR2-channels; so "only" having 3MB L2-cache isn't
such a big problem.
Is it just me or does this sound like a watered-down version of Tera?
Terje
|
At a casual glance they sound similar,
because both try to increase the utilization of memory bandwidth
and execution resources by running multiple threads,
and both can do a context switch every cycle with no overhead.
But the similarity stops about there.
The differences I can see:
- Tera targets HPTC whereas Niagara is for commercial/network workloads.
- Tera had more threads (128, IIRC) than Niagara's 32.
- IIRC, Tera's more like a single core with 128 threads,
whereas Niagara is really 8 independent cores with 4 threads each.
- Tera's maximum possible issue rate per thread is something like
1 per 20 cycles or so, whereas each core of Niagara can
continuously issue one instruction per cycle from a single thread
if the other threads are all idle.
- Tera is VLIW, and Niagara is single-issue scalar.
- Tera supports multiprocessors but Niagara is strictly a single chip.
- Tera doesn't have cache, but Niagara does.
- Tera has a good floating-point unit, but Niagara doesn't.
- Tera relies on a good compiler for auto-parallelization and
other features, but Niagara doesn't.
- Tera needs "porting" of the user application
whereas Niagara doesn't.
- Tera has a fine-grained synchronization feature
whereas Niagara has only traditional synchronization.
- Tera-based systems and their target market are much more high-end,
whereas Niagara is targeted at a totally different
segment of the market.
- Power consumption was never a worry for Tera,
whereas it was a very important design factor for Niagara.
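The issue-rate difference in this comparison can be made concrete with a rough bound. This is a sketch under the figures quoted in the post (about 1 issue per 20 cycles per thread for Tera, 1 per cycle for a Niagara core), both of which the post itself marks as approximate; the function name and the fair-sharing assumption are hypothetical.

```python
def per_thread_issue_rate(active_threads, min_cycles_between_issues):
    """Upper bound on instructions per cycle for one thread.

    Assumes the core issues one instruction per cycle in total, shares
    issue slots evenly among active threads, and that one thread cannot
    issue more often than once every `min_cycles_between_issues` cycles.
    """
    fair_share = 1 / active_threads
    pipeline_limit = 1 / min_cycles_between_issues
    return min(fair_share, pipeline_limit)


def threads_to_saturate(min_cycles_between_issues):
    """Fewest active threads needed to fill every issue slot in this model."""
    return min_cycles_between_issues
```

In this model a lone thread on a Niagara-style core can reach 1 instruction per cycle, while a lone thread on a Tera-style core is capped at 0.05, so the latter needs roughly 20 runnable threads before the pipeline is full.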
--
#pragma ident "Seongbae Park, compiler, http://blogs.sun.com/seongbae/" |
|
|
 |
Alex Colvin
Guest
|
Posted:
Sat Sep 24, 2005 1:06 pm Post subject:
Re: What do you think of Sun's Niagara |
|
|
| Quote: | Is it just me or does this sound like a watered-down version of Tera?
|
missing some of the interesting bits, namely the full/empty memory tags
for interthread synchronization. sun's looks like a server engine
rather than an array processor.
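For readers unfamiliar with full/empty tags, the idea can be imitated in software. The sketch below is an analogy only: on the hardware being discussed the tag is attached to each memory word, whereas here a lock and condition variable stand in for it. The class and method names are hypothetical.

```python
import threading


class FullEmptyCell:
    """Software imitation of one memory word with a full/empty tag."""

    def __init__(self):
        self._cond = threading.Condition()
        self._full = False
        self._value = None

    def write_when_empty(self, value):
        """Block until the cell is empty, store the value, mark it full."""
        with self._cond:
            while self._full:
                self._cond.wait()
            self._value = value
            self._full = True
            self._cond.notify_all()

    def read_when_full(self):
        """Block until the cell is full, take the value, mark it empty."""
        with self._cond:
            while not self._full:
                self._cond.wait()
            value = self._value
            self._full = False
            self._cond.notify_all()
            return value


def run_demo(count=5):
    """Pass `count` integers from a producer thread to the caller."""
    cell = FullEmptyCell()

    def producer():
        for i in range(count):
            cell.write_when_empty(i)

    worker = threading.Thread(target=producer)
    worker.start()
    received = [cell.read_when_full() for _ in range(count)]
    worker.join()
    return received
```

Because every write waits for "empty" and every read waits for "full", the values arrive in order without any separate lock or queue in the user's code, which is the kind of fine-grained interthread synchronization the tags provide.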
--
mac the naïf |
|
|
 |
Nick Maclaren
Guest
|
Posted:
Sat Sep 24, 2005 2:07 pm Post subject:
Re: What do you think of Sun's Niagara |
|
|
In article <dh1pmd$kit$1@osl016lin.hda.hydro.com>,
Terje Mathisen <terje.mathisen@hda.hydro.com> wrote:
| Quote: |
Is it just me or does this sound like a watered-down version of Tera?
|
No, not at all. I am a bit pissed off with Sun's approach to supplying
information, as I have neither been able to get the information I want
on it nor to find a decent public writeup. So I regret that I can't
post what I do know, despite being pretty sure that it is no longer
confidential.
To me, it sounds interesting, but its success will depend very much
on how well the software can support it.
Regards,
Nick Maclaren. |
|
|
 |
Terje Mathisen
Guest
|
Posted:
Sat Sep 24, 2005 4:15 pm Post subject:
Re: What do you think of Sun's Niagara |
|
|
Alex Colvin wrote:
| Quote: | Is it just me or does this sound like a watered-down version of Tera?
missing some of the interesting bits, namely the full/empty memory tags
for interthread synchronization. sun's looks like a server engine
rather than an array processor.
|
I agree, the absence of proper fp, the presence of cache etc. make it
obvious they were designed for different target markets.
The part that reminds me of Tera is the _very_low_overhead_ thread
switching, in the hope that the next thread might be able to get some
useful work done while this one is waiting.
Or did I misunderstand this as well?
Terje
--
- <Terje.Mathisen@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching" |
|
|
 |
Milos Becvar
Guest
|
Posted:
Mon Sep 26, 2005 12:15 am Post subject:
Re: What do you think of Sun's Niagara |
|
|
Nick Maclaren wrote:
| Quote: | In article <dh1pmd$kit$1@osl016lin.hda.hydro.com>,
Terje Mathisen <terje.mathisen@hda.hydro.com> wrote:
Is it just me or does this sound like a watered-down version of Tera?
No, not at all. I am a bit pissed off with Sun's approach to supplying
information, as I have neither been able to get the information I want
on it nor to find a decent public writeup. So I regret that I can't
post what I do know, despite being pretty sure that it is no longer
confidential.
To me, it sounds interesting, but its success will depend very much
on how well the software can support it.
Regards,
Nick Maclaren.
|
There is a relatively detailed description of the Niagara architecture
in the March-April 2005 issue of IEEE Micro.
It looks like a "cut and paste" version of the 5-stage integer SPARC
pipeline found in textbooks, only with an additional stage for the ability to
switch between 4 threads on a cycle-by-cycle basis (fine-grained
multithreading as in TERA).
It is also amazing that every thread has 8 SPARC register windows
available and that such a register file is accessible within a single cycle
(they use slow backup memory to hold the rest of the register windows, and only
the current window + global registers are implemented using a fast multiported
register file. Transfers between the register file and backup memory occur
during save and restore instructions - I think that 16 64b registers are
copied during two cycles. This latency is hidden by other threads.)
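The register-file sizes implied by this description can be worked out with a little arithmetic. The sketch assumes the standard SPARC window layout (8 ins, 8 locals, 8 outs per window, with a window's ins overlapping the caller's outs, so each additional window adds 16 registers) and a single set of 8 globals; the real chip may carry additional global sets, so the totals are illustrative only.

```python
REGS_PER_NEW_WINDOW = 16   # 8 locals + 8 outs; the ins overlap the caller's outs
VISIBLE_WINDOW_REGS = 24   # 8 ins + 8 locals + 8 outs seen by the running code
GLOBAL_REGS = 8            # assumption: one set of globals per thread


def windowed_regs_per_thread(windows):
    """Registers needed to hold all windows of one thread."""
    return REGS_PER_NEW_WINDOW * windows


def fast_regs_per_thread():
    """Registers one thread needs in the fast file: current window + globals."""
    return VISIBLE_WINDOW_REGS + GLOBAL_REGS


def core_totals(threads, windows):
    """(registers in backing storage, registers in the fast file) per core."""
    backing = threads * windowed_regs_per_thread(windows)
    fast = threads * fast_regs_per_thread()
    return backing, fast
```

For 4 threads with 8 windows each, this gives 512 windowed registers in the backing store against 128 in the fast multiported file, and a save or restore moves one window's worth of 16 new registers, matching the "16 64b registers" mentioned above.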
The only question for me is the relatively small cache size of 16KB for L1I
and 8KB for L1D. And the shared L2 is "only" 3MB for 32 threads!
They claim that temporal locality is poor in commercial workloads and
that latency would be hidden by multithreading, but it still sounds relatively
low.
If we compare Niagara with the Intel Montecito 1.72 billion transistor
monster also described in that issue of IEEE Micro, Niagara wins in
terms of clean and nice architecture. For 60W you can have 32 threads on
Niagara, or 4 threads on Montecito for 100+W. On the other hand,
Montecito is also targeted at technical workloads where Niagara cannot
compete.
Anyway, the market will decide. I have a bad feeling that clean and nice
architectures are not successful on the market, which does not sound good
for Niagara :-)
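The figures quoted in this post can be turned into per-thread numbers with some back-of-the-envelope arithmetic. The sketch assumes an even split of each cache among its threads and uses only the sizes, thread counts and wattages stated above (with 4 threads per core, as mentioned elsewhere in the thread); the helper names are made up for the example.

```python
KB = 1024
MB = 1024 * KB


def share_per_thread(cache_bytes, threads):
    """Even split of a cache among its threads, in bytes."""
    return cache_bytes // threads


def threads_per_watt(threads, watts):
    """Hardware threads available per watt of chip power."""
    return threads / watts


l2_share = share_per_thread(3 * MB, 32)    # shared L2 across all 32 threads
l1d_share = share_per_thread(8 * KB, 4)    # per-core L1D, 4 threads per core
l1i_share = share_per_thread(16 * KB, 4)   # per-core L1I, 4 threads per core
niagara = threads_per_watt(32, 60)
montecito = threads_per_watt(4, 100)       # an upper bound, since the post says "100+W"
```

That works out to 96KB of L2, 2KB of L1D and 4KB of L1I per thread, and roughly 0.53 threads per watt for Niagara against at most 0.04 for Montecito, which is the trade-off between the two designs in a nutshell.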
Milos Becvar |
|
|
 |
Nick Maclaren
Guest
|
Posted:
Mon Sep 26, 2005 8:15 am Post subject:
Re: What do you think of Sun's Niagara |
|
|
In article <dh70cc$11jg$1@ns.felk.cvut.cz>,
Milos Becvar <becvarm@fel.cvut.cz> wrote:
| Quote: |
There is a relatively detailed description of the Niagara architecture
in the March-April 2005 issue of IEEE Micro.
|
Thanks for the reminder. It wasn't in when I last looked. I must
take another look.
Regards,
Nick Maclaren. |
|
|
 |
Graeme Gill
Guest
|
Posted:
Mon Sep 26, 2005 8:15 am Post subject:
Re: What do you think of Sun's Niagara |
|
|
Milos Becvar wrote:
| Quote: | There is a relatively detailed description of the Niagara architecture
in the March-April 2005 issue of IEEE Micro.
|
There was some interesting stuff in the May-June issue by
Chaudhry et al. of Sun, under "High-Performance Throughput Computing"
as well. The "hardware scout" aimed at improving single
threaded performance seemed an intriguing alternative to
a lot of the complexity in other processors.
Graeme Gill.
--
Reply to graeme_not@argyllcms_not.com_not (remove three _not's from the address) |
|