Pretty good explanation of x86-64 by HP
Del Cecchi
Guest
Posted: Tue Dec 07, 2004 7:28 pm    Post subject: Re: Pretty good explanation of x86-64 by HP

"Grumble" <devnull@kma.eu.org> wrote in message
news:cp4djh$vdh$1@news-rocq.inria.fr...
Quote:
Del Cecchi wrote:

What braindamaged newsreader are you using that won't let you right
click the link in the newsreader? Even OE does that. So quit whining
and switch to a decent newsreader.

Speaking of brain-damaged newsreaders, take a look at the mess yours
did when you quoted John's message. I rest my case.

A few lines got wrapped. That what you are talking about?

del

Grumble
Guest
Posted: Tue Dec 07, 2004 8:19 pm    Post subject: Re: Pretty good explanation of x86-64 by HP

Del Cecchi wrote:

Quote:
Grumble wrote:

Del Cecchi wrote:

What braindamaged newsreader are you using that won't let you
right click the link in the newsreader? Even OE does that.
So quit whining and switch to a decent newsreader.

Speaking of brain-damaged newsreaders, take a look at the mess
yours did when you quoted John's message. I rest my case.

A few lines got wrapped. That what you are talking about?

Yessir!

Perhaps OE-QuoteFix might help if you must use OE?

Guest
Posted: Tue Dec 07, 2004 9:11 pm    Post subject: Re: Pretty good explanation of x86-64 by HP

Yes, it is clear that the memory controller (and the rest of the
NorthBridge) operates at CPU frequency. However, the DRAM controller
operates at DRAM frequency (address rate).

CPU<->NB<->MC<->DC<->DRAM

Greg Lindahl
Guest
Posted: Tue Dec 07, 2004 10:56 pm    Post subject: Re: Pretty good explanation of x86-64 by HP

In article <pan.2004.12.07.01.37.06.417847@att.bizzzz>,
keith <krw@att.bizzzz> wrote:

Quote:
Note that the STREAM bandwidth and lmbench latency change with every
CPU speed bump. So clearly part of the memory controller is at the cpu
core frequency, or a related frequency, and not at the HT frequency,
or the SDRAM external bus frequency.

That does *not* mean that the memory controller runs at the core speed.
It would be nuts to assume such. Would you assume the caches of the
PII run at the I/O bus speed?

"or a related frequency", i.e. based on the cpu frequency with a
constant divider.
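
To make the "constant divider" argument concrete, here is a minimal sketch in Python. It assumes a simple model that is not from the thread: a fixed number of controller cycles clocked at cpu_mhz / divider, plus a fixed DRAM access time. The function name and every number (40 cycles, 50 ns, the divider of 1) are made up for illustration.

```python
def controller_latency_ns(cpu_mhz, divider, controller_cycles, dram_ns):
    """Load latency when part of the path runs at cpu_mhz / divider.

    Hypothetical model: controller_cycles are spent in a clock domain
    derived from the CPU clock; dram_ns is independent of the CPU clock.
    """
    controller_mhz = cpu_mhz / divider
    controller_ns = controller_cycles * 1000.0 / controller_mhz
    return controller_ns + dram_ns


# Illustrative numbers only: 40 controller cycles, 50 ns of DRAM time.
for cpu_mhz in (1800, 2000, 2200, 2400):
    latency = controller_latency_ns(cpu_mhz, divider=1,
                                    controller_cycles=40, dram_ns=50.0)
    print(f"{cpu_mhz} MHz CPU: {latency:.1f} ns")
```

In this model the measured latency shrinks with every CPU speed bump, whatever the divider is, as long as the divider stays constant. A controller clocked from the HT or DRAM bus would give the same latency at every CPU speed.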

Quote:
Please reduce the cross-post. Followups set to a group I read.

Isn't this a rather egotistical statement?

No, it follows Usenet tradition: post only to groups that you read.

But thanks for giving me the benefit of the doubt.

-- greg

Eric C. Fromm
Guest
Posted: Wed Dec 08, 2004 12:32 am    Post subject: Re: Pretty good explanation of x86-64 by HP

Janne Blomqvist wrote:

Quote:

By the time dual core Opterons arrive, I suspect that DDR2-800 will
also be available, thus providing twice the memory BW compared to the
current single core offerings using DDR-400.


And unless the HyperTransport channels get faster or more are added for the
dual core chip, non-NUMA aware kernels and applications might not always
see the full benefits of that bandwidth doubling. I also wonder how many
DIMMs can be reliably configured on a DDR2-800 bus. There might well
be a capacity trade off required at those speeds.
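
The "twice the memory BW" figure is simple arithmetic: peak bandwidth is the transfer rate times the bus width. A small sketch, assuming 64-bit channels and two channels per socket (both are assumptions here; the function and parameter names are illustrative):

```python
def peak_bandwidth_gb_s(mega_transfers_per_s, bus_width_bits=64, channels=1):
    """Theoretical peak bandwidth in GB/s (10**9 bytes per second)."""
    bytes_per_transfer = bus_width_bits // 8
    return mega_transfers_per_s * 1e6 * bytes_per_transfer * channels / 1e9


# Assumed configuration: dual-channel, 64 bits per channel.
ddr_400 = peak_bandwidth_gb_s(400, channels=2)
ddr2_800 = peak_bandwidth_gb_s(800, channels=2)
print(f"DDR-400:  {ddr_400:.1f} GB/s")
print(f"DDR2-800: {ddr2_800:.1f} GB/s")
print(f"ratio:    {ddr2_800 / ddr_400:.1f}x")
```

These are peak numbers per socket. What a remote CPU actually sees also depends on the HyperTransport link between the sockets, which is the concern raised above.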

--
Eric C. Fromm efromm@sgi.com
Principal Engineer Scalable Systems Division
SGI - Silicon Graphics, Inc. Chippewa Falls, Wi.

Greg Lindahl
Guest
Posted: Wed Dec 08, 2004 2:22 am    Post subject: Re: Pretty good explanation of x86-64 by HP

In article <cp50gt$2gvbd1$1@fido.engr.sgi.com>,
Eric C. Fromm <efromm@sgi.com> wrote:

Quote:
And unless the HyperTransport channels get faster or more are added for the
dual core chip, non-NUMA aware kernels and applications might not always
see the full benefits of that bandwidth doubling.

Right. AMD has a roadmap for HT to address this issue. However, there
will be a large number of single-socket systems and systems running
processes that control their locality pretty well (MPI usually falls
into this category) who will all see the full benefit.
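
One way a process can "control its locality" is to pin itself to a single CPU. The sketch below uses Python's os.sched_setaffinity, which exists only on Linux; the helper name pin_to_cpu is made up for this example. Whether memory then ends up on the local node depends on the kernel's allocation policy, which this sketch does not control.

```python
import os


def pin_to_cpu(cpu, pid=0):
    """Restrict process `pid` (0 = the caller) to one CPU.

    Returns the resulting affinity set. Linux-only: the os.sched_*
    functions are not available on every platform.
    """
    if not hasattr(os, "sched_setaffinity"):
        raise NotImplementedError("CPU affinity is not exposed on this platform")
    os.sched_setaffinity(pid, {cpu})
    return os.sched_getaffinity(pid)


if hasattr(os, "sched_getaffinity"):
    allowed = os.sched_getaffinity(0)      # CPUs we may currently run on
    print("pinned to:", pin_to_cpu(min(allowed)))
    os.sched_setaffinity(0, allowed)       # restore the original mask
```

MPI launchers typically do the equivalent for each rank, so that a rank keeps running next to the memory it first touched.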

IBM had a similar set of issues with Power4 and Power5. They sold
systems with only 1 cpu enabled to address customers who want the most
memory bandwidth. And the inter-cpu links were fast enough that
scaling was pretty good either way.

-- greg

John Savard
Guest
Posted: Wed Dec 08, 2004 5:01 am    Post subject: Re: Pretty good explanation of x86-64 by HP

On Mon, 6 Dec 2004 20:16:21 -0600, "del cecchi" <dcecchi.nojunk@att.net>
wrote, in part:

Quote:
What braindamaged newsreader are you using that won't let you right
click the link in the newsreader?

Clicking on the link in the newsreader, supposing I could do that, would
simply cause the link to open in a browser window. Which is exactly what
I achieved by cutting and pasting.

Maybe some newsreaders do allow right-clicking links. Such newsreaders
would probably also do dangerous and reckless things like rendering HTML
posts instead of displaying them in all their <angle bracket> glory.

This could result in having a brain-damaged computer, were I to view the
wrong post by accident.

As the posting in question was a text posting, this means that the
newsreader would have to guess at what constituted an URL, as well, with
no doubt occasional hilarious results.

John Savard
http://home.ecn.ab.ca/~jsavard/index.html

David Schwartz
Guest
Posted: Wed Dec 08, 2004 7:14 am    Post subject: Re: Pretty good explanation of x86-64 by HP

"Per Ekman" <pek@pdc.kth.se> wrote in message
news:mjewtvuh6u0.fsf@curlew.pdc.kth.se...

Quote:
Yousuf Khan <bbbl67@ezrs.com> writes:

Actually, there was a story here not so long ago where one of the Linux
distros had been optimized up with NUMA assumptions, and it actually ran
/slower/ than a non-NUMA kernel. In other words the Linux kernel might
have spent more time making complex decisions about memory placement
than it was actually going to save from the latencies.

And the conclusion was that a multi-CPU Opteron system must then be
UMA, rather than that the NUMA "optimizations" were crap?

There is a cost to treating memory as NUMA. The benefit you get in
exchange for that cost is dependent upon how NU the MA is. The point is that
MA on an Opteron system with 2 to 8 processors is so close to U that,
in most cases, it's effectively U.
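
"How NU the MA is" can be put into a rough model: average latency is a weighted mix of local and remote latency. The sketch below is an assumption-laden illustration, not a measurement; the 100 ns and 140 ns figures and the function names are hypothetical.

```python
def average_latency_ns(local_ns, remote_ns, remote_fraction):
    """Mean access latency when remote_fraction of accesses go off-node."""
    return (1.0 - remote_fraction) * local_ns + remote_fraction * remote_ns


def numa_penalty(local_ns, remote_ns, remote_fraction):
    """Relative slowdown versus a run where every access is local."""
    return average_latency_ns(local_ns, remote_ns, remote_fraction) / local_ns - 1.0


# Hypothetical latencies: 100 ns local, 140 ns one hop away.
# Two sockets with placement ignored: about half the accesses are remote.
print(f"NUMA-unaware: {numa_penalty(100.0, 140.0, 0.5):.0%} slower than all-local")
print(f"NUMA-aware:   {numa_penalty(100.0, 140.0, 0.0):.0%} slower than all-local")
```

The smaller the gap between remote_ns and local_ns, the less there is to win, and the easier it is for the bookkeeping cost of NUMA-aware placement to exceed the gain.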

The scaling advantage comes largely from the architecture of a single
processor. The memory controller is on the chip. The main reason this
matters is that it means that local memory accesses don't have to contend
with any other inter-CPU or I/O traffic. The other advantage comes from the
number of HT interfaces. Corresponding Intel CPUs have only a single FSB
over which all traffic must flow.

Above 8 processors, things get much more complicated. But it doesn't
seem like there's much of a (mainstream commercial) market at that scaling
level yet.

DS

Tony Hill
Guest
Posted: Wed Dec 08, 2004 10:19 am    Post subject: Re: Pretty good explanation of x86-64 by HP

On 06 Dec 2004 14:12:20 +0100, Per Ekman <pek@pdc.kth.se> wrote:

Quote:
Tony Hill <hilla_nospam_20@yahoo.ca> writes:

It does, but the difference is small, usually less than 10% and often
much closer to 0%.

And sometimes 50%...

Sure, there will be extreme cases in everything.

Quote:
Most users don't use their computer to run STREAM though. Even in the
HPC community where memory bandwidth is king, STREAM is still a rather
extreme case.

I admit I'm from the HPC-sector and memory bandwidth is very important
to many applications here.

One thing that you need to keep in mind is that you represent a VERY
small minority here in terms of PC server sales. Just because it
matters to your application probably doesn't have much reference to
the bulk of the buying public, and it almost certainly isn't going to
have implications for what the marketing people write in the trade
rags.

Quote:
Besides, they do recognize that it is NUMA, just that they are saying
you don't NEED to worry about that if you don't want to because for
the vast majority of times the performance difference is lost in the
noise.

It's a pretty strange argument in my eyes, "If you ignore the
applications that run poorly because of property X, then it makes
sense to downplay property X." True, but not helpful if you have such
an application.

Ahh, but it's VERY helpful if you're in the marketing department! :>

In the end, the people that are going to take a performance hit due to
lack of NUMA optimizations probably already know as much and have
factored it into their buying decisions. The people who are talking
to Dell or HPaq's server sales and are thinking about an Opteron
system but are worried that this here NoooMah thingy might cause their
application to run slow most likely don't have to worry about much.
Hence SUMO.

It's all a matter of perspective.

-------------
Tony Hill
hilla <underscore> 20 <at> yahoo <dot> ca

George Macdonald
Guest
Posted: Wed Dec 08, 2004 6:09 pm    Post subject: Re: Pretty good explanation of x86-64 by HP

On Wed, 08 Dec 2004 00:19:59 -0500, Tony Hill <hilla_nospam_20@yahoo.ca>
wrote:

Quote:
On 06 Dec 2004 14:12:20 +0100, Per Ekman <pek@pdc.kth.se> wrote:

Tony Hill <hilla_nospam_20@yahoo.ca> writes:

It does, but the difference is small, usually less than 10% and often
much closer to 0%.

And sometimes 50%...

Sure, there will be extreme cases in everything.

Most users don't use their computer to run STREAM though. Even in the
HPC community where memory bandwidth is king, STREAM is still a rather
extreme case.

I admit I'm from the HPC-sector and memory bandwidth is very important
to many applications here.

One thing that you need to keep in mind is that you represent a VERY
small minority here in terms of PC server sales. Just because it
matters to your application probably doesn't have much reference to
the bulk of the buying public, and it almost certainly isn't going to
have implications for what the marketing people write in the trade
rags.

I think you're underestimating the size of the "workstation" market, which
will include people finding they can migrate down to PC-grade CPUs to
replace old "higher power" systems as well as people on the lower-end
fringe who may have grown their problem complexity beyond a uni-PC, or who
*could* get by with a fastish PC but like the comfort of the move up to
dual for future growth. Add them to the current established base of CAD,
engineering and modeling etc. applications and there is a decent sized
market.

There are a lot of mathematical/engineering problems out there which are
just part of everyday business computing - many *used* to be considered HPC
and are now quite routine on desktop sized boxes. In many cases,
proprietary (purchased) software is used and the algorithmic methods are
only understood fairly superficially by the user; what that user wants is
response, whether it's measured in minutes, hours or a day or more. The
software vendor thus feels responsible for supplying the best combination
of software and recommended hardware selection.

Rgds, George Macdonald

"Just because they're paranoid doesn't mean you're not psychotic" - Who, me??

keith
Guest
Posted: Thu Dec 09, 2004 9:43 am    Post subject: Re: Pretty good explanation of x86-64 by HP

On Tue, 07 Dec 2004 09:56:44 -0800, Greg Lindahl wrote:

Quote:
In article <pan.2004.12.07.01.37.06.417847@att.bizzzz>,
keith <krw@att.bizzzz> wrote:

Note that the STREAM bandwidth and lmbench latency change with every
CPU speed bump. So clearly part of the memory controller is at the cpu
core frequency, or a related frequency, and not at the HT frequency,
or the SDRAM external bus frequency.

That does *not* mean that the memory controller runs at the core speed.
It would be nuts to assume such. Would you assume the caches of the
PII run at the I/O bus speed?

"or a related frequency", i.e. based on the cpu frequency with a
constant divider.

Ok, how many "unrelated frequencies" are there in a CPU? Let's get real
here.

Quote:
Please reduce the cross-post. Followups set to a group I read.

Isn't this a rather egotistical statement?

No, it follows Usenet tradition: post only to groups that you read.

No, that is *not* Usenet tradition. The tradition is to limit
cross-postings to on-topic newsgroups. Cross-posting is not expensive
(unless you have a dran-bamaged newsreader).

Quote:
But thanks for giving me the benefit of the doubt.

Cutting off your audience, particularly those who *you* have responded to
is rude. Sorry if I've ruffled your feathers!

--
Keith

Bernd Paysan
Guest
Posted: Thu Dec 09, 2004 3:15 pm    Post subject: Re: Pretty good explanation of x86-64 by HP

David Schwartz wrote:
Quote:
The scaling advantage comes largely from the architecture of a single
processor. The memory controller is on the chip. The main reason this
matters is that it means that local memory accesses don't have to contend
with any other inter-CPU or I/O traffic.

That's only partly true. The Opterons still talk to each other even on local
accesses (coherency tokens only, no real data transfer). This takes both
time and adds to the traffic, since such a token needs to get everywhere.

What's missing here is an "exclusive" bit in the page table, for non-coherent
pages. The OS pretty well knows (or can know) which core is accessing a
page, and for a page that's not shared, the coherency token is not
necessary.
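
A back-of-the-envelope sketch of the traffic involved, assuming a broadcast scheme in which every coherent miss sends one probe to each other node. The "exclusive" bit is the hypothetical feature proposed above, modelled here as a count of misses that need no probes; all names and numbers are illustrative.

```python
def probe_messages(misses, private_misses, nodes):
    """Probes sent when each coherent miss is broadcast to all other nodes.

    private_misses: misses to pages flagged non-coherent by the
    hypothetical "exclusive" page-table bit; these send no probes.
    """
    if not 0 <= private_misses <= misses:
        raise ValueError("private_misses must be between 0 and misses")
    coherent_misses = misses - private_misses
    return coherent_misses * (nodes - 1)


# Illustrative workload: one million misses on a 4-node system.
print(probe_messages(1_000_000, 0, nodes=4))        # no exclusive bit
print(probe_messages(1_000_000, 900_000, nodes=4))  # 90% of misses private
```

The probe count grows with the number of nodes, which is why the saving would matter more on larger systems.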

--
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://www.jwdt.com/~paysan/

achish777@cox.net
Guest
Posted: Fri Dec 10, 2004 10:47 am    Post subject: Re: Pretty good explanation of x86-64 by HP

Greg Lindahl wrote:
Quote:
Right. AMD has a roadmap for HT to address this issue.

I recently listened to Fred Weber (CTO of AMD) present at the Lehman
Brothers 2004 T4 conference. When he was speaking about AMD's future
direction he mentioned HyperTransport 3 and said it would be 5
Gigatransfers/second and higher. It sounds like a lot, but I'm still doing
my research to find out what exactly a "Gigatransfer" is :).

P.S. I noticed that the new HTX standard is specced at 1.8
Gigatransfers/sec. With that number and your new InfiniPath adapter,
PathScale showed some impressive MPI latency numbers; it seems it's only
going to get much better with HT 3.

Regards,
Garius

Del Cecchi
Guest
Posted: Fri Dec 10, 2004 7:25 pm    Post subject: Re: Pretty good explanation of x86-64 by HP

achish777@cox.net wrote:
Quote:
Greg Lindahl wrote:

Right. AMD has a roadmap for HT to address this issue.


I recently listened to Fred Weber (CTO of AMD) present at the Lehman
Brothers 2004 T4 conference. When he was speaking about AMD's future
direction he mentioned HyperTransport 3 and said it would be 5
Gigatransfers/second and higher. It sounds like a lot, but I'm still doing
my research to find out what exactly a "Gigatransfer" is :).

P.S. I noticed that the new HTX standard is specced at 1.8
Gigatransfers/sec. With that number and your new InfiniPath adapter,
PathScale showed some impressive MPI latency numbers; it seems it's only
going to get much better with HT 3.

Regards,
Garius


A Gigatransfer/s is 10**9 bits per second per pin or pin pair. It removes
the ambiguity when discussing links whose width is variable, like HT and
many others.

HT has released specifications for transfer rates to 2.4 GT/s.
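
Since a GT/s figure is per pin (or pin pair), link bandwidth follows once the width is known. A minimal sketch, with an illustrative function name; the widths listed are the HT link widths as far as I know, and the result is per direction:

```python
def link_bandwidth_gb_s(giga_transfers_per_s, width_bits):
    """Peak bandwidth of one link direction in GB/s (10**9 bytes/s).

    Each transfer moves one bit per pin (or differential pair), so a
    link width_bits wide moves width_bits / 8 bytes per transfer.
    """
    return giga_transfers_per_s * width_bits / 8


# The same 2.4 GT/s rate gives very different bandwidths at different widths.
for width_bits in (2, 4, 8, 16, 32):
    gb_s = link_bandwidth_gb_s(2.4, width_bits)
    print(f"{width_bits:2d}-bit link at 2.4 GT/s: {gb_s:.1f} GB/s per direction")
```

This is why quoting GT/s is unambiguous where quoting GB/s is not: the rate stays the same while the width, and so the bandwidth, varies from link to link.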

Keith R. Williams
Guest
Posted: Fri Dec 10, 2004 7:40 pm    Post subject: Re: Pretty good explanation of x86-64 by HP

In article <31tpvbF390ri8U1@individual.net>, cecchinospam@us.ibm.com
says...
Quote:
achish777@cox.net wrote:
Greg Lindahl wrote:

Right. AMD has a roadmap for HT to address this issue.


I recently listened to Fred Weber (CTO of AMD) present at the Lehman
Brothers 2004 T4 conference. When he was speaking about AMD's future
direction he mentioned HyperTransport 3 and said it would be 5
Gigatransfers/second and higher. It sounds like a lot, but I'm still doing
my research to find out what exactly a "Gigatransfer" is :).

P.S. I noticed that the new HTX standard is specced at 1.8
Gigatransfers/sec. With that number and your new InfiniPath adapter,
PathScale showed some impressive MPI latency numbers; it seems it's only
going to get much better with HT 3.

Regards,
Garius


A Gigatransfer/s is 10**9 bits per second per pin or pin pair. It removes
the ambiguity when discussing links whose width is variable, like HT and
many others.

It also eliminates the ambiguity of MHz for DDR (QDR, etc.) transfers.
Quote:

HT has released specifications for transfer rates to 2.4 GT/s.

--

Keith

All times are GMT