Running out of speed on PC-based systems
A.G.McDowell
Posted: Fri Feb 04, 2005 2:51 am    Post subject: Running out of speed on PC-based systems

I hope that the following real-life situation will provide an example of
multithreading and demanding applications, as well as eliciting some
advice for me.

I have been using VTune to poke around a large 'real-time' simulation
system (that is, simulated time = elapsed time to within about 1/10th of
a second) that is closer to its contractual performance margins than we
would like. The original performance estimates assumed a growth in the
performance of commodity PCs that has not come to pass. The simulation
framework makes it hard (but not impossible) to parallelise the
application within a single PC, but multithreading and multiple shared-
memory cpus may not be the answer. Adding new threads that model other
parts of the system slows down the bottleneck thread even though we
appear to have plenty of available cpu parallelism (a 2-cpu Xeon gives
us 4 virtual cpus, and we appear to have enough total work for about 1.6
of them). VTune shows that cycles per instruction in the bottleneck
thread goes up by 30% in these circumstances, so our theory is that we
are running out of memory bandwidth. Off-loading some work to a separate
machine reclaims some of that 30% performance loss, even though there is
then substantial TCP traffic from one to the other, but further
increases in model complexity start hitting the bottleneck thread again.
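
As a rough sanity check on the 30% figure: if the bottleneck thread executes the same number of instructions at the same clock rate, its run time scales with cycles per instruction, so a 30% rise in CPI costs roughly 23% of its throughput. A minimal Python sketch of that arithmetic follows; the function name and the baseline CPI of 1.0 are illustrative assumptions, not VTune measurements.

```python
def relative_throughput(cpi_before: float, cpi_after: float) -> float:
    """Throughput ratio (after / before) for a fixed instruction count and clock."""
    if cpi_before <= 0 or cpi_after <= 0:
        raise ValueError("CPI values must be positive")
    return cpi_before / cpi_after


baseline_cpi = 1.0                 # hypothetical baseline, not a measured value
loaded_cpi = baseline_cpi * 1.30   # the 30% increase reported in the post

ratio = relative_throughput(baseline_cpi, loaded_cpi)
print(f"Bottleneck thread runs at {ratio:.1%} of its original throughput")
```

Only the ratio of the two CPI values matters here, so the choice of baseline does not change the result.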

The bottleneck thread is simulating a Sparc relative, using software not
under our control. It is responsible for anything from 60% to 90% of the cpu
consumption, depending on the scope of the model, and what it is being
asked to do. FWIW, the idle loop of the simulated software runs a memory
cleaning task, working its way systematically through the simulated
memory to guard against single-bit errors, and the simulator is more
than just a software Sparc: it is capable of producing a variety of
error and failure conditions on demand.

We are currently running this on a top-line Dell server. I think this
means 2 physical Xeon chips at 3.6GHz with an 800MHz FSB and 400MHz
memory.
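
For scale, the theoretical peak rates implied by those figures can be estimated as transfers per second times bus width. The sketch below is back-of-the-envelope only. It assumes a 64-bit (8-byte) front-side bus shared by both processors and two 64-bit memory channels, none of which is stated above, and sustained rates on real hardware are lower than these peaks.

```python
def peak_bandwidth_gb_s(transfers_per_second: float,
                        bus_width_bytes: int,
                        channels: int = 1) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return transfers_per_second * bus_width_bytes * channels / 1e9


# Assumption: 800 MHz FSB means 800 million transfers/s on an 8-byte-wide bus.
fsb_peak = peak_bandwidth_gb_s(800e6, 8)

# Assumption: 400 MHz memory means 400 million transfers/s per channel,
# with two 8-byte channels populated.
memory_peak = peak_bandwidth_gb_s(400e6, 8, channels=2)

# Assumption: both physical processors share the one front-side bus.
per_cpu_share = fsb_peak / 2

print(f"FSB peak:      {fsb_peak:.1f} GB/s")
print(f"Memory peak:   {memory_peak:.1f} GB/s")
print(f"Per-CPU share: {per_cpu_share:.1f} GB/s when both are busy")
```

Under these assumptions, each processor's share of the bus halves whenever the other one is also streaming from memory, which would be consistent with the slowdown seen when extra threads are added.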

Is it plausible that such a system would be bottlenecking on memory
bandwidth, rather than cpu?

Is memory bandwidth running ahead of or behind cpu performance?

Only a small number of such systems will run, so there is a good
argument for throwing money at hardware, rather than software. Are there
niche PC makers out there that could give us a 50-100% increase - no
doubt for a price? We are not happy about overclocking, but do there
exist fast memory subsystems that nevertheless stay within the
manufacturer's recommended operating conditions?
--
A.G.McDowell
Greg Lindahl
Posted: Fri Feb 04, 2005 3:28 am    Post subject: Re: Running out of speed on PC-based systems

In article <JAzicKApzpACFwCy@mcdowella.demon.co.uk>,
A.G.McDowell <mcdowella@mcdowella.demon.co.uk> wrote:

Quote:
We are currently running this on a top-line Dell server. I think this
means 2 physical Xeon chips at 3.6GHz with an 800MHz FSB and 400MHz
memory.

It would be worth your while to go look at it running on an
Opteron-based machine, which has considerably faster memory, as well
as much higher bandwidth to memory on SMP systems, if the OS can place
your pages correctly.

Quote:
Is it plausible that such a system would be bottlenecking on memory
bandwidth, rather than cpu?

Yes, or memory latency.
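
One way to see how latency alone can be the limit: by Little's law, the bandwidth a single thread can pull is bounded by the number of cache misses it keeps in flight, times the line size, divided by the miss latency. The sketch below uses purely hypothetical numbers (64-byte lines, 100 ns latency) to illustrate the relation; they are not measurements of any machine in this thread.

```python
def latency_bound_bandwidth_gb_s(outstanding_misses: int,
                                 line_bytes: int,
                                 latency_ns: float) -> float:
    """Upper bound on per-thread bandwidth from Little's law, in GB/s."""
    bytes_in_flight = outstanding_misses * line_bytes
    # Bytes per nanosecond is numerically equal to GB/s (1 GB = 1e9 bytes).
    return bytes_in_flight / latency_ns


LINE_BYTES = 64      # hypothetical cache line size
LATENCY_NS = 100.0   # hypothetical miss latency

# Dependent loads (each address comes from the previous load): 1 miss in flight.
dependent = latency_bound_bandwidth_gb_s(1, LINE_BYTES, LATENCY_NS)

# Independent loads that the hardware can overlap: say 8 misses in flight.
overlapped = latency_bound_bandwidth_gb_s(8, LINE_BYTES, LATENCY_NS)

print(f"1 miss in flight:   {dependent:.2f} GB/s")
print(f"8 misses in flight: {overlapped:.2f} GB/s")
```

With these numbers, a thread dominated by dependent loads stalls on memory while using only a small fraction of the bus, so a rising CPI does not by itself show that bandwidth is exhausted.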

Quote:
Is memory bandwidth running ahead of or behind cpu performance?

Behind, generally. Ditto for memory latency. This is why it's called
"the memory wall".

-- greg
All times are GMT