Auto Parallelization
Joachim Worringen
Posted: Fri Dec 23, 2005 9:15 am    Post subject: Re: Auto Parallelization

Greg Lindahl wrote:
Quote:
In article <40psofF1bnqiqU2@individual.net>,
Jan Vorbrüggen <jvorbrueggen-not@mediasec.de> wrote:


and I haven't heard yet that
profile-based feedback is used to drive parallelization, although that
might have happened before to some degree.

It's a standard feature -- the main feedback is whether or not a loop
has enough work to make parallelization worthwhile.
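
A minimal sketch of the "enough work" test, assuming a hypothetical profile record and a deliberately simple cost model (serial work split evenly across threads, plus a fixed fork/join overhead). The names `LoopProfile`, `worth_parallelizing` and `startup_cycles` are illustrative and do not come from any particular compiler:

```python
from dataclasses import dataclass


@dataclass
class LoopProfile:
    """Hypothetical per-loop record gathered during a profiling run."""
    name: str
    avg_trip_count: float        # average iterations per loop entry
    cycles_per_iteration: float  # measured cost of one iteration


def worth_parallelizing(profile, threads, startup_cycles):
    """Return True if the estimated parallel time beats the serial time.

    Assumed cost model: the serial work divides evenly among the threads,
    and starting/joining them costs a fixed number of cycles.
    """
    serial = profile.avg_trip_count * profile.cycles_per_iteration
    parallel = serial / threads + startup_cycles
    return parallel < serial


hot = LoopProfile("stencil", avg_trip_count=100_000, cycles_per_iteration=20)
tiny = LoopProfile("init", avg_trip_count=8, cycles_per_iteration=5)

print(worth_parallelizing(hot, threads=4, startup_cycles=10_000))   # True
print(worth_parallelizing(tiny, threads=4, startup_cycles=10_000))  # False
```

The short loop loses because the fixed startup cost dwarfs its 40 cycles of work, which is the kind of information static analysis alone often cannot supply.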

It's a little bit different in this case, as the compiler looks not only
at loops but also at arbitrary branches. Supported by the hardware, it
executes multiple branches in parallel and stores the respective data in
private memory buffers. It's a sort of extreme speculative execution.
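
A rough software analogy of the idea, as a sketch only: both arms of a branch run at the same time, each writing into its own private buffer, and only the buffer of the arm that turns out to be correct is committed to shared memory. The function and variable names are made up for illustration, and Python threads stand in for hardware cores. Real hardware would presumably cancel the losing thread early; this sketch simply waits for it and discards its buffer.

```python
from concurrent.futures import ThreadPoolExecutor


def speculate(memory, condition, taken_arm, fallthrough_arm):
    """Run both arms of a branch at once, then commit only the correct one."""
    taken_buf, fall_buf = {}, {}  # private write buffers, one per speculative thread
    with ThreadPoolExecutor(max_workers=2) as pool:
        taken_job = pool.submit(taken_arm, memory, taken_buf)
        fall_job = pool.submit(fallthrough_arm, memory, fall_buf)
        branch_taken = condition(memory)  # resolve the branch in the meantime
        taken_job.result()
        fall_job.result()
    # Commit the winner's writes; the loser's buffer is simply dropped.
    memory.update(taken_buf if branch_taken else fall_buf)
    return branch_taken


def double_x(mem, buf):
    buf["y"] = mem["x"] * 2


def mark_small(mem, buf):
    buf["y"] = -1
    buf["z"] = 0


memory = {"x": 5}
took = speculate(memory, lambda m: m["x"] > 3, double_x, mark_small)
print(took, memory)  # True {'x': 5, 'y': 10}
```

Because the arms only read shared memory and write to private buffers, the discarded arm leaves no trace (note that `z` never appears).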

However, this development mostly targets embedded, mobile systems, and
not HPC. The previous generation of this multi-core hardware will appear
in cell phones next year, IIRC.

Joachim
Joachim Worringen
Posted: Fri Dec 23, 2005 9:15 am    Post subject: Re: Auto Parallelization

Paul Gotch wrote:
Quote:
I suspect it's not quite that. NEC have been researching coarse-grained
speculative threading for years. The idea is that you speculatively
execute ahead of branches on both execution paths, then cancel the incorrect
thread as soon as you know whether the branch was actually taken or not.

NEC were looking at implementing an existing architecture using this
technique, but they evidently came to the conclusion that they could get
better performance with some compiler support, probably by adding hinting
instructions, in the same way that you can get better usage of caches with
carefully used prefetch instructions, which architecturally are just NOPs.

I suspect NEC have combined this with conventional feedback-directed
optimisation, such that you don't execute down both paths of all branches,
only ones you can't predict with high certainty, and that they also do
conventional autovectorisation of some loops to vector units.
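
The "only ones you can't predict with high certainty" part could be sketched as a filter over profile counts. This is a guess at the general shape, not NEC's method; the profile layout, the `certainty` threshold and the branch names are all assumptions for illustration:

```python
def branches_to_speculate(profile, certainty=0.9):
    """Pick branches whose outcome is too uncertain to rely on prediction.

    `profile` maps a branch name to (taken_count, not_taken_count) from a
    profiling run. A branch is selected when its more common outcome occurs
    in less than `certainty` of the executions.
    """
    selected = []
    for name, (taken, not_taken) in profile.items():
        total = taken + not_taken
        if total == 0:
            continue  # never executed in the profile: nothing to learn
        bias = max(taken, not_taken) / total
        if bias < certainty:
            selected.append(name)
    return selected


profile = {
    "loop_exit": (990, 10),  # almost always taken: leave to the predictor
    "parity": (520, 480),    # close to a coin flip: worth running both paths
    "cold": (0, 0),          # not reached during profiling
}
print(branches_to_speculate(profile))  # ['parity']
```

Strongly biased branches stay with ordinary prediction, so the extra threads are spent only where a misprediction is likely.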

Fully correct, apart from the fact that this technique is not (yet)
applied to SX vector CPUs, but to multi-core CPUs for embedded
applications. Probably a bigger market...

Joachim
All times are GMT