harish b
Guest
|
Posted:
Sun Nov 28, 2004 11:29 am Post subject:
How code runs in a CMP? |
|
|
hi
i have a basic question on how a binary code runs in a CMP (dual
core) machine.
If I write a single-threaded application which has a lot of fine-grained
parallelism, how is the load shared between the 2 cores?
What's the role of the compiler and OS in this?
thanks..
hari |
Patrick Schaaf
Guest
|
Posted:
Sun Nov 28, 2004 12:01 pm Post subject:
Re: How code runs in a CMP? |
|
|
b__harish@hotmail.com (harish b) writes:
| Quote: | i have a basic question on how a binary code runs in a CMP (dual
core) machine.
If I write a single thread application which has large fine grained
parallelism, how is the load shared between the 2 cores?
|
Not at all.
For all CMP I've read about, you see exactly the same thing as for SMP:
multiple CPUs coherently sharing memory.
| Quote: | Whats the role of the compiler and OS in this.
|
A suitably bright compiler could possibly automatically turn your
single threaded application into multiple threads, which could
help a lot, or nothing, depending on whether that would really
work out in practice for your code.
best regards
Patrick |
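To make the "suitably bright compiler" point concrete, here is a minimal sketch of the kind of question an auto-parallelizer has to answer: do loop iterations depend on each other? The function names (`scale_all`, `running_total`, `split_in_two`) are hypothetical and only for illustration; a real compiler reasons about this at the instruction and memory level, not on Python lists.

```python
def scale_all(xs, k):
    """Independent iterations: each output depends only on its own input."""
    return [k * x for x in xs]


def running_total(xs):
    """Loop-carried dependence: each output depends on the previous one."""
    out, acc = [], 0
    for x in xs:
        acc += x
        out.append(acc)
    return out


def split_in_two(fn, xs):
    """Naively run fn on each half separately and glue the results together,
    as if each half had been handed to a different core."""
    mid = len(xs) // 2
    return fn(xs[:mid]) + fn(xs[mid:])


data = [1, 2, 3, 4, 5, 6]

# Safe to split: the two halves never need each other's results.
split_scale_ok = split_in_two(lambda half: scale_all(half, 10), data) == scale_all(data, 10)

# Not safe to split naively: the second half restarts its total from 0.
split_total_ok = split_in_two(running_total, data) == running_total(data)
```

The first loop could in principle be divided between two cores without the programmer doing anything; the second would need restructuring first, which is why automatic threading may "help a lot, or nothing" depending on the code.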
Mitch Alsup
Guest
|
Posted:
Mon Nov 29, 2004 12:12 am Post subject:
Re: How code runs in a CMP? |
|
|
b__harish@hotmail.com (harish b) wrote in message news:<b2c56ede.0411272229.22c303d@posting.google.com>...
| Quote: | hi
i have a basic question on how a binary code runs in a CMP (dual
core) machine.
If I write a single thread application which has large fine grained
parallelism, how is the load shared between the 2 cores?
|
That application runs on one CPU. However, the rest of the OS, drivers,
and other background activities can run on the other CPU, so you may notice
better hand/eye coordination on the CMP.
| Quote: | Whats the role of the compiler and OS in this.
|
All modern OSs run in multitasking/multiprogramming style. The compiler
(and run-time libraries) only get involved if/when the application makes
multithreading/multiprogramming library calls.
Mitch |
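As a rough sketch of what "calling the multithreading library" looks like from the application side, the example below splits a sum across two explicitly created threads. The name `parallel_sum` and the chunking scheme are assumptions made for illustration. One caveat: in CPython the global interpreter lock keeps pure-Python threads from running bytecode on two cores at once, so this shows the structure of explicit decomposition rather than a guaranteed speedup; the same shape in C with pthreads would let the OS schedule each thread on its own core.

```python
import threading


def parallel_sum(values, workers=2):
    """Sum values by giving each worker thread its own contiguous chunk."""
    values = list(values)
    chunk = (len(values) + workers - 1) // workers  # ceiling division
    partial = [0] * workers                         # one result slot per thread

    def work(idx):
        start = idx * chunk
        partial[idx] = sum(values[start:start + chunk])

    threads = [threading.Thread(target=work, args=(i,)) for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every worker before combining results
    return sum(partial)
```

Without a decomposition like this (written by hand or generated by a compiler), the OS has only one thread to schedule, and the second core can only pick up other processes.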
Randy
Guest
|
Posted:
Mon Nov 29, 2004 11:24 pm Post subject:
Re: How code runs in a CMP? |
|
|
harish b wrote:
| Quote: | hi
i have a basic question on how a binary code runs in a CMP (dual
core) machine.
If I write a single thread application which has large fine grained
parallelism, how is the load shared between the 2 cores?
Whats the role of the compiler and OS in this.
thanks..
hari
|
AFAIK, there are two kinds of multicore CPUs -- Type I, those that process a single
instruction stream, and Type II, those that process a different instruction
stream for each core. (Type I and Type II are nonstandard designations that I'm
inventing only for pedagogy.)
Within most multicore CPUs (Type II), each core in the CPU processes a separate
instruction stream, usually with each arising from a different O/S process. In
short, each core has its own instruction decoder. This type of CPU may as well
be implemented as a dual CPU SMP on a single motherboard. In fact, most Type II
multicore CPUs seem to have been developed only as a way to reduce the cost of
producing a dual CPU node in a server or cluster. The processor technology is
not meaningfully different from what's been done for years.
However, some multicore CPUs (Type I) can execute a single instruction stream
using more than one core. The IBM POWER5 and the HP 9000 (if that's what it was
finally called) actually split a single instruction stream into independent
operations and execute them concurrently on different cores (and their
functional units). In short, both cores share a single instruction decoder. I
assume a single shared front end decodes and schedules the instruction stream to
drive the back end of both cores (separate pipelines, I think). This is *not*
the same technology that's been done for years, and it's more interesting (to me
anyway) for that reason.
Both multicore architectures will run multitasking O/Ses. Generally speaking, a
Type II multicore splits its work along O/S process boundaries,
running a different process on each core. When a process is composed of
threads, one of its threads may be scheduled onto each core, which is the only
way that Type II multicores can concurrently execute a single process.
To complicate the multicore topic further, some single core CPUs also employ
hyperthreading (AKA Simultaneous Subordinate Microthreading or SSMT), which is
somewhat similar to what Type I multicores do -- splitting up the instruction
stream to run concurrently on the CPU's resources. In the case of
hyperthreading, the intent is either to execute a source code conditional's
alternative instruction path, or to allow a second (sub)stream of instructions
to execute while an earlier substream is blocked and waiting for a resource to
become available (e.g. memory, network, or a peripheral device).
Do any Type I multicore CPUs also do hyperthreading? I don't know, but I doubt
it. I suspect it would be redundant, since they're already doing something
similar.
Randy
--
Randy Crawford http://www.ruf.rice.edu/~rand rand AT rice DOT edu |
Daniel Gustafsson
Guest
|
Posted:
Tue Nov 30, 2004 12:41 am Post subject:
Re: How code runs in a CMP? |
|
|
"Randy" <joe@burgershack.com> wrote in message
news:cofpgf$q2r$1@joe.rice.edu...
I don't know if you are subtly trying to plant a hoax in the original
poster's possible homework/assignment or if you are confused about how POWER5
works yourself? ;)
/Daniel |
Randy
Guest
|
Posted:
Tue Nov 30, 2004 2:47 am Post subject:
Re: How code runs in a CMP? |
|
|
Daniel Gustafsson wrote:
| Quote: | "Randy" <joe@burgershack.com> wrote in message
news:cofpgf$q2r$1@joe.rice.edu...
harish b wrote:
....
I don't know if you are subtly trying to plant a hoax in the original
posters possible homework/assignment or if you are confused on how POWER5
works yourself ? ;)
/Daniel
|
Feel free to elucidate.
Randy
--
Randy Crawford http://www.ruf.rice.edu/~rand rand AT rice DOT edu |
Daniel Gustafsson
Guest
|
Posted:
Tue Nov 30, 2004 3:17 am Post subject:
Re: How code runs in a CMP? |
|
|
"Randy" <joe@burgershack.com> wrote in message
news:cog5ct$9ch$1@joe.rice.edu...
| Quote: | Daniel Gustafsson wrote:
"Randy" <joe@burgershack.com> wrote in message
news:cofpgf$q2r$1@joe.rice.edu...
harish b wrote:
...
I don't know if you are subtly trying to plant a hoax in the original
posters possible homework/assignment or if you are confused on how
POWER5
works yourself ? ;)
/Daniel
Feel free to elucidate.
|
I got a guilty conscience after sending my earlier post. I apologize if I
was harsh. I think others are better at explaining these topics, but it
looks a bit like you are mixing up Type I/Hyperthreading/SMT with out-of-order
execution. As for POWER5, it is not substantially different from the other
current dual-core CPUs (HP PA-8800, Sun UltraSPARC IV) in this regard; one
difference is that each core in POWER5 has 2-way SMT.
/Daniel |
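One way to see this from software is to ask the OS how many logical CPUs it exposes: a dual-core part with 2-way SMT per core would typically be presented as four schedulable CPUs, just like a conventional SMP box. The sketch below uses only the Python standard library; the helper names `logical_cpu_count` and `usable_cpus` are hypothetical, and `os.sched_getaffinity` is only available on some platforms (Linux, for example), hence the fallback.

```python
import os


def logical_cpu_count():
    """Logical CPUs reported by the OS (cores and SMT threads look alike here)."""
    return os.cpu_count() or 1  # cpu_count() may return None if undetermined


def usable_cpus():
    """Logical CPUs this process is allowed to run on, where the OS exposes it."""
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))  # 0 means "the calling process"
    return logical_cpu_count()
```

Neither number says whether two logical CPUs are separate cores or two SMT threads on one core; to the scheduler they are simply places to run a thread.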
Randy
Guest
|
Posted:
Tue Nov 30, 2004 4:32 am Post subject:
Re: How code runs in a CMP? |
|
|
Daniel Gustafsson wrote:
| Quote: | "Randy" <joe@burgershack.com> wrote in message
news:cog5ct$9ch$1@joe.rice.edu...
Daniel Gustafsson wrote:
....
I don't know if you are subtly trying to plant a hoax in the original
posters possible homework/assignment or if you are confused on how
POWER5 works yourself ? ;)
/Daniel
Feel free to elucidate.
I got a guilty conscience after sending my earlier post. I apologize if I
was harsh. I think others are better on explaining these topics, but it
looks a bit like you are mixing Type1/Hyperthreading/SMT with out-of-order
execution. About POWER5, it is not substantially different to the other
current dual-core CPUs (HP PA-8800, Sun UltraSparc IV) in this regard, one
difference is that each core in POWER5 has 2-way SMT.
/Daniel
|
Hmm. It was my impression that the POWER5 and HP 8800 were more than just a
repackaging which simply bonded together two of the prior generation CPUs
(like what's done to make the Sun UltraSPARC IV and next-gen Opteron).
BBBBBBBUUUUUUUUTTTTTTT...
Upon further investigation, it looks like I was wrong. All dual core CPUs seem
to be just a repackaging of two CPUs into one chip, without any real change
to the microarchitecture of either core (except perhaps to fiddle with caches or
support SMP), much less a novel centralized coordination of both. All threading
appears to be done either explicitly at the O/S level (via pthreads or a JVM) or
implicitly within a core (via hyperthreading or SSMT). Nothing fancier than that.
In the immortal words of Emily Litella, "Oh... Never mind."
Randy
--
Randy Crawford http://www.ruf.rice.edu/~rand rand AT rice DOT edu |
harish b
Guest
|
Posted:
Tue Nov 30, 2004 7:26 am Post subject:
Re: How code runs in a CMP? |
|
|
"Daniel Gustafsson" <daniel@mimer.se> wrote in message news:<aYKqd.122956$dP1.433647@newsc.telia.net>...
| Quote: | "Randy" <joe@burgershack.com> wrote in message
news:cofpgf$q2r$1@joe.rice.edu...
Do any Type I multicore CPUs also do hyperthreading? I don't know, but I
doubt it. I suspect it would be redundant, since they're already doing
something similar already.
|
- If 2 processors share a single decoder, and hence a single
instruction input queue, wouldn't it just be similar to superscalar?
- I looked it up: POWER5 is dual core and also implements SMT, aka
hyperthreading (with dynamic resource management).
It looks more like Type II than Type I to me.
- I would be interested to know if any design based on Type I exists.
| Quote: | Randy
--
Randy Crawford http://www.ruf.rice.edu/~rand rand AT rice DOT edu
I don't know if you are subtly trying to plant a hoax in the original
posters possible homework/assignment or if you are confused on how POWER5
works yourself ? ;)
/Daniel
|
I can assure you it is not a homework question :) |
Randy Crawford
Guest
|
Posted:
Tue Nov 30, 2004 12:52 pm Post subject:
Re: How code runs in a CMP? |
|
|
harish b wrote:
| Quote: | "Randy" <joe@burgershack.com> wrote in message
news:cofpgf$q2r$1@joe.rice.edu...
....
Do any Type I multicore CPUs also do hyperthreading? I don't know, but I
doubt it. I suspect it would be redundant, since they're already doing
something similar already.
- if 2 processors share a single decoder and hence a single
instruction input queue, wouldnt it be just similar to superscalar ?
- looked up to see power5 is a dual core and also implements SMT aka
hyperthreading (with dynamic resource mgmt).
looks type II than type 1 to me.
- would be interested to know if any design based on Type 1 exists?
|
First of all, ignore what I said before. I am not aware of any "Type I"
multicore CPUs as I described them. Frankly, that makes sense, given
the major changes to a microarchitecture that would be required to
introduce such a chip vs. what did happen -- the repackaging of a pair
of existing CPUs together into one chip.
Yes, a Type I CPU would be similar to superscalar, which is not
surprising since it would be composed of two superscalar RISC CPUs.
(Would that be called "Super-Duper-Scalar"?) In fact, it'd be somewhat
comparable to the custom SoC (System on Chip) processors out there (e.g.
Tensilica or ST Micro or LSI Logic), all of which are in-order and many
of which are VLIW (I think). But it'd be a LOT more sophisticated than
any SoC that I know. (Which admittedly isn't saying much.)
I just thought it would have been interesting to see such a thing
happen to an out-of-order RISC CPU, which I mistakenly thought had
happened. Que sera.
Randy
--
Randy Crawford http://www.ruf.rice.edu/~rand rand AT rice DOT edu |
Alex Colvin
Guest
|
Posted:
Wed Dec 01, 2004 1:32 pm Post subject:
Re: How code runs in a CMP? |
|
|
| Quote: | First of all, ignore what I said before. I am not aware of any "Type I"
multicore CPUs as I described them. Frankly, that makes sense, given
the major changes to a microarchitecture that would be required to
introduce such a chip vs. what did happen -- the repackaging of a pair
of existing CPUs together into one chip.
|
How about SIMD machines - a gang of processors all fed by a single
instruction stream (Illiac IV, Connection Machine 2). They were
a hot topic at one point, though the trend shifted to SPMD,
where they all have the same program, but aren't so tightly synchronized.
--
mac the naïf |
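The SIMD/SPMD distinction can be sketched in a few lines. In the SIMD-style version there is one instruction stream and every lane executes each step, so a branch turns into a mask and a select. In the SPMD-style version each worker runs the same program on its own slice, takes its own branches, and only meets the others at a barrier. The names `simd_abs` and `spmd_abs` are hypothetical, and Python lists and threads stand in for hardware lanes and processors; this is a model of the control structure, not of the performance.

```python
import threading


def simd_abs(lanes):
    """SIMD style: each step below is one operation applied to all lanes."""
    mask = [x < 0 for x in lanes]          # step 1: compare in every lane
    negated = [-x for x in lanes]          # step 2: negate in every lane
    # step 3: per-lane select driven by the mask (no per-lane jump)
    return [n if m else x for x, n, m in zip(lanes, negated, mask)]


def spmd_abs(data, workers=2):
    """SPMD style: same program per worker, independent control flow."""
    out = [None] * len(data)
    barrier = threading.Barrier(workers)

    def program(rank):
        # Each worker handles elements rank, rank + workers, ...
        for i in range(rank, len(data), workers):
            if data[i] < 0:                # each worker branches on its own
                out[i] = -data[i]
            else:
                out[i] = data[i]
        barrier.wait()                     # loose synchronization point

    threads = [threading.Thread(target=program, args=(r,)) for r in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out
```

Both produce the same result; the difference is whether the processors are driven in lockstep by one instruction stream or each follows its own copy of the program.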
David Kanter
Guest
|
Posted:
Thu Dec 02, 2004 2:52 am Post subject:
Re: How code runs in a CMP? |
|
|
Alex Colvin <alexc@TheWorld.com> wrote in message news:<cokh4l$ll8$1@pcls4.std.com>...
| Quote: | First of all, ignore what I said before. I am not aware of any "Type I"
multicore CPUs as I described them. Frankly, that makes sense, given
the major changes to a microarchitecture that would be required to
introduce such a chip vs. what did happen -- the repackaging of a pair
of existing CPUs together into one chip.
How about SIMD machines - a gang of processors all fed by a single
instruction stream (Illiac IV, Connection Machine 2). They were
a hot topic at one point, though the trend shifted to SPMD,
where they all have the same program, but aren't so tightly synchronized.
|
Have you read about multiscalar CPUs or trace processors? They are
coming back as research ideas and depending on how the mythical "cell"
processor works, they might be shipping commercially.
David |
Del Cecchi
Guest
|
Posted:
Thu Dec 02, 2004 3:04 am Post subject:
Re: How code runs in a CMP? |
|
|
"David Kanter" <dkanter@gmail.com> wrote in message
news:745d25e.0412011352.69c2c657@posting.google.com...
| Quote: | Alex Colvin <alexc@TheWorld.com> wrote in message
news:<cokh4l$ll8$1@pcls4.std.com>...
First of all, ignore what I said before. I am not aware of any "Type I"
multicore CPUs as I described them. Frankly, that makes sense, given
the major changes to a microarchitecture that would be required to
introduce such a chip vs. what did happen -- the repackaging of a pair
of existing CPUs together into one chip.
How about SIMD machines - a gang of processors all fed by a single
instruction stream (Illiac IV, Connection Machine 2). They were
a hot topic at one point, though the trend shifted to SPMD,
where they all have the same program, but aren't so tightly synchronized.
Have you read about multiscalar CPUs or trace processors? They are
coming back as research ideas and depending on how the mythical "cell"
processor works, they might be shipping commercially.
David
|
Mythical cell processor? What is that supposed to mean? Maybe you ought to
be in San Francisco on Feb 8.
del cecchi |