CAS and LL/SC (was Re: High Level Assembler for MVS & VM & V
Eric Smith
Posted: Wed Dec 15, 2004 4:30 am    Post subject: CAS and LL/SC (was Re: High Level Assembler for MVS & VM & V

Anne & Lynn Wheeler <lynn@garlic.com> writes:
Quote:
charlie invented compare-and-swap while working on fine-grain
locking for cp/67 360/67 smp at the science center
http://www.garlic.com/~lynn/subtopic.html#545tech

Interesting. Other than 360 and derivatives, some of the Motorola
68K processors (starting with MC68020), and later x86 processors, what
architectures have included CAS?

Who invented the Load Locked and Store Conditional instructions used
in the MIPS, Power, and Alpha architectures, and where else have they
been used?

Eric

[Originally posted to alt.folklore.computers and bit.listserv.ibm-main,
but seems relevant here. Edited slightly, as I realized after the
original posting that recent x86 processors also have the equivalent of
CAS.]

(John Mashey)
Posted: Wed Dec 15, 2004 4:40 am    Post subject: Re: CAS and LL/SC (was Re: High Level Assembler for MVS & VM

LL/SC: Livermore S-1.
Maybe there was something earlier that had it, but this is where I
thought it came from.

http://www.cs.clemson.edu/~mark/s1_alumni.html lists some of the
alumni, of which a bunch worked at MIPS at some point. I especially
recall Earl Killian being involved with this.

MIPS R2000 (MIPS-I) didn't have any synchronization ops on purpose,
because every mechanism we knew caused at least some customer to tell
us why it wasn't good enough :-) LL/SC was added in MIPS-II to get
minimal operations able to synthesize a lot of people's favorite ones,
and by then we felt we knew much better what people needed.
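
As a sketch of what "synthesize" can mean here: C has no direct LL/SC, so the example below uses C11's `atomic_compare_exchange_weak` as a stand-in. That function is allowed to fail spuriously, which lets compilers for LL/SC machines map it onto a load-locked/store-conditional pair. The helpers `fetch_and_add` and `test_and_set` are illustrative names only; real code would normally call `atomic_fetch_add` and `atomic_flag_test_and_set` directly.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative helper: fetch-and-add built from "load, then store only
   if nothing changed", retried until the conditional store succeeds. */
int fetch_and_add(atomic_int *p, int delta)
{
    assert(p != NULL);
    int old = atomic_load(p);
    /* On failure, old is refreshed with the current value, and the
       desired value old + delta is recomputed on the next attempt. */
    while (!atomic_compare_exchange_weak(p, &old, old + delta)) {
    }
    return old;
}

/* Illustrative helper: test-and-set built the same way.
   Returns true if the flag was already set. */
bool test_and_set(atomic_int *flag)
{
    assert(flag != NULL);
    int old = atomic_load(flag);
    while (!atomic_compare_exchange_weak(flag, &old, 1)) {
    }
    return old != 0;
}
```

Both helpers share one shape: read, compute, attempt the conditional update, retry on failure. Only the "compute" step differs between them.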

Nick Maclaren
Posted: Wed Dec 15, 2004 3:36 pm    Post subject: Re: CAS and LL/SC (was Re: High Level Assembler for MVS & VM

In article <1103068097.926843.101230@c13g2000cwb.googlegroups.com>,
(John Mashey) <old_systems_guy@yahoo.com> wrote:
Quote:

MIPS R2000 (MIPS-I) didn't have any synchronization ops on purpose,
because every mechanism we knew caused at least some customer to tell
us why it wasn't good enough :-) LL/SC was added in MIPS-II to get
minimal operations able to synthesize a lot of people's favorite ones,
and by then we felt we knew much better what people needed.

Interesting. The lack of such instructions effectively means that
shared-memory, parallel-thread applications are unsupported, but that
was not a serious issue then. Nor was the scalability impact on the
kernel.

What I can witness (for SGI, IBM and Sun, having not used HP SMP
systems) is that adding those operations is NOT enough to enable
practical use of highly parallel, shared-memory communication
applications (such as most OpenMP ones). There is also a need for more
hardware support of within-process, application-controllable thread
scheduling, because the cost of having to use the kernel scheduler is
too high.

Some of this could be resolved by improved operating system interfaces,
but it is unclear whether they would be good enough. There was a lot
of theoretical work on this some 25 years ago, but I don't know if
the topic is still active. It isn't easy, in a general computing
context.

A lot of the point here is that far too many of the people working
on shared memory application support (from the hardware up to the
application developers) seem to make the implicit assumption that
each thread has its own private CPU. Even a 0.1% chance of flow of
control being interrupted is important if it can lead to a 1,000x
slowdown.

When attempting performance analysis and tuning of OpenMP codes, I
feel like someone attempting to work out why a mechanical watch isn't
keeping time, armed only with standard domestic metalworking tools.


Regards,
Nick Maclaren.
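
To put rough numbers on the 0.1% / 1,000x remark above, here is a back-of-the-envelope sketch. It assumes each unit of work normally costs 1 time unit, is interrupted independently with probability p = 0.001, and costs s = 1000 units when that happens. The independence and the per-unit cost model are assumptions made for illustration, not something stated in the post.

```latex
\[
E[\text{cost}] = (1-p)\cdot 1 + p\cdot s
               = 0.999 + 0.001 \times 1000
               = 1.999
\]
```

Under those assumptions the rare interruption alone roughly doubles the average cost per unit of work.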

Seongbae Park
Posted: Wed Dec 15, 2004 6:30 pm    Post subject: Re: CAS and LL/SC (was Re: High Level Assembler for MVS & VM

Maciej W. Rozycki <macro@linux-mips.org> wrote:
Quote:
On Wed, 15 Dec 2004, Nick Maclaren wrote:

MIPS R2000 (MIPS-I) didn't have any synchronization ops on purpose,
because every mechanism we knew caused at least some customer to tell
us why it wasn't good enough :-) LL/SC was added in MIPS-II to get
minimal operations able to synthesize a lot of people's favorite ones,
and by then we felt we knew much better what people needed.

Interesting. The lack of such instructions effectively means that
shared-memory, parallel-thread applications are unsupported, but that
was not a serious issue then. Nor was the scalability impact on the
kernel.

Note that as long as you go UP this can be handled at the OS level. For
example Linux handles user mode RI traps on LL and SC when run on MIPS-I
processors and emulates the instructions. There's a considerable
performance loss, of course, but the semantics of these operations is
preserved.

Maciej

This doesn't make sense to me:
unless you stop all other threads between LL and SC and/or mask
interrupts, the semantics cannot be preserved.
How do you guarantee there will be no intervening store
to the same location from another processor?
--
#pragma ident "Seongbae Park, compiler, http://blogs.sun.com/seongbae/"

ando_san
Posted: Wed Dec 15, 2004 6:50 pm    Post subject: Re: CAS and LL/SC (was Re: High Level Assembler for MVS & VM

SPARC V9 has CAS too. Both Sun's and Fujitsu's SPARC V9 processors have it
implemented.

Ando_san

"Eric Smith" <eric-no-spam-for-me@brouhaha.com> wrote in message
news:qhwtvko3tq.fsf@ruckus.brouhaha.com...
Quote:
Anne & Lynn Wheeler <lynn@garlic.com> writes:
charlie invented compare-and-swap while working on fine-grain
locking for cp/67 360/67 smp at the science center
http://www.garlic.com/~lynn/subtopic.html#545tech

Interesting. Other than 360 and derivatives, some of the Motorola
68K processors (starting with MC68020), and later x86 processors, what
architectures have included CAS?

Who invented the Load Locked and Store Conditional instructions used
in the MIPS, Power, and Alpha architectures, and where else have they
been used?

Eric

[Originally posted to alt.folklore.computers and bit.listserv.ibm-main,
but seems relevant here. Edited slightly, as I realized after the
original posting that recent x86 processors also have the equivalent of
CAS.]

Maciej W. Rozycki
Posted: Wed Dec 15, 2004 10:43 pm    Post subject: Re: CAS and LL/SC (was Re: High Level Assembler for MVS & VM

On Wed, 15 Dec 2004, Nick Maclaren wrote:

Quote:
MIPS R2000 (MIPS-I) didn't have any synchronization ops on purpose,
because every mechanism we knew caused at least some customer to tell
us why it wasn't good enough :-) LL/SC was added in MIPS-II to get
minimal operations able to synthesize a lot of people's favorite ones,
and by then we felt we knew much better what people needed.

Interesting. The lack of such instructions effectively means that
shared-memory, parallel-thread applications are unsupported, but that
was not a serious issue then. Nor was the scalability impact on the
kernel.

Note that as long as you go UP this can be handled at the OS level. For
example Linux handles user mode RI traps on LL and SC when run on MIPS-I
processors and emulates the instructions. There's a considerable
performance loss, of course, but the semantics of these operations is
preserved.

Maciej

(John Mashey)
Posted: Wed Dec 15, 2004 11:06 pm    Post subject: Re: CAS and LL/SC (was Re: High Level Assembler for MVS & VM

Weird: I see Nick's post that refers to my post, but that post doesn't
show up for me. This happened to another post in the 128-bit thread.
Has anyone else had this problem?

Joe Seigh
Posted: Thu Dec 16, 2004 12:15 am    Post subject: Re: CAS and LL/SC (was Re: High Level Assembler for MVS & VM

Seongbae Park wrote:
Quote:

Maciej W. Rozycki <macro@linux-mips.org> wrote:
....
Note that as long as you go UP this can be handled at the OS level. For
example Linux handles user mode RI traps on LL and SC when run on MIPS-I
processors and emulates the instructions. There's a considerable
performance loss, of course, but the semantics of these operations is
preserved.

Maciej

This doesn't make sense to me:
unless you stop all other threads between LL and SC and/or mask
interrupts, the semantics cannot be preserved.
How do you guarantee there will be no intervening store
to the same location from another processor?

Write protect the load locked target.

Joe Seigh

Joe Seigh
Posted: Thu Dec 16, 2004 12:36 am    Post subject: Re: CAS and LL/SC (was Re: High Level Assembler for MVS & VM

Nick Maclaren wrote:
Quote:

What I can witness (for SGI, IBM and Sun, having not used HP SMP
systems) is that adding those operations is NOT enough to enable
practical use of highly parallel, shared-memory communication
applications (such as most OpenMP ones). There is also a need for more
hardware support of within-process, application-controllable thread
scheduling, because the cost of having to use the kernel scheduler is
too high.

How are hyperthreading and throughput computing not this already, apart
from the currently limited number of contexts that you can have loaded
at any one time? Even that is probably not a problem.

Joe Seigh

Ian Shef
Posted: Thu Dec 16, 2004 12:59 am    Post subject: Re: CAS and LL/SC (was Re: High Level Assembler for MVS & VM

nmm1@cus.cam.ac.uk (Nick Maclaren) wrote in
news:cpp433$s4r$1@gemini.csx.cam.ac.uk:

Quote:
In article <1103068097.926843.101230@c13g2000cwb.googlegroups.com>,
(John Mashey) <old_systems_guy@yahoo.com> wrote:

MIPS R2000 (MIPS-I) didn't have any synchronization ops on purpose,
because every mechanism we knew caused at least some customer to tell
us why it wasn't good enough :-) LL/SC was added in MIPS-II to get
minimal operations able to synthesize a lot of people's favorite ones,
and by then we felt we knew much better what people needed.

Interesting. The lack of such instructions effectively means that
shared-memory, parallel-thread applications are unsupported, but that
was not a serious issue then. Nor was the scalability impact on the
kernel.


No, it is still possible to provide locking mechanisms entirely in software
(assuming that the instruction set and other hardware provides certain
minimal capabilities - and MIPS R2000 can meet these conditions). There
are papers (I used to have one by Leslie Lamport, if I remember the name
correctly) that describe how to do this. This is how you were supposed to
perform locking on the MIPS R2000. (At least there was some piece of MIPS
R2000 documentation that provided pointers to papers on software locking
algorithms - that is how I found the papers by Lamport).

The hardware mechanisms save clock cycles, but the software algorithm that
I saw was pretty efficient for the case that the lock is available without
contention.

--
Ian Shef 805/F6 * These are my personal opinions
Raytheon Company * and not those of my employer.
PO Box 11337 *
Tucson, AZ 85734-1337 *

Joe Seigh
Posted: Thu Dec 16, 2004 1:38 am    Post subject: Re: CAS and LL/SC (was Re: High Level Assembler for MVS & VM

Ian Shef wrote:
Quote:

nmm1@cus.cam.ac.uk (Nick Maclaren) wrote in
news:cpp433$s4r$1@gemini.csx.cam.ac.uk:
Interesting. The lack of such instructions effectively means that
shared-memory, parallel-thread applications are unsupported, but that
was not a serious issue then. Nor was the scalability impact on the
kernel.


No, it is still possible to provide locking mechanisms entirely in software
(assuming that the instruction set and other hardware provides certain
minimal capabilities - and MIPS R2000 can meet these conditions). There
are papers (I used to have one by Leslie Lamport, if I remember the name
correctly) that describe how to do this. This is how you were supposed to
perform locking on the MIPS R2000. (At least there was some piece of MIPS
R2000 documentation that provided pointers to papers on software locking
algorithms - that is how I found the papers by Lamport).

The hardware mechanisms save clock cycles, but the software algorithm that
I saw was pretty efficient for the case that the lock is available without
contention.


That's what's called a distributed algorithm, which you can do if the memory
model supports it and you have atomic loads and stores. In fact I believe
distributed algorithms like Peterson's algorithm are used for systems with
large numbers of CPUs, where the interlocked instructions won't scale
very well.

If you are doing this kind of stuff, you have to be prepared to improvise, since
you never know what the hardware designers will provide. I used to think it
was pretty obvious that double-wide compare and swap was as important as
single-wide compare and swap. But not to AMD, apparently. So even though lock-free
LIFO stacks have been around longer than I've been programming, which is a long
time, you can't do it on AMD 64-bit processors.

Joe Seigh
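
One common reason a lock-free LIFO stack wants a double-wide compare and swap is to pair the top-of-stack pointer with a version counter, so that a pop cannot be fooled when the same node is removed and pushed back in between (the ABA problem). The sketch below shows the idea in C11 without a double-wide instruction. It assumes nodes live in a fixed pool and are named by 32-bit index, so index plus counter fit in a single 64-bit compare and swap. The names `pool`, `stack_push`, `stack_pop`, and `NIL` are made up for this illustration, and the 32-bit counter can in principle wrap.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define POOL_SIZE 1024u
#define NIL UINT32_MAX                 /* marks "no node" */

/* Nodes link to each other by pool index rather than by pointer. */
struct node { int value; _Atomic uint32_t next; };
static struct node pool[POOL_SIZE];

/* head packs: high 32 bits = version counter, low 32 bits = top index.
   Initially version 0 and an empty stack. */
static _Atomic uint64_t head = NIL;

static uint64_t pack(uint32_t version, uint32_t index)
{
    return ((uint64_t)version << 32) | index;
}

/* Push a node (by pool index) that the caller currently owns. */
void stack_push(uint32_t idx)
{
    assert(idx < POOL_SIZE);
    uint64_t old = atomic_load(&head);
    uint64_t new_head;
    do {
        /* Link the new node to whatever is on top right now. */
        atomic_store(&pool[idx].next, (uint32_t)old);
        new_head = pack((uint32_t)(old >> 32) + 1, idx);
    } while (!atomic_compare_exchange_weak(&head, &old, new_head));
}

/* Pop the top node; returns its index, or NIL if the stack is empty. */
uint32_t stack_pop(void)
{
    uint64_t old = atomic_load(&head);
    for (;;) {
        uint32_t idx = (uint32_t)old;
        if (idx == NIL)
            return NIL;
        uint32_t next = atomic_load(&pool[idx].next);
        uint64_t new_head = pack((uint32_t)(old >> 32) + 1, next);
        /* The version bump makes this CAS fail if the stack changed in
           the meantime, even if the same index is back on top. */
        if (atomic_compare_exchange_weak(&head, &old, new_head))
            return idx;
        /* On failure, old now holds the current head; retry. */
    }
}
```

With full 64-bit pointers instead of pool indices, the same pointer-plus-counter pairing would need a 128-bit compare and swap, which is the instruction being asked for above.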

Nick Maclaren
Posted: Thu Dec 16, 2004 1:58 am    Post subject: Re: CAS and LL/SC (was Re: High Level Assembler for MVS & VM

Seongbae Park <Seongbae.Park@Sun.COM> wrote:
Maciej W. Rozycki <macro@linux-mips.org> wrote:
Quote:

Interesting. The lack of such instructions effectively means that
shared-memory, parallel-thread applications are unsupported, but that
was not a serious issue then. Nor was the scalability impact on the
kernel.

Note that as long as you go UP this can be handled at the OS level. For
example Linux handles user mode RI traps on LL and SC when run on MIPS-I
processors and emulates the instructions. There's a considerable
performance loss, of course, but the semantics of these operations is
preserved.

This doesn't make sense to me:
unless you stop all other threads between LL and SC and/or mask
interrupts, the semantics cannot be preserved.
How do you guarantee there will be no intervening store
to the same location from another processor?

This confusion is my fault - I didn't explain in enough detail, and
Maciej Rozycki has assumed a different sense of "supported" from you and me.

There is a level at which semantics is preserved, but there is another
in which it cannot be. For example, there are (correct) codes which
will always terminate if run on a uniprocessor (with plausibly coarse
scheduling) but not necessarily do so if each thread has its own CPU,
and there are (correct) codes which do the converse.

There is also the fact that timing issues ARE part of the semantics if
you are into software engineering, rather than unrealistic computer
science, and approaches that have wildly different relative timings
are not equivalent. So, even if codes will always terminate on both
uniprocessors and separate CPUs, their analysis may be different.
NOT just their performance analysis, but their correctness analysis.


Ian Shef <invalid@avoiding.spam> wrote:
Quote:

No, it is still possible to provide locking mechanisms entirely in software
(assuming that the instruction set and other hardware provides certain
minimal capabilities - and MIPS R2000 can meet these conditions). There
are papers (I used to have one by Leslie Lamport, if I remember the name
correctly) that describe how to do this. This is how you were supposed to
perform locking on the MIPS R2000. (At least there was some piece of MIPS
R2000 documentation that provided pointers to papers on software locking
algorithms - that is how I found the papers by Lamport).

The hardware mechanisms save clock cycles, but the software algorithm that
I saw was pretty efficient for the case that the lock is available without
contention.

See above. The myth you are quoting started in computer science some
30 years ago, and was known to be wrong then. It is true in a very
restrictive model, but that model is of little interest to software
engineers, as it is too unrealistic.

For example, the very complexity of the time for an operation (in the
presence of contention, of course) can be higher for that model than
for ones with better hardware support. And that in turn can make the
difference between a feasible design and an infeasible one.

No competent software engineer would ever analyse an algorithm for
best-case operation alone. Exactly what you do analyse will depend on
the requirement, and "average case under heavy contention" is a very
common target. So is "worst plausible case".


The above is also an answer to Joe Seigh. The executive summary could
be "think scheduling, and the timing of algorithms under heavy and
unfortunate contention."


Regards,
Nick Maclaren.

Terje Mathisen
Posted: Thu Dec 16, 2004 2:15 am    Post subject: Re: CAS and LL/SC (was Re: High Level Assembler for MVS & VM

Ian Shef wrote:
Quote:
No, it is still possible to provide locking mechanisms entirely in software
(assuming that the instruction set and other hardware provides certain
minimal capabilities - and MIPS R2000 can meet these conditions). There
are papers (I used to have one by Leslie Lamport, if I remember the name
correctly) that describe how to do this. This is how you were supposed to
perform locking on the MIPS R2000. (At least there was some piece of MIPS
R2000 documentation that provided pointers to papers on software locking
algorithms - that is how I found the papers by Lamport).

AFAIR, Lamport's algorithm depends on having unlimited range counters!

This does not work for unlimited uptimes, but might be good enough,
particularly on 64-bit machines.

Terje
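
A rough feel for "good enough" can be had from simple arithmetic. The sketch below assumes, purely for illustration, a counter that only ever grows and is bumped at a steady rate. The rate of 10^9 increments per second used in the note after the code is an invented figure, not a measurement, and `seconds_to_wrap` and `years_to_wrap` are hypothetical helper names.

```c
#include <assert.h>

/* Seconds until an unsigned counter of the given width wraps, if it is
   incremented at a steady rate (increments per second). */
double seconds_to_wrap(unsigned bits, double rate_per_sec)
{
    assert(rate_per_sec > 0.0);
    double states = 1.0;
    for (unsigned i = 0; i < bits; i++)
        states *= 2.0;                  /* 2^bits distinct values */
    return states / rate_per_sec;
}

/* The same figure expressed in years of 365.25 days. */
double years_to_wrap(unsigned bits, double rate_per_sec)
{
    const double seconds_per_year = 365.25 * 24.0 * 3600.0;
    return seconds_to_wrap(bits, rate_per_sec) / seconds_per_year;
}
```

At the assumed rate, a 32-bit counter would wrap in about 4.3 seconds, while a 64-bit one would last roughly 584 years.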

--
- <Terje.Mathisen@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"

Joe Seigh
Posted: Thu Dec 16, 2004 4:40 am    Post subject: Re: CAS and LL/SC (was Re: High Level Assembler for MVS & VM

Terje Mathisen wrote:
Quote:

Ian Shef wrote:
No, it is still possible to provide locking mechanisms entirely in software
(assuming that the instruction set and other hardware provides certain
minimal capabilities - and MIPS R2000 can meet these conditions). There
are papers (I used to have one by Leslie Lamport, if I remember the name
correctly) that describe how to do this. This is how you were supposed to
perform locking on the MIPS R2000. (At least there was some piece of MIPS
R2000 documentation that provided pointers to papers on software locking
algorithms - that is how I found the papers by Lamport).

AFAIR, Lamport's algorithm depends on having unlimited range counters!

This does not work for unlimited uptimes, but might be good enough,
particularly on 64-bit machines.


Peterson's algorithm fixes that. It's also a distributed algorithm like
Lamport's.

Joe Seigh
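
For reference, a minimal sketch of Peterson's algorithm in its two-thread form, written with C11 atomics and POSIX threads. It uses only two flags and a turn variable, so nothing grows without bound. The default sequentially consistent ordering of `atomic_store` and `atomic_load` is relied on here; with weaker ordering the algorithm is not safe, which is the "if the memory model supports it" condition mentioned earlier in the thread. `peterson_demo`, `worker`, and the shared `counter` are illustrative names, not taken from any post; build with `-pthread`.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static atomic_int want[2];   /* want[i] != 0: thread i wants the lock */
static atomic_int turn;      /* which thread gets priority on a tie */
static long counter;         /* protected only by the Peterson lock */
static long iterations;

void peterson_lock(int self)
{
    assert(self == 0 || self == 1);
    int other = 1 - self;
    atomic_store(&want[self], 1);
    atomic_store(&turn, other);          /* offer priority to the other */
    /* Spin while the other thread wants in and still has priority. */
    while (atomic_load(&want[other]) && atomic_load(&turn) == other) {
    }
}

void peterson_unlock(int self)
{
    atomic_store(&want[self], 0);
}

static void *worker(void *arg)
{
    int self = *(int *)arg;
    for (long i = 0; i < iterations; i++) {
        peterson_lock(self);
        counter++;                       /* critical section */
        peterson_unlock(self);
    }
    return NULL;
}

/* Illustrative driver: runs two threads, each doing n locked
   increments, and returns the final value of the shared counter. */
long peterson_demo(long n)
{
    pthread_t t[2];
    int ids[2] = {0, 1};
    counter = 0;
    iterations = n;
    for (int i = 0; i < 2; i++)
        pthread_create(&t[i], NULL, worker, &ids[i]);
    for (int i = 0; i < 2; i++)
        pthread_join(t[i], NULL);
    return counter;
}
```

Calling `peterson_demo(100000)` should return 200000, since every increment happens inside the critical section. The busy-wait loop is also a small instance of the scheduling concern raised earlier: a waiting thread burns its CPU until the lock holder gets to run.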

Joe Seigh
Posted: Thu Dec 16, 2004 4:40 am    Post subject: Re: CAS and LL/SC (was Re: High Level Assembler for MVS & VM

John Mashey wrote:

Quote:
Unlike some other areas, I regard this (synchronization operations) as
one of several areas where I'm still not comfortable that the industry
has produced many mechanisms that simultaneously have:
- consistent APIs, especially across vendors
- simple, cheap low-end implementations
- good scaling on large machines

Except that the corresponding synchronization technique has to scale as well.
Having a synchronization primitive for mutual exclusion that scales well won't
do you any good, because mutual exclusion itself won't scale up.

There are some techniques, such as certain lock-free algorithms, that do scale
well. But you're not going to see hardware changed to take advantage of those
techniques, because in general hardware types don't listen to software types,
as obviously we don't know what we're doing.

Joe Seigh

All times are GMT
Page 1 of 26