| Author |
Message |
Eric P.
Guest
|
Posted:
Tue Jul 05, 2005 4:15 pm Post subject:
Re: How does this make you feel? |
|
|
Steve wrote:
| Quote: |
snip
So, we might suggest 'R1' as a 64-bit register that may take on
additional properties that affect its application with an instruction.
The on-chip logic that accompanies the register includes provisions to
change its behavior (according to my previous suggestion). First,
there is a way to partition its bits into an address and length
component: specified perhaps by a special instruction that sets its
parameters:
rcfg r1, 54:8
... which gives us 54 bits of addressing and eight bits of length.
There has to be a third component, register 'chunk' size, that in
effect dictates the 'word' size of that particular register. In this
hypothetical design, different registers may have different partitions
and 'chunk' sizes. So, we might have a hypothetical register config
instruction like this:
|
This sounds vaguely like the Intel i432 (circa 1982) with its
bits-in-a-segment addressing. You might want to look at it.
Capability-Based Computer Systems
http://www.cs.washington.edu/homes/levy/capabook/
At any rate, are you saying that if set to 54:9 I get a 9 bit machine?
And do all the internal ALUs and other function units reconfigure
to handle 9 bits now?
| Quote: | rcfg r1, 54:8:19
... which gives us a register configured to address 'chunks' of 2^19
bits (65536 bytes), with 54 bits of address space, and 8 bits of
'length' which allows in this case a span of 16M in 'steps' of 64k.
|
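As a rough illustration of the quoted `rcfg r1, 54:8:19` idea, the sketch below splits a 64-bit register value into address and length fields and computes the span the length field can describe. The field layout (address in the low bits, length directly above it) and the names `RegConfig`, `decode`, and `max_span_bytes` are assumptions made for this example; the post does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegConfig:
    """Hypothetical 'rcfg' settings: addr bits, length bits, log2(chunk size in bits)."""
    addr_bits: int
    len_bits: int
    chunk_log2_bits: int


def decode(reg: int, cfg: RegConfig) -> tuple[int, int]:
    """Split a 64-bit register value into (address, length).

    Assumed layout: address in the low bits, length field directly above it.
    """
    assert cfg.addr_bits + cfg.len_bits <= 64
    addr = reg & ((1 << cfg.addr_bits) - 1)
    length = (reg >> cfg.addr_bits) & ((1 << cfg.len_bits) - 1)
    return addr, length


def max_span_bytes(cfg: RegConfig) -> int:
    """Number of distinct length values times the chunk size, in bytes."""
    chunk_bytes = (1 << cfg.chunk_log2_bits) // 8
    return (1 << cfg.len_bits) * chunk_bytes


cfg = RegConfig(addr_bits=54, len_bits=8, chunk_log2_bits=19)
print(decode((3 << 54) | 0x1234, cfg))  # (4660, 3)
print(max_span_bytes(cfg))              # 16777216, i.e. 16M in steps of 64k
```

With these assumptions the numbers agree with the quoted text: 2^19 bits is 65536 bytes, and 256 length values of 64k each give a 16M span.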
I don't get the point of these chunks. Are chunks like pages?
If so this would seem to imply that you want page size under
user mode application control. (just trying to understand).
Eric |
|
Eric P.
Guest
|
Posted:
Tue Jul 05, 2005 9:38 pm Post subject:
Re: How does this make you feel? |
|
|
Andy Nelson wrote:
| Quote: |
Niels Jørgen Kruse <nospam@ab-katrinedal.dk> wrote:
Andy Nelson <andy@thermo.lanl.gov> wrote:
Or, more precisely, turn on/off the OS support for various page sizes,
and to set the initial distribution of pagesizes (i.e. 30% 64k,
30% 1m, 30% 16m etc etc) when the machine first boots up. The
distribution tends to get scrambled after that, and it doesn't seem
possible to get it back all the way without a full reboot.
Relocating small pages to make room for a big one is not an option? You
would of course pick a candidate big page with few small pages allocated
on it. Some small pages might be candidates for just unmapping, letting
the owner page them in again if they were really useful.
Recall, that I was speaking from a user point of view, not from a systems
or kernel point of view. I don't know what the OS does or can do, I just
know what I was observing when I tried to get the OS to give my job
big pages...after the page size distribution was scrambled, it was never
possible to get it all the way back to any sort of initial state, however
you may define that.
|
I found some good docs detailing the internal OS changes made to
support variable page sizes. IRIX does support page coalescing
called "migration", working pretty much as I outlined. It also has to
support split/merge because it cannot be avoided due to
mmap() & mprotect() abilities.
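The candidate-selection heuristic described in the quoted text (pick the big page with the fewest small pages allocated on it) can be sketched as follows. This is only an illustration of that heuristic, not IRIX's actual algorithm; the function name and the boolean allocation map are assumptions.

```python
def pick_coalesce_candidate(allocated: list[bool], ratio: int) -> tuple[int, int]:
    """Return (large_frame_index, small_pages_to_relocate) for the cheapest candidate.

    allocated[i] is True when small page frame i is in use. Each large page
    covers `ratio` consecutive, naturally aligned small frames.
    """
    if ratio <= 0 or len(allocated) < ratio:
        raise ValueError("need at least one full large-page frame")
    best_frame, best_cost = -1, ratio + 1
    for frame in range(len(allocated) // ratio):
        start = frame * ratio
        # Every allocated small page in this range must be moved or unmapped.
        cost = sum(allocated[start:start + ratio])
        if cost < best_cost:
            best_frame, best_cost = frame, cost
    return best_frame, best_cost


# 12 small frames, one large page = 4 small pages.
in_use = [True, True, False, True,
          False, True, False, False,
          True, True, True, True]
print(pick_coalesce_candidate(in_use, 4))  # (1, 1)
```

A real kernel would also have to weigh pages that cannot be moved at all (locked or wired memory), which this sketch ignores.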
General Purpose Operating System Support for Multiple Page Sizes
Narayanan Ganapathy and Curt Schimmel
Silicon Graphics Computer Systems, Inc.
www.usenix.org/publications/library/proceedings/usenix98/full_papers/ganapathy/ganapathy.pdf
Another interesting variable page size OS internals paper is:
Implementation of Multiple Pagesize Support in HP-UX
I. Subramanian, C. Mather, K. Peterson, and B. Raghunath
Hewlett-Packard Company
http://citeseer.ist.psu.edu/34088.html
Eric |
|
John Mashey
Guest
|
Posted:
Wed Jul 06, 2005 12:15 am Post subject:
Re: How does this make you feel? |
|
|
Sigh.
Steve wrote:
| Quote: | Jan Vorbrüggen wrote:
C and UNIX originated a while ago. It's not too difficult to see that
their original designers never anticipated certain real-world problem
domains. POSIX .4 is but one example of a feature required of modern
systems that ends up being difficult to implement and use today.
|
Gresham's Law applied to unmoderated newsgroups:
Gresham's Law is "Bad money drives good money out of circulation".
In comp.arch, the % of (vague, ill-informed, confusing, or outright
wrong statements) continues to rise, reducing the group's SNR
(Signal-to-Noise Ratio). Some threads start with N, and never recover, but
even threads that start with S get drowned with N ... and then fewer and
fewer people capable of contributing S have the patience to do so, and so the
SNR gets even worse, and even more confusing for people who actually
try to learn.
For the quoted paragraph above, here's a replacement:
C appeared in 1972-1973; in particular, most of UNIX was rewritten in C
in 1973, which is when it especially started getting wider use in BTL.
In 1973, C was better than most then-extant languages for real-time,
because it was higher than assembler, didn't require huge run-time
libraries or surprises (like behind-the-back garbage collection), was
fairly well-matched to existing hardware, and usable on small systems.
Research UNIX (Lab 127's base version) wasn't particularly targeted at
real-time, partly because that wasn't Ken's interest, and partly
because it's hard. It wasn't from lack of anticipation or knowledge.
If one reads The Bell System Technical Journal, July-August 1978, Vol.
57, No. 6, Part 2 (the first BSTJ UNIX issue), one finds that of the 20
articles, 6 describe real-time and related {extensions to UNIX,
UNIX-compatible OSs, applications thereof}. Many of the features
described therein either ended up in IEEE 1003, or evolved into such
features over time, and in fact, many were widely used inside Bell Labs
and the Bell System by the early 1980s, although it did happen that
some of those UNIX versions didn't get released outside.
Ken and Dennis certainly didn't anticipate all uses of C and UNIX that
would occur over the next 30 years, but the statement above is awfully
misleading, as many people contributed to UNIX in those days, and
thinking about real-time was publicly well-documented in the mid-1970s.
As UNIX spread, some companies made special efforts to do real-time
features. Masscomp was one example, and of course, anyone doing
real-time 3-D graphics had to think about this. [SGI IRIX started
having serious real-time support in the mid/late 1980s, not just for 3D
graphics but for other uses.]
That's the history, but it's all a red herring anyway.
The real issue is that there are various flavors of real-time, and one
must select appropriate hardware, languages, and OS. The problems are
*hard* because they depend on worst-case behavior, not average
behavior, and this is more difficult as an OS handles caches, TLBs,
virtual-memory, multiple processors, long-running instructions, etc;
features good for average performance all too often make it more
difficult to give tighter bounds on worst-case performance.... but none
of this has very much to do with UNIX, and almost nothing to do with C.
The former comes in many flavors, of which some are actually pretty
decent for some flavors of real-time, and the latter has been used to
write many real-time systems. In some cases, the only way to get hard
real-time performance is to dedicate processors ... and the 1978 BSTJ
had several examples of that.
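The gap between average and worst-case behavior can be made concrete with a toy cache model. The cycle counts and hit rate below are made-up illustrative numbers, and `access_time_bounds` is a hypothetical helper, not something from the thread.

```python
def access_time_bounds(n_accesses: int, hit_cycles: int, miss_cycles: int,
                       hit_rate: float) -> tuple[float, int]:
    """Return (expected_cycles, worst_case_cycles) for n memory accesses."""
    expected = n_accesses * (hit_rate * hit_cycles + (1.0 - hit_rate) * miss_cycles)
    # Without further analysis a hard bound cannot assume any hits at all.
    worst = n_accesses * miss_cycles
    return expected, worst


expected, worst = access_time_bounds(1000, hit_cycles=1, miss_cycles=100, hit_rate=0.95)
print(f"expected ~{expected:.0f} cycles, guaranteed bound {worst} cycles")
```

Under these assumed numbers the cache improves the average by more than an order of magnitude, yet the bound a hard real-time designer can rely on does not improve at all unless the analysis can prove that specific accesses hit.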
NOTE: Steve's feature that started this thread is NOT one that most
real-time people would ever want in their hardware, because, as
described, it could cause many instructions to have very long running
times with interrupts inhibited, since the design didn't have
restartability.
| Quote: | - It is no accident that it takes 2 pages to describe MVCL.
Indeed. And it increases the likelihood some implementation gets some
corner case wrong.
Laugh. Well there are always going to be people who don't fully read
the documentation as they should!
|
It isn't a question of just reading the manual.
People who write such microcode were/are serious professionals... and
it ill behooves amateurs to denigrate them.
The fundamental problem is:
some kinds of features are extremely difficult to simultaneously make
fast and simple, and the more aggressive the design, the more difficult
it gets.
It is hard work for experienced professionals to get things right. |
|
Jan Vorbrüggen
Guest
|
Posted:
Wed Jul 06, 2005 8:15 am Post subject:
Re: How does this make you feel? |
|
|
| Quote: | The fundamental problem is:
some kinds of features are extremely difficult to simultaneously make
fast and simple, and the more aggressive the design, the more difficult
it gets.
It is hard work for experienced professionals to get things right.
|
Indeed. Aggravating this is the fact that the problems occur in places
that are extremely difficult to test, because they require interactions
to occur that are rare and that might depend on relative timing of events
(better known as race conditions). They sometimes work in simulation but
break in real hardware. Hey, making sure that a complex chain of events
does not contain race conditions or black holes already in its design is
highly non-trivial...here I usually remind people of the black hole in
the X.25 protocol that went undetected for years.
Jan |
|
Jason Ozolins
Guest
|
Posted:
Wed Jul 06, 2005 2:36 pm Post subject:
Re: How does this make you feel? |
|
|
Steve wrote:
| Quote: | Jan Vorbrüggen wrote:
|
[ >>John Mashey wrote: ]
| Quote: | + Maybe later designers' insistence on measuring performance impacts
versus implementation costs caused them to ignore potentially-wonderful
features whose only problem was that they needed a new OS and new
language to make use of them.
Rather, I'd think that smart programmers showed they could use RISC
primitives to implement, say, a memcpy just as efficiently as microcode
could, except perhaps for some of the cache effects (see below) and, of
course, at the expense of quite complicated code to handle all possible
alignments etc.
|
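The "quite complicated code to handle all possible alignments" has a recognisable shape: byte moves up to a word boundary, word moves for the bulk, byte moves for the tail. The sketch below models that shape on a Python `bytearray`. The 8-byte word size and the function name `word_copy` are assumptions, and a production memcpy would also use shift-and-merge tricks when source and destination alignments differ, rather than falling back to bytes as this one does.

```python
WORD = 8  # assumed machine word size in bytes


def word_copy(mem: bytearray, dst: int, src: int, n: int) -> None:
    """Copy n bytes within mem (regions must not overlap)."""
    # Word moves are only usable when src and dst share the same misalignment.
    if (src - dst) % WORD == 0:
        # Head: byte moves until both addresses are word aligned.
        while n > 0 and src % WORD != 0:
            mem[dst] = mem[src]
            src, dst, n = src + 1, dst + 1, n - 1
        # Body: one aligned word per step.
        while n >= WORD:
            mem[dst:dst + WORD] = mem[src:src + WORD]
            src, dst, n = src + WORD, dst + WORD, n - WORD
    # Tail, or the whole copy when the alignments differ: byte moves.
    while n > 0:
        mem[dst] = mem[src]
        src, dst, n = src + 1, dst + 1, n - 1


mem = bytearray(range(64)) + bytearray(64)
word_copy(mem, dst=67, src=3, n=29)   # same misalignment: head, body, tail
print(mem[67:96] == bytes(range(3, 32)))  # True
```

Written once and placed in a library, this complexity is invisible to the application programmer, which is the point made in the quoted paragraph.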
[small comment on Jan's post] Well, exactly. That's what John went on to
outline in later paragraphs.
| Quote: | But this is what computers are for: making the job easier for the
user. If it is possible to offload work from the programmer in such a
common case, why not?
|
Who is "the programmer"? For instance, you don't rewrite memcpy() if you're
an application programmer, you just call it. That makes the job way easy.
:-) If you say you want to allow efficient inlined memcpy, OK, that's the
compiler writer's job - again, an application programmer doesn't care.
It is unclear what your goals really are. So, if you are really serious
about making an improvement to the state of the art, ask yourself these
questions:
- who benefits?
- System programmers?
- application programmers?
- end users?
This is listed in *increasing* order of importance - think of the relative
numbers in each group. NOTE WELL: the end user doesn't care in the least
about computer architecture aesthetics. They either want a faster computer,
a cheaper computer, or some amazingly useful new feature.
- how much do they benefit?
I.E., is it worth anyone's while to change from what they're using now? If
so, will it still be the case by the time you get the product out the door?
To answer these questions, you need to:
- have a coherent description of your architecture
- understand how it would be implemented
- write enough code for it to be able to demonstrate its advantages
And to do that, you need to do stuff like:
- find out about existing architectures
- find out how they are implemented
- become skilled enough at implementation that you can understand how much a
feature will cost, in terms of silicon, speed, and time to market
- write a simulator for the architecture
- get together enough system software for the architecture (compiler, OS,
libraries) to demonstrate how it will achieve whatever your stated goal is.
If all this stuff sounds hard... it is. That's life. Unless you happen to
be a grade-A genius or incredibly lucky, the odds are that your idea is not
as world-changingly fabulous as you would like. I know mine haven't been. :-)
As a first step, read Hennessy and Patterson. If you don't know what
that book is, Google is your friend.
| Quote: | This discussion has drifted a little off topic from the core of my
initial posting (which is natural for Usenet) and I feel that some of
you may have misread my intentions. Over the last couple of days I
have had some time to consider my position in light of your comments
and have decided that I obviously must make my case more explicit. So.
The idea that prompted my original post concerned a (possibly) new way
of constructing CPU registers. In my initial discussion, I suggested
that there might be a second pseudo-register that would modify the
behaviour of a conventionally conceived register. Another poster
suggested implicitly that I could be talking about using a second
register to modify another. This point is not quite moot. I am
advocating a change in the way registers are conceived from a CPU
design standpoint.
I want to view a register as something that has mutable semantics. I
want it to apply to a general instruction set on some hypothetical
architecture in a meaningful way. While it may make conventional
sense to modify the behavior of one register with another, as is seen
with various canonical addressing modes today, this is not in line
with the philosophy that I am thinking about. I want you to think
about complexifying the way a register is implemented on-chip; if this
thought experiment logically indicates certain changes to the design of
a CPU or its instruction set, that is part of the next step of
design. The specifics of its on-chip implementation are best left
unsaid at the moment.
|
Sorry to be blunt, but until you understand a bit more of the history of
computer architecture, you won't be able to appreciate what a bad idea it is
to design an architecture without regard to how it will be implemented.
| Quote: | As I am a complete amateur at the CPU architecture game, I nevertheless
ask you to consider the potential benefits and liabilities to making
registers more intelligent. What does this hypothetical arrangement
potentially offer in terms of improvements over traditional register
and instruction set architecture?
|
How do you expect someone to ascertain this without doing the work that I
wrote about above? Is it reasonable to expect a detailed analysis of a very
complex proposal where you haven't tried to do that work yourself?
| Quote: | As a software programmer and language dabbler, it seems to me that the
division of labour is set in the wrong place along the programmer/CPU
line. I believe that making the CPU smarter will make for an
environment that is more useful to the programmer, but of course I
cannot prove it just yet. At the minimum, I think that a CPU that
handles complex registers such as I describe would exhibit more
flexibility and better performance than traditional solutions. [..]
|
This has been called "closing the semantic gap". It was a hot idea in the
1960s and '70s. Hint, hint.
| Quote: | [...] But
this is only my uninformed opinion.
|
Well, that's really the nub of the matter, isn't it? Look at what you wrote:
you are effectively stating what you need to do next, which is to become
informed. Folks here on comp.arch have a lot of knowledge to offer if you
really want to become informed. But that's different from chucking an idea
out there and expecting someone else to do the hard work.
-Jason (lurker on comp.arch since early '90s) |
|
Peter Grandi
Guest
|
Posted:
Sat Jul 09, 2005 4:15 pm Post subject:
Re: How does this make you feel? |
|
|
[ ... ]
old_systems_guy> Gresham's Law applied to unmoderated
old_systems_guy> newsgroups: Gresham's Law is "Bad money drives
old_systems_guy> good money out of circulation". [ ... ]
Moderated newsgroups are not that different -- moderation just
cuts off the ''bad manners'' postings, not the ''poor content''
ones, because what counts as the latter is highly debatable...
[ ... ]
old_systems_guy> If one reads The Bell System Technical Journal,
old_systems_guy> July-August 1978, Vol. 57, No. 6, Part 2 (the
old_systems_guy> first BSTJ UNIX issue), one finds that of the
old_systems_guy> 20 articles, 6 describe real-time and related
old_systems_guy> {extensions to UNIX, UNIX-compatible OSs,
old_systems_guy> applications thereof}.
As to this, I would like to single out one of them about MERT
aka RT/UNIX, by H. Lycklama&C., which was a hypervisor (on the
PDP-11!) running multiple OS personalities on top, one of which
was UNIX.
There have been several MERT style systems in the intervening
decades; I reckon that MS Windows NT, Mach or OSF/1, and Xen
(and others) resemble at least some aspect of MERT.
Indeed I suspect that one good path of evolution for Xen is to
become more like MERT...
[ ... ] |
|
Steve
Guest
|
Posted:
Sun Jul 10, 2005 5:51 am Post subject:
Re: How does this make you feel? |
|
|
[blame slow news propagation]
Jan Vorbrüggen wrote:
| Quote: | Rather, I'd think that smart programmers showed they could use RISC
primitives to implement, say, a memcpy just as efficiently as microcode
could, except perhaps for some of the cache effects (see below) and, of
course, at the expense of quite complicated code to handle all possible
alignments etc.
But this is what computers are for: making the job easier for the
user. If it is possible to offload work from the programmer in such a
common case, why not?
The work is offloaded from the programmer in any case - this type of code
is written "once" and then either called from a library or inlined by the
compiler (possibly taking advantage of the compiler's knowledge of lengths
and alignments in the process).
|
This is likely valid for VHDL (or Verilog) as well. Perhaps the
library macros need updating for new CPU products, but that's also true
for system libraries. So, what's better? memcpy() as a few tens of
in-line assembler instructions, or as a bunch of gates sitting on the
chip?
Let's assume we're talking about normal CPUs and not special products
for, say, embedded applications where die size, unit cost, and power
consumption might affect the economics related to the inclusion of
on-chip bit movers. Certainly a toaster has no need for a CPU with
luxuries of this kind, but a cell-phone might.
| Quote: | - It is no accident that it takes 2 pages to describe MVCL.
Indeed. And it increases the likelihood some implementation gets some
corner case wrong.
Laugh. Well there are always going to be people who don't fully read
the documentation as they should!
We are talking hardware implementation here, and bugs that cannot be fixed
by a microcode update. You've seen all the discussion of what implications
such instructions have for virtual memory management, plus interactions
with interrupts et al.
|
Yes, well, I was being sarcastic. I'm very much in favour of a
design and programming philosophy that strives to make things work
properly before they are deployed in the field. I am aware of how
much work this can entail, and why it is rarely practised.
| Quote: | An instruction whose semantics force an implementation to execute it
twice, once doing nothing but checking for possible VM errors and once
again actually doing the work - and making sure that no page table has
been changed in the mean time, no interrupt has been handled, ... - is
broken by design. And designing such instructions so that they don't
have such broken semantics ain't easy.
|
I see no reason why you assume _a priori_ that such instructions would
have to be executed twice. However we have not specified the exact
semantics of a machine here, nor the bounds checking (for instance) that
might be necessary. My uneducated predisposition to disliking
microcode is one thing, but I suspect error checking that cannot be
made an integral part of the instruction decode and execution is bound
to fail to win over supporters.
Page faults interrupting an in-progress ranged operation might be
difficult to handle. If the hypothetical CPU supports concurrent
hardware threading, and the OS can guarantee not to stall in the CPU
while swapping under such conditions by virtue of the overall design,
then this is not a show stopper.
But I suspect there are all sorts of interesting ways this particular
problem could be handled. Perhaps the loading of register attributes
could optionally signal the CPU to inform the OS via an exception
callback that it should make sure the requisite pages are in memory or
on their way. Perhaps there is a simple and clever design of a
parallel instruction execution unit that will obviate potential stall
or deadlock risks, or other unanalysed design problems.
| Quote: | Incidentally, the transputer's MOVE instruction supplies a nice lesson in
why such designs are difficult. [war story elided]
Microcode updates are much more common these days, and so bugs in the
CPU are not quite the problem they once were.
Yes, in this example a fix could have been done in microcode. There are
similar bugs which cannot be fixed - and the code you field cannot _rely_
on the processor it is running on having been fixed in the first place,
so your software workarounds need to execute anyway!
|
True enough.
Speaking of microcode, prior art, and state of the art, I feel
obligated to say that the ISA microcomputer architectures seem rather
ugly. There are a bunch of things like policy driven bus arbitration,
timers, hardware context switching and so forth that would be ideal as
small programmable functional units existing independently of the basic
CPU instruction architecture. I'm sure some of this stuff is part of
prior and existing designs, and I would be surprised to learn that
hardware channel arbitration wasn't done with microcode years and years
ago. IBM engineers have probably been doing some of this stuff for
ages. But like other features mentioned previously, some of this stuff
never made it down to the microprocessor world. It's a shame, too,
because I suspect it is _really_ difficult to do decent QOS
provisioning and real-time applications without non-trivial provisions
in the hardware aiding the OS.
With Linux and friends really taking off, I think there is a real need
to provide a more professional environment for those with big
applications. Twenty years from now it will probably be really
affordable to buy a non-trivial n-way parallel desktop machine. It
would suck if the only CPU options were direct descendants of the
unmodified microprocessor architectures of the 1980s CPUs.
Regards,
Steve |
|
Steve
Guest
|
Posted:
Sun Jul 10, 2005 6:08 am Post subject:
Re: How does this make you feel? |
|
|
Eric P. wrote:
| Quote: | Steve wrote:
snip
So, we might suggest 'R1' as a 64-bit register that may take on
additional properties that affect its application with an instruction.
The on-chip logic that accompanies the register includes provisions to
change its behavior (according to my previous suggestion). First,
there is a way to partition its bits into an address and length
component: specified perhaps by a special instruction that sets its
parameters:
rcfg r1, 54:8
... which gives us 54 bits of addressing and eight bits of length.
There has to be a third component, register 'chunk' size, that in
effect dictates the 'word' size of that particular register. In this
hypothetical design, different registers may have different partitions
and 'chunk' sizes. So, we might have a hypothetical register config
instruction like this:
This sounds vaguely like the Intel i432 (circa 1982) with its
bits-in-a-segment addressing. You might want to look at it.
Capability-Based Computer Systems
http://www.cs.washington.edu/homes/levy/capabook/
|
Interesting. Chapter two contains a description of partitioned
registers that is conceptually similar to what I am describing. I'll
have to make time to read the book.
| Quote: | At any rate, are you saying that if set to 54:9 I get a 9 bit machine?
And do all the internal ALUs and other function units reconfigure
to handle 9 bits now?
|
Not quite, and I suspect that would be rather difficult to do. (Say,
you people don't have the hardware equivalent of the IOCCC, do you?)
| Quote: | rcfg r1, 54:8:19
... which gives us a register configured to address 'chunks' of 2^19
bits (65536 bytes), with 54 bits of address space, and 8 bits of
'length' which allows in this case a span of 16M in 'steps' of 64k.
I don't get the point of these chunks. Are chunks like pages?
If so this would seem to imply that you want page size under
user mode application control. (just trying to understand).
|
Register attributes and page size control are two different problem
domains, and the latter would likely be limited to a supervisor
privilege ring. Chunks are potentially variable ranges of memory words
specified by the value loaded into a register partition, and would be
specified by the application at run time. The feature would be a basic
part of the instruction semantics and syntax, and useful for a variety
of different machine language operations.
Regards,
Steve |
|
Steve
Guest
|
Posted:
Sun Jul 10, 2005 6:41 am Post subject:
Re: How does this make you feel? |
|
|
Jason Ozolins wrote:
| Quote: | Steve wrote:
But this is what computers are for: making the job easier for the
user. If it is possible to offload work from the programmer in such a
common case, why not?
Who is "the programmer"? For instance, you don't rewrite memcpy() if you're
an application programmer, you just call it. That makes the job way easy.
:-) If you say you want to allow efficient inlined memcpy, OK, that's the
compiler writer's job - again, an application programmer doesn't care.
It is unclear what your goals really are. So, if you are really serious
about making an improvement to the state of the art, ask yourself these
questions:
- who benefits?
- System programmers?
|
Yes.
| Quote: | - application programmers?
|
Yes.
| Quote: | - end users?
|
Probably.
| Quote: | This is listed in *increasing* order of importance - think of the relative
numbers in each group. NOTE WELL: the end user doesn't care in the least
about computer architecture aesthetics. They either want a faster computer,
a cheaper computer, or some amazingly useful new feature.
|
I no longer make blasé assumptions about what users want.
| Quote: | - how much do they benefit?
I.E., is it worth anyone's while to change from what they're using now? If
so, will it still be the case by the time you get the product out the door?
|
If the ISA extension that I propose is not fundamentally flawed, then
it would be proper to implement it for anyone who thinks it would make
their product more useful, efficient, powerful. I think register
attributes as I describe them (ignoring any other potential basic
functionality that might also make sense) are a candidate for industry
standardization. The case is not proven, and not falsified.
| Quote: | To answer these questions, you need to:
- have a coherent description of your architecture
|
Doable.
| Quote: | - understand how it would be implemented
|
Harder.
| Quote: | - write enough code for it to be able to demonstrate its advantages
|
Difficult.
| Quote: | And to do that, you need to do stuff like:
- find out about existing architectures
- find out how they are implemented
- become skilled enough at implementation that you can understand how much a
feature will cost, in terms of silicon, speed, and time to market
- write a simulator for the architecture
|
Trying to discourage me already, eh?
| Quote: | - get together enough system software for the architecture (compiler, OS,
libraries) to demonstrate how it will achieve whatever your stated goal is.
|
It may be possible to prove the economic case without recourse to
running a full-blown simulation.
| Quote: | If all this stuff sounds hard... it is. That's life. Unless you happen to
be a grade-A genius or incredibly lucky, the odds are that your idea is not
as world-changingly fabulous as you would like. I know mine haven't been. :-)
|
Yes... my previous world-changingly fabulous idea was along the lines
of encouraging people to get along more nicely. Sadly, it didn't work.
Nevertheless, past performance is no guarantee of future performance.
It may be that the ideas I have here are more meritorious.
| Quote: | As a first step, read Hennessy and Patterson. If you don't know what
that book is, Google is your friend.
|
Noted.
[snip]
| Quote: | As a software programmer and language dabbler, it seems to me that the
division of labour is set in the wrong place along the programmer/CPU
line. I believe that making the CPU smarter will make for an
environment that is more useful to the programmer, but of course I
cannot prove it just yet. At the minimum, I think that a CPU that
handles complex registers such as I describe would exhibit more
flexibility and better performance than traditional solutions. [..]
This has been called "closing the semantic gap". It was a hot idea in the
1960s and '70s. Hint, hint.
|
I might instead choose to use a library catalogue. Reading articles
online is sometimes a pain, and as you may know reading from a monitor
is rarely a good substitute for turning the page of a good hardcover
book.
| Quote: | [...] But
this is only my uninformed opinion.
Well, that's really the nub of the matter, isn't it? Look at what you wrote:
you are effectively stating what you need to do next, which is to become
informed. Folks here on comp.arch have a lot of knowledge to offer if you
really want to become informed. But that's different from chucking an idea
out there and expecting someone else to do the hard work.
|
I am not _expecting_ someone else to do the hard work. My expertise is
more in software than hardware, and as such I think I would be
compromising any goals I had in that regard if I were to stop
everything and become a hardware engineer simply because I thought of
something that seemed like it would be best done in hardware. I don't
have the faintest hope of putting together a simulator in anything like
a reasonable timeframe (read, less than five years). Other people may,
if they are convinced of the worth of the exercise. So, it would
probably be a good idea if I were to sit down and write out a more
complete description of the kind of hardware ISA that I envision; which
might entail hashing out the semantics of a hypothetical instruction set
-- or at least much of it.
I have some reading to do now, thank you, and so with the exception of
this thread, I will go and lurk some more, and read and think about the
problem for a while.
Regards,
Steve |
|
Jan Vorbrüggen
Guest
|
Posted:
Mon Jul 11, 2005 8:15 am Post subject:
Re: How does this make you feel? |
|
|
| Quote: | This is likely valid for VHDL (or Verilog) as well. Perhaps the
library macros need updating for new CPU products, but that's also true
for system libraries. So, what's better? memcpy() as a few tens of
in-line assembler instructions, or as a bunch of gates sitting on the
chip?
|
The problem here is that the analogy doesn't work. Adding such a feature
to a processor, or modifying it, requires touching a dozen places or so
on the chip, and making sure afterwards that they still function correctly.
| Quote: | I see no reason why you assume _a priori_ that such instructions would
have to be executed twice.
|
Neither do I, but that wasn't the point - which was that unless you very
carefully consider such interactions in the specification of the semantics,
you can end up with semantics that force an implementation to behave in a
way that all your performance gain goes away, or worse.
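One way to specify a long-running move so that it never has to be pre-checked or run twice is to make it restartable: the instruction writes its progress back to the registers, so a page fault simply leaves them describing the remaining work. The toy model below sketches that idea. The names `block_move`, `run_with_os`, and `PageFault`, the byte-at-a-time loop, and the 4096-byte page size are all assumptions for illustration and do not describe any specific machine.

```python
PAGE = 4096  # assumed page size in bytes


class PageFault(Exception):
    def __init__(self, page: int):
        super().__init__(f"page {page} not resident")
        self.page = page


def block_move(mem: bytearray, resident: set[int], regs: dict[str, int]) -> None:
    """Copy regs['len'] bytes from regs['src'] to regs['dst'].

    Progress is written back to regs after every byte, so a fault leaves the
    registers describing exactly the work that remains.
    """
    while regs["len"] > 0:
        for addr in (regs["src"], regs["dst"]):
            if addr // PAGE not in resident:
                raise PageFault(addr // PAGE)
        mem[regs["dst"]] = mem[regs["src"]]
        regs["src"] += 1
        regs["dst"] += 1
        regs["len"] -= 1


def run_with_os(mem: bytearray, resident: set[int], regs: dict[str, int]) -> int:
    """Toy OS loop: service each fault, then re-issue the same instruction."""
    faults = 0
    while True:
        try:
            block_move(mem, resident, regs)
            return faults
        except PageFault as fault:
            resident.add(fault.page)  # stands in for paging the data in
            faults += 1


mem = bytearray(4 * PAGE)
mem[100:5100] = bytes(i % 251 for i in range(5000))
regs = {"src": 100, "dst": 2 * PAGE + 50, "len": 5000}
print(run_with_os(mem, {0}, regs))  # 3 faults, each resumed without redoing work
```

The cost of this choice shows up elsewhere: the architecture must define what an observer sees in the partially copied destination, which is the kind of interaction the post warns has to be nailed down in the specification.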
| Quote: | But I suspect there are all sorts of interesting ways this particular
problem could be handled. Perhaps the loading of register attributes
could optionally signal the CPU to inform the OS via an exception
callback that it should make sure the requisite pages are in memory or
on their way.
|
Here you are already penalising the average case, to say nothing of the
worst case.
| Quote: | Perhaps there is a simple and clever design of a parallel instruction
execution unit that will obviate potential stall or deadlock risks, or
other unanalysed design problems.
|
Too many "perhaps" - it's nailing down the alternatives that distinguishes
a speculation from a design.
Jan |
|
|
 |
glen herrmannsfeldt
Guest
|
Posted:
Mon Jul 11, 2005 4:15 pm Post subject:
Re: How does this make you feel? |
|
|
Steve wrote:
(snip)
| Quote: | If the ISA extension that I propose is not fundamentally flawed, then
it would be proper to implement it for anyone who thinks it would make
their product more useful, efficient, powerful. I think register
attributes as I describe them (ignoring any other potential basic
functionality that might also make sense) are a candidate for industry
standardization. The case is not proven, and not falsified.
|
(snip)
| Quote: | As a first step, read Hennessy and Patterson. If you don't know what
that book is, Google is your friend.
|
You should definitely have that one, but in this case Blaauw & Brooks,
"Computer Architecture: Concepts and Evolution" might be better.
It has many more descriptions of past processors and of the evolutionary
tree of computer architecture. I still don't understand the specific
feature being discussed, but it is likely discussed by Blaauw & Brooks.
You will be surprised at how many features existed in older machines
but don't exist in more modern ones. Also, pretty often they
discuss which features shouldn't have been used in such machines.
(snip)
| Quote: | I am not _expecting_ someone else to do the hard work. My expertise is
more in software than hardware, and as such I think I would be
compromising any goals I had in that regard if I were to stop
everything and become a hardware engineer simply because I thought of
something that seemed like it would be best done in hardware. I don't
have the faintest hope of putting together a simulator in anything like
a reasonable timeframe (read, less than five years). Other people may,
if they are convinced of the worth of the exercise. So, it would
probably be a good idea if I were to sit down and write out a more
complete description of the kind of hardware ISA that I envision, which
might entail hashing out the semantics of a hypothetical instruction set
-- or at least much of it.
|
Much more often than not, what seems to be a useful hardware feature to
help software ends up making things worse. VAX has instructions for
BOUNDS checking and for evaluating POLYnomials. Both are slower than
doing the same work without the special instruction, and so are not
used in software.
| Quote: | I have some reading to do now, thank you, and so with the exception of
this thread, I will go and lurk some more, and read and think about the
problem for a while.
|
-- glen |
|
|
 |
Guest
|
Posted:
Mon Jul 11, 2005 8:57 pm Post subject:
Re: How does this make you feel? |
|
|
"Steve" <steve49152@yahoo.ca> writes:
| Quote: | Jason Ozolins wrote:
It is unclear what your goals really are. So, if you are really
serious about making an improvement to the state of the art, ask
yourself these questions:
- who benefits?
- System programmers?
Yes.
- application programmers?
Yes.
- end users?
Probably.
This is listed in *increasing* order of importance - think of the
relative numbers in each group. NOTE WELL: the end user doesn't
care in the least about computer architecture aesthetics. They
either want a faster computer, a cheaper computer, or some
amazingly useful new feature.
|
No, it is in inverse order, because problems at lower levels flow up to
bite the end user EVEN IF HE DOES NOT KNOW OR CARE. Pentium FP
anyone...
--
Paul Repacholi 1 Crescent Rd.,
+61 (08) 9257-1001 Kalamunda.
West Australia 6076
comp.os.vms,- The Older, Grumpier Slashdot
Raw, Cooked or Well-done, it's all half baked.
EPIC, The Architecture of the future, always has been, always will be. |
|
|
 |
Jason Ozolins
Guest
|
Posted:
Tue Jul 12, 2005 6:27 am Post subject:
Re: How does this make you feel? |
|
|
prep@prep.synonet.com wrote:
| Quote: | Jason Ozolins wrote:
|
[listing system programmers, app programmers, end users]
| Quote: | This is listed in *increasing* order of importance - think of the
relative numbers in each group. NOTE WELL: the end user doesn't
care in the least about computer architecture aesthetics. They
either want a faster computer, a cheaper computer, or some
amazingly useful new feature.
No it is in inverse order, because problems at lower levels flow up to
bite the end user EVEN IF HE DOES NOT KNOW OR CARE. Pentium FP
anyone...
|
That's actually not a counterexample: The FDIV bug was nasty because there
was no easy way to isolate its effects from app programmers and end users.
The only ways to completely work around it would have been super-ugly, like
dynamically translating FP code to insert checks/fixups around FDIVs, or
disabling the FPU and interpreting each FP opcode in a trap handler.
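To make "checks around FDIVs" concrete, here is a minimal Python sketch of an after-the-fact residual check. It is only an illustration: the helper names are made up, the tolerance is an arbitrary assumption, and the flawed quotient is the widely reported result for the classic 4195835 / 3145727 test case rather than anything stated in this thread. It is not the detection logic that was actually shipped.

```python
# Sketch of a residual check on a division result (hypothetical helpers).
# Idea: if q really is x / y, then x - q * y should be (almost) zero.

X = 4195835.0
Y = 3145727.0

# Widely reported quotient from a flawed Pentium for X / Y (assumption:
# quoted from public reports of the bug, not from this thread).
FLAWED_QUOTIENT = 1.333739068902037589


def division_residual(x, y, quotient):
    """Return x - quotient * y, which is near zero for a correct quotient."""
    return x - quotient * y


def quotient_is_suspect(x, y, quotient, rel_tol=1e-12):
    """Flag a quotient whose residual is large relative to the dividend."""
    return abs(division_residual(x, y, quotient)) > rel_tol * abs(x)


if __name__ == "__main__":
    good = X / Y  # computed on the (presumably correct) host FPU
    print("host quotient suspect?  ", quotient_is_suspect(X, Y, good))
    print("flawed quotient suspect?", quotient_is_suspect(X, Y, FLAWED_QUOTIENT))
    print("flawed residual:", division_residual(X, Y, FLAWED_QUOTIENT))
```

A check like this costs an extra multiply and subtract per division, which is part of why wrapping every FDIV this way counts as ugly.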
Consider the "F00F" bug that crashes old Pentiums. There is a workaround for
it that can be done purely at the OS level. In fact, many of the errata
listed by Intel for their processors are handled by the folks writing the OS,
and you don't get to know about them. In those cases, a quite nasty bug for
the OS implementors is of no concern to the end user.
So:
Bugs that are visible all the way up to the end user: big impact.
Bugs that can be handled by apps programmers: some impact
Bugs that can be handled by system programmers without appreciable overhead:
who cares?
This fits with what I was asserting:
Features visible to end users (increased speed, new capabilities, better
reliability): big impact
Features that make life easier for apps programmers: some impact
Features that only get noticed if you're an OS programmer: who cares?
Just treat a bug as a "negative feature" and the relation holds.
Say you propose a new mechanism for handling page faults on some processor.
Let's say with this new mechanism the average code path through the fault
handler is reduced by 50%, but the fault handler execution time only drops by
1% because the big limiter is memory latency. This new feature might be
appreciated by system programmers writing a new OS, because their fault
handling code gets simpler, but will be pretty much invisible to the end
user. A feature that is only appreciated by systems programmers is of
limited commercial value.
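The arithmetic behind that example can be sketched in a few lines of Python. This assumes a deliberately crude model that is not in the original post: handler time splits into instruction work (proportional to code path length) and memory stall time, and only the instruction work shrinks. The function names are hypothetical.

```python
# Crude model (assumption): handler_time = instruction_work + memory_stalls,
# and the new mechanism only shrinks the instruction work.

def total_time_saved(instr_fraction, instr_reduction):
    """Fraction of total handler time saved.

    instr_fraction  -- share of handler time spent executing instructions
    instr_reduction -- fraction by which the code path is shortened
    """
    return instr_fraction * instr_reduction


def implied_instr_fraction(total_saving, instr_reduction):
    """Invert the model: what instruction share explains the observed saving?"""
    return total_saving / instr_reduction


if __name__ == "__main__":
    # Numbers from the example: path cut by 50%, handler time drops by 1%.
    share = implied_instr_fraction(0.01, 0.50)
    print(f"implied instruction share of handler time: {share:.0%}")
    print(f"check, time saved: {total_time_saved(share, 0.50):.0%}")
```

Under this model the example implies that only about 2% of the handler's time is instruction execution, with the rest being latency that the new mechanism does not touch.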
By contrast, take the Altivec SIMD extensions for the PowerPC. Introducing
this feature makes *more work* for apps programmers - they have to write new
code to take advantage of the new instructions - but assuming that Altivec
doesn't slow down the rest of the processor, this is offset by the happiness
of users who get better media player performance or quicker response from
Photoshop, etc, and who pay money to get hold of that new feature (as well as
software upgrades). That feature is of much greater commercial value than
the better page fault handler feature.
My point to Steve in my posting was that to sell anyone on a new ISA, you
need to nail down enough specifics about the implementation to make some
predictions about who (if anyone) will actually see a reason to use a
processor based on that ISA. In doing so, the opinions of OS implementors
and apps programmers are usually heavily outweighed by the end users, who
want something like a smaller/faster/lighter gadget, or a faster word
processor/games box/web browser, and couldn't care less whether the processor
is elegantly powerful or a dog's breakfast that gets the job done. Some of
them (not so much the open source crowd) would also like all the binary-only
software that was installed on their last computer to work on their new
computer too. :-/
John Mashey has made the point that one of the few places a new ISA could get
traction these days is in embedded devices. Even there, aesthetics aren't
what matters; engineering is. Lower-power implementations, reduced time
to market, and lower manufacturing cost are all ways to sell a new
processor architecture. So Steve would still need to come up with a pragmatic
justification for his new ISA.
-Jason =:^) |
|
|
 |
Terje Mathisen
Guest
|
Posted:
Wed Jul 13, 2005 12:23 am Post subject:
Re: How does this make you feel? |
|
|
Jason Ozolins wrote:
| Quote: | prep@prep.synonet.com wrote:
No it is in inverse order, because problems at lower levels flow up to
bite the end user EVEN IF HE DOES NOT KNOW OR CARE. Pentium FP
anyone...
That's actually not a counterexample: The FDIV bug was nasty because
there was no easy way to isolate its effects from app programmers and
end users. The only ways to completely work around it would have been
super-ugly, like dynamically translating FP code to insert checks/fixups
around FDIVs, or disabling the FPU and interpreting each FP opcode in a
trap handler.
|
This is more or less exactly what happened:
Asm code had to be patched manually, while all compiled code needed
recompilation with a compiler that had been modified to insert
detection/workaround code for each FDIV (as well as FPATAN) operation.
I wrote most of that workaround. In the end it produced exactly the
same double-precision results as a non-faulty chip would have
generated, at a runtime cost of FDIV operations that were about twice
as slow.
Since very few programs do nothing but FDIV, the actual overhead was
mostly in the 0 to 5% range.
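As a rough sanity check on those two figures, the relation between a per-FDIV slowdown and whole-program overhead can be sketched in Python. The model is an assumption for illustration (run time splits into FDIV time and everything else, and only the FDIV part is slowed), and the function name is hypothetical.

```python
# Assumed model: program time = FDIV time + everything else, and the
# workaround multiplies only the FDIV time by `fdiv_slowdown`.

def overall_overhead(fdiv_time_fraction, fdiv_slowdown=2.0):
    """Extra run time as a fraction of the original run time.

    fdiv_time_fraction -- share of original run time spent in FDIV
    fdiv_slowdown      -- factor by which each FDIV gets slower (2.0 = twice)
    """
    return fdiv_time_fraction * (fdiv_slowdown - 1.0)


if __name__ == "__main__":
    for share in (0.0, 0.01, 0.05, 1.0):
        print(f"FDIV share {share:>4.0%} -> overhead {overall_overhead(share):.0%}")
```

With FDIV about twice as slow, the overall overhead equals the share of time the program originally spent dividing, so a 0 to 5% overhead corresponds to programs that spent at most about 5% of their time in FDIV.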
| Quote: | Consider the "F00F" bug that crashes old Pentiums. There is a
workaround for it that can be done purely at the OS level. In fact,
many of the errata listed by Intel for their processors are handled by
the folks writing the OS, and you don't get to know about them. In
those cases, a quite nasty bug for the OS implementors is of no concern
to the end user.
|
F00F is quite neat, in that the only thing needed for a zero-overhead
fix was to carefully align an OS structure (along with some fixup code
that normally wouldn't be executed). :-)
| Quote: |
So:
Bugs that are visible all the way up to the end user: big impact.
Bugs that can be handled by apps programmers: some impact
Bugs that can be handled by system programmers without appreciable
overhead: who cares?
|
I agree.
Terje
--
- <Terje.Mathisen@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching" |
|
|
 |
glen herrmannsfeldt
Guest
|
Posted:
Wed Jul 13, 2005 11:01 pm Post subject:
Re: How does this make you feel? |
|
|
Terje Mathisen wrote:
(snip)
| Quote: | Consider the "F00F" bug that crashes old Pentiums. There is a
workaround for it that can be done purely at the OS level. In fact,
many of the errata listed by Intel for their processors are handled by
the folks writing the OS, and you don't get to know about them. In
those cases, a quite nasty bug for the OS implementors is of no concern
to the end user.
F00F is quite neat, in that the only thing needed for a zero-overhead
fix was to carefully align an OS structure (along with some fixup code
that normally wouldn't be executed). :-)
|
I have FreeBSD running on a Pentium, and when it boots it always says:
Installing workaround for F00F bug.
-- glen |
|
|
 |