A (Hypothetical) Example of Code Density in CISC

Guest
Posted: Thu Jul 21, 2005 12:15 am    Post subject: A (Hypothetical) Example of Code Density in CISC

Currently, I am using Google Groups to access this newsgroup. As a
result, I keep encountering the same recent batch of racist postings to
this newsgroup on the first page.

Given a topic that has generated quite a thread, the idea of modifying
RISC instructions for higher density, I thought it would not be that
bad under the circumstances to mention my web site again.

The IBM 360 instruction set looked like this:

********DDDDSSSS
********DDDDXXXX BBBBaaaaaaaaaaaa

* = opcode, D = destination register, S = source register, X = index
register, B = base register, a = address (or displacement).
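
As a minimal sketch of the two layouts above, the fields can be pulled out of an instruction with shifts and masks. The Python below is only an illustration: the names `RR`, `RX`, `decode_rr` and `decode_rx` are made up for this example, and the two sample values are the System/360 encodings of an add-register and a load instruction.

```python
from typing import NamedTuple


class RR(NamedTuple):
    opcode: int
    dest: int
    src: int


class RX(NamedTuple):
    opcode: int
    dest: int
    index: int
    base: int
    disp: int


def decode_rr(halfword: int) -> RR:
    """Split a 16-bit instruction laid out as ********DDDDSSSS."""
    return RR(opcode=(halfword >> 8) & 0xFF,
              dest=(halfword >> 4) & 0xF,
              src=halfword & 0xF)


def decode_rx(word: int) -> RX:
    """Split a 32-bit instruction laid out as ********DDDDXXXX BBBBaaaaaaaaaaaa."""
    return RX(opcode=(word >> 24) & 0xFF,
              dest=(word >> 20) & 0xF,
              index=(word >> 16) & 0xF,
              base=(word >> 12) & 0xF,
              disp=word & 0xFFF)          # 12-bit unsigned displacement


print(decode_rr(0x1A12))        # AR 1,2
print(decode_rx(0x58345100))    # L 3,X'100'(4,5)
```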

The Motorola 68020 instruction set looked like this:

****DDD***...SSS
****DDD***...BBB aaaaaaaaaaaaaaaa
****DDD***...BBB .XXX....aaaaaaaa
****DDD***...BBB .XXX............ aaaaaaaaaaaaaaaa

where . = additional overhead or mode bits.

The format in the fourth line was added with the 68020. On the 68000,
if one wished to use an index register with an address, the
displacement shrank to eight bits, as in the third line. (The Philco
2000 apparently had this bizarre characteristic as well.)
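
To make the shrunken displacement concrete, here is a hedged Python sketch that unpacks the second word of the indexed format (`.XXX....aaaaaaaa`). The meaning given to the overhead bits (data-versus-address index register, word-versus-long index size) follows Motorola's description of the brief extension word rather than anything in the diagram above, and the function name is invented for this example.

```python
def decode_brief_extension(word: int) -> dict:
    """Unpack a 68000-style brief extension word: .XXX....aaaaaaaa"""
    disp = word & 0xFF
    if disp & 0x80:                 # the eight-bit displacement is signed
        disp -= 0x100
    return {
        "index_is_address_reg": bool((word >> 15) & 1),
        "index_reg": (word >> 12) & 0x7,
        "index_is_long": bool((word >> 11) & 1),
        "disp": disp,
    }


# Index register D1 (word-sized), displacement -2.
print(decode_brief_extension(0x10FE))
# Index register A2 (long-sized), displacement +127, the largest available.
print(decode_brief_extension(0xA87F))
```

With only eight signed bits, the displacement covers -128 to +127 bytes, against -32768 to +32767 for the unindexed 16-bit form.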

On my web pages, I tried to combine the positive features of both
architectures to achieve high code density, with the basic scheme:

*******DDDSSS...
*******DDDXXXBBB aaaaaaaaaaaaaaaa

The aim was to have something like the 360 but, by switching to groups
of eight registers instead of groups of sixteen, to allow a more
comfortable 64 K size for the area to which a base register points,
instead of 4 K. (I follow the 360 in making the displacement unsigned;
because it is signed on the 68000, the original Mac limited routines to
32 K in size, using only the part after the address the base register
points to.)

The page in question is at

http://www.quadibloc.com/arch/arcint.htm
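
The trade described above (three-bit register fields in exchange for a 16-bit displacement) can be sketched in Python as follows. This is an illustration only: the function names and the opcode value are hypothetical, and only the field widths are taken from the `*******DDDXXXBBB aaaaaaaaaaaaaaaa` diagram.

```python
def encode_mem_ref(opcode: int, dest: int, index: int, base: int, disp: int) -> int:
    """Pack *******DDDXXXBBB aaaaaaaaaaaaaaaa into one 32-bit integer."""
    fields = (("opcode", opcode, 7), ("dest", dest, 3), ("index", index, 3),
              ("base", base, 3), ("disp", disp, 16))
    for name, value, bits in fields:
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name}={value} does not fit in {bits} bits")
    first = (opcode << 9) | (dest << 6) | (index << 3) | base
    return (first << 16) | disp


def decode_mem_ref(word: int) -> tuple:
    """Inverse of encode_mem_ref: returns (opcode, dest, index, base, disp)."""
    first, disp = word >> 16, word & 0xFFFF
    return (first >> 9, (first >> 6) & 7, (first >> 3) & 7, first & 7, disp)


def reach(disp_bits: int, signed: bool = False) -> int:
    """Bytes addressable at or above the address in the base register."""
    return 1 << (disp_bits - 1 if signed else disp_bits)


word = encode_mem_ref(opcode=0x55, dest=3, index=0, base=5, disp=0xFFFF)
print(hex(word), decode_mem_ref(word))
print(reach(12), reach(16), reach(16, signed=True))   # 4096 65536 32768
```

The last line reproduces the three figures in the text: 4 K for a 360-style 12-bit displacement, 64 K for an unsigned 16-bit one, and 32 K of forward reach when the 16 bits are signed.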

However, this wasn't quite enough for me. One way or another, I wanted
to cram more operations into a given stretch of RAM.

I used seven-bit opcodes to provide access to a wide range of basic
data types, including 64-bit integers and 128-bit floating-point
numbers. If I restricted myself to the three fixed-point types and two
floating types of the original IBM 360, I could use a six-bit opcode,
allowing me to use one bit in the instruction to switch to some
additional special denser instruction format.

( http://www.quadibloc.com/arch/ar0302.htm )

I did this in various ways. I found that the shift instruction, which
was 32 bits long, could be reduced to 16 bits in length, if it was
allowed to use as much opcode space as _all_ the operate instructions.
So one option was to use additional opcode space to make shift
instructions shorter.

Incidentally, the basic architecture as presented, although having many
admirable characteristics, is hardly suitable for implementation _as
is_. Trying different ways to achieve higher code density is one thing;
*leaving them all in*, so that one can choose between fifty-eight
different modes of operation for the computer, leads, of course, to
needless complexity in instruction decoding. In real life, for an
architecture based on this, one might settle on *two* modes: a dense
mode for ordinary programming, and another mode allowing access to
Cray-like vector instructions. Perhaps a third mode for stack
programming, or three-address programming, or greater code density with
some tradeoffs might be included, if one were indulgent.

Instead of short shift instructions, the next choice offered on my
pages is a mode that includes two register-to-register instructions in
a single 16-bit halfword. The type is specified in advance, remaining
constant over several instructions, so the opcode is four bits long.
The destination register is always register zero, so the only other
field in the instruction is a three-bit field indicating the source
register.

( http://www.quadibloc.com/arch/ar0304.htm )
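
A rough Python sketch of the paired format follows. Each operation takes seven bits (a four-bit opcode plus a three-bit source register), so two of them occupy fourteen bits of the halfword. What the remaining two bits hold is not stated in this post, so the `prefix` parameter below, its position at the top of the halfword, and the function names are all assumptions made for the sake of a runnable example.

```python
def pack_pair(first: tuple, second: tuple, prefix: int = 0) -> int:
    """Pack two (opcode, source_register) pairs into one 16-bit halfword.

    Assumed layout: pp oooo sss oooo sss, where pp is a hypothetical
    two-bit prefix; the destination is implicitly register zero.
    """
    def slot(opcode: int, src: int) -> int:
        if not (0 <= opcode < 16 and 0 <= src < 8):
            raise ValueError("opcode must fit in 4 bits, register in 3")
        return (opcode << 3) | src

    if not 0 <= prefix < 4:
        raise ValueError("prefix must fit in 2 bits")
    return (prefix << 14) | (slot(*first) << 7) | slot(*second)


def unpack_pair(halfword: int) -> list:
    """Return the two (opcode, source_register) pairs in program order."""
    slots = ((halfword >> 7) & 0x7F, halfword & 0x7F)
    return [(s >> 3, s & 7) for s in slots]


halfword = pack_pair((0x3, 5), (0xA, 1))
print(hex(halfword), unpack_pair(halfword))
```

Two operations in 16 bits is half the space of two ordinary 16-bit register-to-register instructions, at the cost of the fixed destination and the preset operand type.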

One frustrating thing about the 360-like basic format I had used was
that the index register field is zero most of the time, to indicate
that no indexing is taking place; this meant that almost three bits
were wasted in memory-reference instructions.

In order to recover three bits, I did have to make some sacrifices. So,
I came up with one mode in which the opcode grew to eight bits, but
indexed memory reference instructions could have only registers zero
and one as their destinations - and only three registers could be index
registers. (Setting the index register to zero indicated indirect
addressing, since mode bits indicated conventional memory reference
instructions which could have all eight destination registers.)

The formats in that mode were

...********DDDSSS
...********DDDBBB aaaaaaaaaaaaaaaa
...********DXXBBB aaaaaaaaaaaaaaaa

( http://www.quadibloc.com/arch/ar0305.htm )

We then proceed to the page with my earliest attempt at a more
condensed mode, the scratchpad mode. Inspired by the PDP-8 and the
Honeywell 316, I felt it ought to be possible to squeeze a
memory-reference instruction (of a sort) instead of just a
register-to-register instruction into a single 16-bit halfword.

Originally, to do this I modified the operate instructions extensively
to cut the opcode space they used in half, but later I found a way to
avoid this. The scratchpad modes now, in general, allow only four
possible destination registers.

One family allows the use of scratchpad areas in memory with 64
entries, with the instruction formats

...DD******SSS...
...DD******XXXBBB aaaaaaaaaaaa
...DD******ssssss

where s = scratchpad source. Another family increases the size of the
scratchpads to 256 elements by shrinking the opcodes of the scratchpad
instructions alone from six bits to four. As with the
two-instructions-per-halfword register-to-register format above, this
depends on setting the type of instruction operands in advance, and it
produces an instruction that resembles the single-word memory-reference
instructions of early 16-bit minicomputers. (Later 16-bit minicomputers
such as the PDP-11, of course, used two words if recourse to memory was
required.)

On that page, I have added my very latest attempt to achieve very
compact code, the Simple Compact Mode, which combines several of the
techniques seen thus far. The instruction formats in that mode are

.....******DDDSSS
.....******DDDBBB aaaaaaaaaaaa
....DDXX******BBB aaaaaaaaaaaa
...DD******ssssss

allowing an indexed memory reference instruction with four possible
destination registers and three possible index registers, and in return
gaining enough opcode space to permit both one type of scratchpad
instruction and a 16-bit shift instruction.

( http://www.quadibloc.com/arch/ar0306.htm )

Previously, to combine short shift instructions and scratchpad
instructions, I resorted to a more extreme expedient to gain opcode
space. Since the IBM 360 seemed to manage with having its base
registers point only to 4K-byte regions, I moved the base register
field from the first 16 bits of the instruction into the subsequent
halfword containing the address.

Combining this with the various techniques elaborated upon in earlier
pages yielded some modes which combined access to a wide range of
features with the ability to code some operations quite compactly.

( http://www.quadibloc.com/arch/ar0307.htm )

Another very early thing I attempted in the development of the
architecture was to provide a mode in which a 16-bit word could contain
three five-bit opcodes which indicated stack operations.

Under ideal circumstances, this could provide the _ne plus ultra_ in
code density.
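
As an illustration of the stack mode, three five-bit opcodes fill fifteen of the sixteen bits in a word. The Python sketch below assumes the leftover bit sits at the top of the word and that the opcodes are stored in program order from the high end; neither detail is given in this post, and the function names are hypothetical.

```python
def pack_stack_ops(ops: list, spare: int = 0) -> int:
    """Pack three 5-bit stack opcodes into one 16-bit word: s aaaaa bbbbb ccccc."""
    if len(ops) != 3:
        raise ValueError("exactly three opcodes per word")
    if spare not in (0, 1):
        raise ValueError("spare must be a single bit")
    word = spare
    for op in ops:
        if not 0 <= op < 32:
            raise ValueError("each opcode must fit in 5 bits")
        word = (word << 5) | op
    return word


def unpack_stack_ops(word: int) -> list:
    """Return the three opcodes in program order."""
    return [(word >> shift) & 0x1F for shift in (10, 5, 0)]


word = pack_stack_ops([1, 2, 3])
print(hex(word), unpack_stack_ops(word))
```

At three operations per word this works out to 16/3, or about 5.3, bits per operation when every slot is usefully filled.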

Thus, again, while providing such a bewildering variety of choice to
programmers on one system is clearly wasteful of transistors, the
description may be a fruitful source of ideas for people searching for
ways to compress code.

Oh, yes: as noted in the original thread, on the page

http://www.quadibloc.com/arch/ar0507.htm

I take a look at the proposal by Heidi Pan which was mentioned by the
original poster to that thread.

John Savard

Guest
Posted: Thu Jul 21, 2005 4:15 pm    Post subject: Re: A (Hypothetical) Example of Code Density in CISC

I wrote:
> Given a topic that has generated quite a thread, the idea of modifying
> RISC instructions for higher density, I thought it would not be that
> bad under the circumstances to mention my web site again.

I've added one more thing.

Given that there are two major kinds of vector instructions, MMX-like
ones and Cray-like ones, the spectacle of enormous banks of registers
128 bits wide which might contain only 32-bit or 64-bit floating-point
numbers seemed wasteful.

So I've added the RMOI instruction prefix, described on the page

http://www.quadibloc.com/arch/ar0203.htm

This changes instructions dealing with 32-bit floating-point quantities
into instructions dealing with 128-bit vectors of four 32-bit
floating-point quantities, and instructions dealing with 64-bit
floating-point quantities into instructions dealing with 128-bit
vectors of two 64-bit floating-point quantities.

Unlike the short vector instructions, which deal with 256-bit vectors,
here the ordinary arithmetic units are used, and it is explicitly
stated that the components will not normally be operated on in
parallel.
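
The effect of the prefix can be modelled loosely in Python. This is a sketch under stated assumptions, not a description of the actual instruction: lists stand in for register contents, and the names `lanes` and `apply_prefixed` are invented here. It shows only the two points made above, namely that a 128-bit register holds four 32-bit or two 64-bit components, and that the components go through the ordinary arithmetic unit one at a time.

```python
import operator

VECTOR_BITS = 128


def lanes(element_bits: int) -> int:
    """Number of floating-point components in one 128-bit register."""
    if element_bits not in (32, 64):
        raise ValueError("only 32-bit and 64-bit floating-point types apply")
    return VECTOR_BITS // element_bits


def apply_prefixed(op, a: list, b: list, element_bits: int) -> list:
    """Apply a scalar operation to each component in turn, not in parallel."""
    n = lanes(element_bits)
    if len(a) != n or len(b) != n:
        raise ValueError(f"expected {n} components per operand")
    result = []
    for x, y in zip(a, b):          # one pass through the scalar unit per component
        result.append(op(x, y))
    return result


print(lanes(32), lanes(64))                                        # 4 2
print(apply_prefixed(operator.add, [1.0, 2.0], [0.5, 0.25], 64))   # [1.5, 2.25]
```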

John Savard

All times are GMT