| Author |
Message |
Andrew Reilly
Guest
|
Posted:
Sat Aug 13, 2005 7:24 am Post subject:
Re: Code density and performance? |
|
|
On Fri, 12 Aug 2005 17:02:27 +0100, Peter Grandi wrote:
| Quote: | On 9 Aug 2005 08:21:07 GMT, nmm1@cus.cam.ac.uk (Nick
Maclaren) said:
nmm1> you need an application that uses significantly more
nmm1> virtual memory than physical, accesses it in a way that
nmm1> is not amenable to streaming I/O,
Most desktop apps? Web browsers for example?
|
I think that it behooves you to give more information about the situation
that you find yourself heavily swapping in. Anecdotally (on the FreeBSD
and NetBSD mailing lists), it's not a situation that arises often. As I
said in a previous post, I have one NetBSD system configured with no swap
at all, and that uses the smallest amount of memory that I could buy (256M).
My main server/workstation is a P3 with 512M of RAM. It runs NFS, IMAP,
SMTP, DHCP, DNS and a bunch of other services for my home lan. It also
runs X11 and GNOME2. Yesterday I was using portupgrade to rebuild my
ports collection, so use was pegged at a bit over 1.0. I was also
browsing the web with epiphany while watching the compilation and system
state/progress in a few xterms. The system almost never swapped. I have
2G of swap configured (I was trained in swap configuration in the days
before overcommit...) and pstat -s showed about 100k of swap space used: 0%.
I simply don't think that you've made a convincing argument, or perhaps
the paging and disk cache algorithms aren't as well tuned on Linux as they
are on FreeBSD. Certainly my experience of Windows NT in the past was
that any operation on sufficiently large files would cause all of the
program text to be paged out, which rapidly resulted in thrashing. That's
NOT a fault of too-large a page size. That's the fault of misuse of
memory mapping or poor disk cache tuning.
Yes, what you say about fragmentation, page size and the reduction of
working set size is obviously true, BUT I still don't believe that there
are many situations in today's hardware environment where it matters
diddly.
| Quote: | Sure, paging sucks, and one does it only because of money, but the
argument I am making is that with large pages it sucks even more.
|
Yes, that's your argument. You call the situation that I'm experiencing
"infinite memory", because my systems don't swap, by and large. They are,
therefore, not interesting.
It's just hard for me to see how they could be smaller.
--
Andrew |
|
|
 |
Andrew Reilly
Guest
|
Posted:
Sat Aug 13, 2005 7:41 am Post subject:
Re: Code density and performance? |
|
|
On Fri, 12 Aug 2005 17:46:57 +0100, Peter Grandi wrote:
| Quote: | http://live.GNOME.org/MemoryReduction
«The plan is to reduce the amount of memory that Gnome
applications consume. Gnome is barely usable on a machine
with 128 MB of RAM;»
|
Most of that document describes a "fault" of the malloc/free system that
Gnome is using on Linux. A program's heap apparently grows monotonically:
freed memory is never handed back to the system. The suggestion is made
that the allocator should be re-implemented on top of mmap, rather than
sbrk.
FreeBSD for certain, and I'd guess the other BSDs too, have been using
just such an allocator for many years. Perhaps this is a significant
difference between the performance that we're experiencing. As such, it's
not really a GNOME problem, but a Linux one.
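For anyone who wants to see the idea in code, here is a minimal sketch of an allocator that takes every block from mmap and returns it with munmap, so freed memory goes straight back to the system instead of staying in a monotonically growing heap. This is not FreeBSD's malloc or any real allocator: the names page_alloc, page_free and block_header are made up for illustration, and one mapping per block is far too wasteful for small objects.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* ask glibc to expose MAP_ANONYMOUS */
#endif
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical header kept in front of each block so that page_free
   knows how many bytes to unmap. The union pads it to the strictest
   alignment so the user pointer stays suitably aligned. */
typedef union {
    size_t mapped_len;
    max_align_t align;
} block_header;

/* Round n up to a whole number of pages. */
static size_t round_to_pages(size_t n)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    return (n + page - 1) / page * page;
}

/* Allocate size bytes in a private anonymous mapping of their own. */
void *page_alloc(size_t size)
{
    if (size > SIZE_MAX / 2)            /* avoid overflow in the rounding */
        return NULL;
    size_t len = round_to_pages(size + sizeof(block_header));
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    ((block_header *)p)->mapped_len = len;
    return (block_header *)p + 1;       /* user data starts after the header */
}

/* Hand the block's pages straight back to the system. */
void page_free(void *ptr)
{
    if (ptr == NULL)
        return;
    block_header *h = (block_header *)ptr - 1;
    munmap(h, h->mapped_len);
}
```

A real allocator would presumably reserve this path for large requests and carve small ones out of shared pages.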
[Note that an allocator that hands freed memory back to the system
technically violates the C spec, which stipulates that a program that
frees memory should be able to expect that it can subsequently malloc that
back without having malloc fail for an out-of-memory condition. That
could happen in the presence of hand-back and VM overcommit, because some
other process will get those pages. On the other hand, malloc never fails
on a system with VM overcommit. You just run the risk of random processes
being killed to satisfy demand-access to allocated space...]
--
Andrew |
|
|
 |
John Savard
Guest
|
Posted:
Sat Aug 13, 2005 8:15 am Post subject:
Re: Silly new instructions |
|
|
On Wed, 10 Aug 2005 20:35:17 +0200, "Peter \"Firefly\" Lund"
<firefly@diku.dk> wrote, in part:
| Quote: | Inspired by the argument from the German-speaking camp
that having the SP be a general-purpose register was a good thing
|
Having registers like the SP and even the PC be general-purpose
registers is a good thing for one reason - if losing a few GPRs is
compensated for by not having to fill the opcode space with a bunch of
duplicate instructions, and make decoding them more complicated.
John Savard
http://www.quadibloc.com/index.html |
|
|
 |
Andrew Reilly
Guest
|
Posted:
Sat Aug 13, 2005 8:15 am Post subject:
Re: Code density and performance? |
|
|
On Sat, 13 Aug 2005 12:41:23 +1000, Andrew Reilly wrote:
| Quote: | On the other hand, malloc never fails
on a system with VM overcommit. You just run the risk of random processes
being killed to satisfy demand-access to allocated space...
|
Oops: not so hasty. Process resource quotas are an excellent way to get
malloc() to fail, and are highly recommended on VM overcommit systems.
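A rough sketch of that effect, assuming a POSIX system where RLIMIT_AS is enforced (Linux and the BSDs; the helper name malloc_fails_under_limit is invented for this example). It lowers the soft address-space limit, tries one allocation, and restores the old limit afterwards.

```c
#include <stdlib.h>
#include <sys/resource.h>

/* Hypothetical helper: lower the soft address-space limit to
   limit_bytes, attempt one malloc of request_bytes, then restore the
   previous limit. Returns 1 if malloc failed, 0 if it succeeded, and
   -1 if the limit could not be read or set. */
int malloc_fails_under_limit(rlim_t limit_bytes, size_t request_bytes)
{
    struct rlimit old_lim, new_lim;

    if (getrlimit(RLIMIT_AS, &old_lim) != 0)
        return -1;
    new_lim = old_lim;
    new_lim.rlim_cur = limit_bytes;     /* soft limit only; hard limit untouched */
    if (setrlimit(RLIMIT_AS, &new_lim) != 0)
        return -1;

    /* volatile keeps the compiler from optimising the malloc/free pair away */
    void *volatile p = malloc(request_bytes);
    int failed = (p == NULL);
    free(p);

    setrlimit(RLIMIT_AS, &old_lim);     /* put the original soft limit back */
    return failed;
}
```

With a 256 MB limit and a 1 GB request the call should return 1 on such a system, overcommit or not, because the quota is checked when the address space is requested rather than when the pages are touched.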
--
Andrew |
|
|
 |
Anton Ertl
Guest
|
Posted:
Sat Aug 13, 2005 8:15 am Post subject:
Re: Code density and performance? |
|
|
Andrew Reilly <andrew-newspost@areilly.bpc-users.org> writes:
| Quote: | [Note that an allocator that hands freed memory back to the system
technically violates the C spec, which stipulates that a program that
frees memory should be able to expect that it can subsequently malloc that
back without having malloc fail for an out-of-memory condition. That
could happen in the presence of hand-back and VM overcommit, because some
other process will get those pages. On the other hand, malloc never fails
on a system with VM overcommit.
|
Mmap and thus malloc fails with overcommit, when you ask for a chunk
larger than the largest remaining hole in the address space. However,
with proper overcommit unmapping an area followed by an mmap of the
same size should succeed (hmm, I guess there could be a problem if the
stack grows in the meantime).
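Both cases can be sketched in a few lines of C, assuming a POSIX mmap with anonymous mappings (the function names are made up for the example). The first function asks for more address space than any hole can hold; the second unmaps an area and maps the same size again.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* ask glibc to expose MAP_ANONYMOUS */
#endif
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Request half of the whole address range. PROT_NONE means no memory
   has to be committed, so a failure here is about address space only.
   Returns 1 if the request was refused. */
int oversized_map_fails(void)
{
    void *p = mmap(NULL, SIZE_MAX / 2, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return 1;
    munmap(p, SIZE_MAX / 2);
    return 0;
}

/* Map len bytes, unmap them, and map the same amount again.
   Returns 1 if every step succeeded. */
int remap_same_size(size_t len)
{
    void *a = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (a == MAP_FAILED)
        return 0;
    if (munmap(a, len) != 0)
        return 0;
    void *b = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (b == MAP_FAILED)
        return 0;
    munmap(b, len);
    return 1;
}
```

The second function only shows the common case; it does nothing about the stack-growth caveat, and without overcommit another process could take the memory between the two calls.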
OTOH, without proper overcommit, you can get the problem you
mentioned: One process frees some committed memory, another process
requests memory, and the OS commits it, and now there is not enough
committable memory for the first process.
As for Linux, it has traditionally combined the disadvantages of
overcommitting (out-of-memory kills) and not overcommitting (you
cannot reliably allocate large pieces of address space) in its default
setting. Fortunately, "echo 1 >/proc/sys/vm/overcommit_memory" works.
- anton
--
M. Anton Ertl Some things have to be seen to be believed
anton@mips.complang.tuwien.ac.at Most things have to be believed to be seen
http://www.complang.tuwien.ac.at/anton/home.html |
|
|
 |
Peter "Firefly" Lund
Guest
|
Posted:
Sat Aug 13, 2005 2:16 pm Post subject:
Re: Silly new instructions |
|
|
On Sat, 13 Aug 2005, John Savard wrote:
| Quote: | Having registers like the SP and even the PC be general-purpose
registers is a good thing for one reason - if losing a few GPRs is
compensated for by not having to fill the opcode space with a bunch of
duplicate instructions,
|
That's not a good argument. You advocate wasting the opcode space with
useless operations on the SP and PC instead.
| Quote: | and make decoding them more complicated.
|
The PC is usually special-cased in hardware anyway so I don't buy your
second argument either. There's also often some special-casing related to
the SP.
-Peter |
|
|
 |
Dan Koren
Guest
|
Posted:
Sat Aug 13, 2005 2:53 pm Post subject:
Re: PART 3. Why it seems difficult to make an OOO VAX compet |
|
|
"Nick Maclaren" <nmm1@cus.cam.ac.uk> wrote in message
news:ddafe4$j5j$1@gemini.csx.cam.ac.uk...
| Quote: |
In article <Pine.LNX.4.61.0508091631400.9761@tyr.diku.dk>,
"Peter \"Firefly\" Lund" <firefly@diku.dk> writes:
|
|> > You don't HAVE a separate TLB entry. The correct solution is to put
the
|
|> You don't like paging much, do you? ;)
You noticed?
|
Reminds me of the following episode:
Shortly after Cray Computer Corporation went out of
business in March 1995 I interviewed a middle aged
fellow who had worked there and whose title had
been "(principal?) software architect". I do not
recall his name (and would not disclose it if I
did).
I asked for a brief overview of the system's
architecture. The conversation wasn't flowing and
I felt I wasn't getting much information from him
on which to form an opinion, so I started to ask
more focused questions.
DK: "How would you describe the cache architecture?"
SA: "There are no caches."
After an awkward silence lasting a few seconds:
SA: "Seymour does not like caches."
A few questions later:
DK: "Please describe the virtual memory architecture."
SA: "The system does not have virtual memory."
Again, after a pregnant silence:
SA: "Seymour does not like virtual memory."
I found it rather amusing that "Seymour does not like X"
was the only explanation offered for key aspects of the
system's architecture. One could have thought of a few
other reasons ;-)
dk |
|
|
 |
FredK
Guest
|
Posted:
Sat Aug 13, 2005 4:15 pm Post subject:
Re: PART 3. Why it seems difficult to make an OOO VAX compet |
|
|
"Dan Koren" <dankoren@yahoo.com> wrote in message
news:42fdc30c$1@news.meer.net...
| Quote: |
....
I found it rather amusing that "Seymour does not like X"
was the only explanation offered for key aspects of the
system's architecture. One could have thought of a few
other reasons ;-)
|
We have a number of conference rooms named after famous computer pioneers. I
believe the Cray conference room has this quote from Seymour on the wall:
"Parity is for farmers." |
|
|
 |
Tom Linden
Guest
|
Posted:
Sat Aug 13, 2005 4:15 pm Post subject:
Re: PART 3. Why it seems difficult to make an OOO VAX compet |
|
|
On Fri, 12 Aug 2005 17:15:10 -0400, Eric P.
<eric_pattison@sympaticoREMOVE.ca> wrote:
| Quote: | "John Mashey" <old_systems_guy@yahoo.com> writes:
But the bottom line is: the VAX ISA was very difficult to keep
competitive. The obvious decoding complexity is always there, in one
form or another, but the more serious problem is execution complexity
that lessens effective ILP and is thus a continual drag on performance
with reasonable implementations.
In case anyone is still interested in this topic,
there are a bunch of papers by Bob Supnik at
http://simh.trailing-edge.com/papers.html
covering a variety of DEC design issues.
The one labeled "VLSI VAX Micro-Architecture" is from 1988
(marked "For Internal Use Only, Semiconductor Engineering Group")
and mentions at the end the ways a VAX might get lower CPI. It says
"However the VAX architecture is highly resistant to macro-level
parallelism:
– Variable length specifiers make parallel decoding of specifiers
difficult and expensive
– Interlocks within and between instructions make overlap of
specifiers with instruction execution difficult and expensive
|
Expensive but not infeasible. Remember, the number of transistors you can
put on a die today is a hundredfold greater than in 1988.
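To make the "variable length specifiers" point concrete, here is a toy decoder in C. The encoding is only loosely VAX-like and is an assumption made for this sketch: the high nibble of the first byte selects the mode, displacement modes carry 1, 2 or 4 extra bytes, and index mode and PC-based immediates are left out. The function names are invented too.

```c
#include <stddef.h>
#include <stdint.h>

/* Length in bytes of one operand specifier in the simplified encoding
   assumed above (not the complete VAX rules). */
static size_t specifier_length(uint8_t first_byte)
{
    switch (first_byte >> 4) {
    case 0xA: case 0xB: return 2;   /* byte displacement */
    case 0xC: case 0xD: return 3;   /* word displacement */
    case 0xE: case 0xF: return 5;   /* longword displacement */
    default:            return 1;   /* literal, register, (Rn), -(Rn), (Rn)+ */
    }
}

/* Record the byte offset at which each of count specifiers starts.
   Returns the total number of bytes consumed, or 0 if the buffer
   ends before the last specifier is complete. */
size_t find_specifiers(const uint8_t *code, size_t len,
                       size_t count, size_t *starts)
{
    size_t pos = 0;
    for (size_t i = 0; i < count; i++) {
        if (pos >= len)
            return 0;
        starts[i] = pos;
        /* The next start is unknown until this specifier's mode byte
           has been examined: a serial dependency from one to the next. */
        pos += specifier_length(code[pos]);
    }
    return pos <= len ? pos : 0;
}
```

Hardware that wants to decode specifier k in the same cycle as specifier k-1 has to break that chain somehow, for example by decoding speculatively at every possible byte offset, which is where the transistors would go.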
| Quote: |
Most (but not all) VAX architects feel that the costs of macro-level
parallelism outweighs the benefits; hence this approach is
not being actively pursued."
So it would seem that the designers felt at that time that decode
was a major impediment.
Eric
|
|
|
|
 |
Dan Koren
Guest
|
Posted:
Sun Aug 14, 2005 12:15 am Post subject:
Re: Silly new instructions |
|
|
"John Ahlstrom" <ahlstromjk@comcast.net> wrote in message
news:oYudna_EtM1W0mPfRVn-tw@comcast.com...
| Quote: | John Savard wrote:
On Wed, 10 Aug 2005 20:35:17 +0200, "Peter \"Firefly\" Lund"
<firefly@diku.dk> wrote, in part:
Inspired by the argument from the German-speaking camp that having the
SP be a general-purpose register was a good thing
Having registers like the SP and even the PC be general-purpose
registers is a good thing for one reason - if losing a few GPRs is
compensated for by not having to fill the opcode space with a bunch of
duplicate instructions, and make decoding them more complicated.
John Savard
http://www.quadibloc.com/index.html
I believe using a GPR as SP or PC (or both) was also patented
thus preventing PDP-11 knockoffs. A very considerable benefit.
|
Pretending to act as the devil's advocate:
One could also argue the case that having the SP
and the PC (almost) invisible and inaccessible to
(user mode) software could bring very considerable
benefits.
Just a thought ;-)
dk |
|
|
 |
Bernd Paysan
Guest
|
Posted:
Sun Aug 14, 2005 12:15 am Post subject:
Re: Code density and performance? |
|
|
Andrew Reilly wrote:
| Quote: | Most of that document describes a "fault" of the malloc/free system that
Gnome is using on Linux. A program's heap apparently grows monotonically:
freed memory is never handed back to the system. The suggestion is made
that the allocator should be re-implemented on top of mmap, rather than
sbrk.
|
I do this in bigFORTH. The string pool also has a compactor which lets the
holes bubble up, and when finished, the freed area is released to the
system with munmap (and instantly remapped anonymously again).
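In C terms, the release step might look roughly like the sketch below. This is not bigFORTH's code: release_tail and its parameters are hypothetical, the pool is assumed to be a page-aligned region that came from mmap, and the compaction pass that moves the holes to the top is not shown.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* ask glibc to expose MAP_ANONYMOUS */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* After compaction the live data occupies pool[0 .. used). Give the
   wholly free pages above it back to the system, then map fresh
   anonymous pages at the same addresses so the pool keeps its size.
   Returns 0 on success and -1 on failure. */
int release_tail(char *pool, size_t pool_len, size_t used)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t start = (used + page - 1) / page * page;  /* first wholly free page */

    if (start >= pool_len)
        return 0;                       /* nothing to release */
    size_t len = pool_len - start;

    if (munmap(pool + start, len) != 0)
        return -1;
    /* MAP_FIXED places the new, untouched pages where the old ones were. */
    void *p = mmap(pool + start, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    return p == MAP_FAILED ? -1 : 0;
}
```

The remapped pages cost nothing until they are written to again. In a multithreaded program the gap between munmap and mmap is a race, and a single mmap with MAP_FIXED over the range would replace the old mapping in one step.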
--
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://www.jwdt.com/~paysan/ |
|
|
 |
John Ahlstrom
Guest
|
Posted:
Sun Aug 14, 2005 12:15 am Post subject:
Re: Silly new instructions |
|
|
John Savard wrote:
| Quote: | On Wed, 10 Aug 2005 20:35:17 +0200, "Peter \"Firefly\" Lund"
<firefly@diku.dk> wrote, in part:
Inspired by the argument from the German-speaking camp
that having the SP be a general-purpose register was a good thing
Having registers like the SP and even the PC be general-purpose
registers is a good thing for one reason - if losing a few GPRs is
compensated for by not having to fill the opcode space with a bunch of
duplicate instructions, and make decoding them more complicated.
John Savard
http://www.quadibloc.com/index.html
|
I believe using a GPR as SP or PC (or both) was also patented
thus preventing PDP-11 knockoffs. A very considerable benefit.
JKA
--
"If you can't drink their booze, take their money, and then vote
against them anyway, you don't belong in this game."
L O'Donnell, Jr |
|
|
 |
Peter "Firefly" Lund
Guest
|
Posted:
Sun Aug 14, 2005 1:43 pm Post subject:
Re: Silly new instructions |
|
|
On Sat, 13 Aug 2005, Dan Koren wrote:
| Quote: | One could also argue the case that having the SP
and the PC (almost) invisible and inaccessible to
(user mode) software could bring very considerable
benefits.
|
So you could load/store stuff relative to the SP (and possibly relative to
a base pointer) but you couldn't get the effective address of that stuff?
That wouldn't play well with automatic arrays in C, for example, as they
are typically implemented.
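A small C illustration of the point (function names invented for the example): the array lives in the caller's stack frame, and passing it to another function means materialising its address, which on most implementations is an SP- or frame-pointer-relative effective address held in an ordinary register.

```c
#include <stddef.h>

/* The callee sees nothing but an address and a count. */
static int sum(const int *v, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

int sum_local_array(void)
{
    int a[4] = {1, 2, 3, 4};    /* automatic array, typically in the stack frame */
    return sum(a, 4);           /* `a` decays to &a[0], a stack address */
}
```

If software could not compute that address, the compiler would have to place such arrays somewhere other than the stack, or the architecture would need another way to describe them.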
| Quote: | Just a thought ;-)
|
The idea being that the protection mechanisms would be simpler?
What about other pointers?
-Peter |
|
|
 |
John Savard
Guest
|
Posted:
Sun Aug 14, 2005 4:15 pm Post subject:
Re: Silly new instructions |
|
|
On Sat, 13 Aug 2005 12:53:48 -0700, John Ahlstrom
<ahlstromjk@comcast.net> wrote, in part:
| Quote: | I believe using a GPR as SP or PC (or both) was also patented
thus preventing PDP-11 knockoffs. A very considerable benefit.
|
It was the UNIBUS patents that did in DCC...
John Savard
http://www.quadibloc.com/index.html |
|
|
 |
John Savard
Guest
|
Posted:
Sun Aug 14, 2005 4:15 pm Post subject:
Re: Silly new instructions |
|
|
On Sat, 13 Aug 2005 11:16:54 +0200, "Peter \"Firefly\" Lund"
<firefly@diku.dk> wrote, in part:
| Quote: | That's not a good argument. You advocate wasting the opcode space with
useless operations on the SP and PC instead.
|
I don't advocate making the SP and PC general registers, I was just
pointing out that the arguments in the post I replied to were worse than
the only one I felt had any merit.
In my own take on a computer architecture,
http://www.quadibloc.com/arch/arcint.htm
the PC is not a general register, and I don't even *have* a stack
pointer. Although, if one must have a stack, one can use general
registers to point to it...
actually, in some instruction modes, base register zero, not used as a
base register (zero in the field indicates absolute addressing, as with
the 360) can serve as a stack pointer, because it happens to be lying
around otherwise not good for much.
Of course, interrupts are a little awkward...
John Savard
http://www.quadibloc.com/index.html |
|