Eliot Miranda
Guest
|
Posted:
Thu Nov 03, 2005 1:15 am Post subject:
Re: tagged/boxed arith op frequencies [was interrupting for |
|
|
Terje Mathisen wrote:
| Quote: | Eliot Miranda wrote:
datatype # ops % arith ops % total sends
Float 408 0.014% 0.002%
LargePositiveInteger 1712 0.059% 0.009%
SmallInteger 2907178 99.927% 14.727%
disclaimer>These data are not necessarily representative of any
specific Smalltalk application beyond the VisualWorks IDE
itself.</disclaimer
So about 15% of all sends are arithmetic primitives. In
integer-intensive code about 99.9% of integer operations are on tagged
integers, and about 90% are additions.
Thanks!
There is nothing like a set of measurements when you've been assuming
for a while. :-)
So, according to those stats, being able to handle 30-bit integer adds
with the shortest possible inline code is a big win, even at quite
substantial overhead for LargeInteger operations.
The most important observation would seem to be that a SmallTalk
compiler needs to do the same kinds of optimization as a Java JIT, in
particular it should get rid of _all_ the overhead for all loop indices,
doing the checking and overflow handling outside the inner loop.
By reserving the all-zero tag for SmallInt, you can then compile most of
the code without ever even checking if overflows have happened:
for (i = 0; i < limit; i++) {
array[i] = ...
}
should compile into something like this:
... Check that 'limit' is in range of SmallInt and array[]
; i stored in EDI, limit in ESI
next:
... ; calculation, result in EAX
mov array[EDI],EAX
add EDI,4 ; Increment tagged integer index
cmp EDI,ESI
jl next
I.e. identical code to what a C compiler would generate for the same
loop. The two-bit zero tag means that the tagged integer effectively
works as a scaled index, simplifying addressing further.
|
Right. And mov array[EDI-3],EAX would allow one to use non-zero tags
too. But the issue here is in proving that limit is within the 30-bit
integer range. The code would more realistically be written
1 to: anArray size do:
[:i|
anArray at: i put: ...]
and an adaptive optimizer would attempt to optimize the expression once
it had been executed a fair few times. At that point the run-time
system could deliver the set of classes that had received the size
message because the set would be encoded in the polymorphic inline cache
for the send of size. Let's imagine the cache only contained one entry,
the class Array. Then the compiler could know that for Array, the size
primitive must return an integer from 0 to some maximum size withion the
SmallInteger range, and hence that the type of i was a SmallInteger in
bounds. Then propagating this type information to thge dereference it
could generate the optimised code.
See Urs Hölzle's thesis "Adaptive Optimization for Self: Reconciling
High Performance with Exploratory Programming", available on the web as
http://www.cs.ucsb.edu/labs/oocsb/papers/urs-thesis.html
http://www.cs.ucsb.edu/labs/oocsb/papers/hoelzle-thesis.pdf
http://www.cs.ucsb.edu/labs/oocsb/papers/hoelzle-thesis.ps.Z
--
_______________,,,^..^,,,____________________________
Eliot Miranda Smalltalk - Scene not herd |
|
Eliot Miranda
Guest
|
Posted:
Thu Nov 03, 2005 7:34 am Post subject:
Re: tagged/boxed arith op frequencies [was interrupting for |
|
|
Terje Mathisen wrote:
| Quote: | Eliot Miranda wrote:
Terje Mathisen wrote:
I.e. identical code to what a C compiler would generate for the same
loop. The two-bit zero tag means that the tagged integer effectively
works as a scaled index, simplifying addressing further.
Right. And mov array[EDI-3],EAX would allow one to use non-zero tags
I would still prefer a zero tag for SmallInt since it is responsible for
such a large percentage of all operations:
|
Except that the data shows that sends to ordinary objects far outweigh
operations on SmallIntegers, hence using 0 for ordinary objects may have
an advantage.
| Quote: |
This would seem to simplify a few other operations, since you don't have
to handle the issue of tag bits wrapping into the data itself.
OTOH, I guess you could fix this in most (all?) situations with
carefully chosen bias handling...
ADD -> sum = a + b - tag -> lea esi,[eax+ebx-tag]
SUB -> sum = a - b + tag -> lea esi,[eax+tag] / sub esi,ebx
|
Exactly
| Quote: | Multiplication and division seems a lot harder though, you'd have to get
rid of the tags first, do the operation, rescale and replace the tag.
Is that the best way to handle it, or is ther some clever trick that I
didn't notice?
|
Not that I know of. Zero tags allow one to shift only one operand on
multiplication:
4 * (a * b) == (a * 4 * b)
but this doesn't work for rounding division
--
_______________,,,^..^,,,____________________________
Eliot Miranda Smalltalk - Scene not herd |
|