| Author |
Message |
Ray Andraka
Guest
|
Posted:
Thu Dec 22, 2005 9:15 am Post subject:
Re: Place and Route Algorithms: where's the fat? |
|
|
Austin Lesea wrote:
| Quote: |
It is true that "significant gains" are still (often, not always)
realized by some careful hand placement, and some careful partitioning.
That suggests that the design languages lack something important, as the
intent of the designer is not being communicated to the tools.
.... |
| Quote: | The software folks here at Xilinx are amazing: they have managed to
improve every generation the performance while reducing the time to
compile the designs; all the while we in IC design follow Moore's law
and double the density. Not to mention we add more custom IP, and
customers are getting more demanding.
|
Austin,
What is missing is geographic relationships between parts of the
circuit. Perhaps the biggest piece missing in the current tools is
utilization of the hierarchy in a design. The current Xilinx tools
flatten the design before they even start on the place and route
problem, and that greatly increases the workload and time to complete
while also degrading performance. The tools have an opportunity to use
the hierarchy in the design to treat each hierarchical layer as a
mini-design, essentially breaking the problem into smaller problems in a
way that is consistent with the way the designer broke up the design.
Going to a true hierarchical place and route would improve both the
quality of results as well as the run times.
Now, I do disagree with your assertion that each generation of the tools
improves both run time and quality of results. I have indeed seen
improvements in run time, but more often than not the quality of results
has taken a step backwards with each major release of ISE. Yes, I
suppose for flat RTL only designs, the results have gotten somewhat
better, but that is mostly due to large improvements in synthesis, and
small incremental improvements in the automatic placement (which BTW,
still does a dismal job with placing non-LUT resources, with placing
data paths, and with placing logic that has multiple levels of LUTs
between FFs). In the meantime, the routing algorithms have gotten
lazy, apparently in the interest of speeding up run times. For designs
with poor placement, the effects of poor routing are not as apparent as
they are for well placed (e.g. carefully floorplanned) designs. For my
floorplanned designs, I have seen a steady erosion in the max speeds
attainable by the tools on each new release since 4.1.
One of the biggest steps backward came from eliminating delay based
clean-up (IIRC that happened in 5.2). The result there is that the
tools just stop as soon as all routes meet timing. Every route in the
design is potentially a critical route. The routes to nearby
destinations often take circuitous routes that congest the routing
resources and unnecessarily drive the power dissipation up considerably.
With the current emphasis on power dissipation, I would think that the
Xilinx team would be looking at reinstituting the delay based clean-up.
Based on my empirical observations, that could pick up a 15-20%
improvement in power dissipation for designs that are clocked in the
upper half of the performance envelope. |
|
|
 |
Austin Lesea
Guest
|
Posted:
Thu Dec 22, 2005 5:15 pm Post subject:
Re: Place and Route Algorithms: where's the fat? |
|
|
Jim,
Some comments,
Austin
-snip-
| Quote: | Austin, perhaps if you used engineering measurements for SW results,
rather than the words like "wizards" and "magic", then the SW might have
a chance to really improve with each release ?
|
The software group has a very rigorous quality of results metrics
(measurement) system for evaluating their work. I get to use the
superlatives, they do not.
| Quote: | I did wonder how Altera suddenly found power savings in SOFTWARE -
|
We still beat them on power, ask your FAE for the presentation. They
took a really lousy situation and made it just plain lousy. We still
have a 1 to 5 watt advantage, AFTER they run their power cleanup.
| Quote: | Given the enormous investment the companies claim, these field results
seem rather abysmal - seems the HW is carrying the SW ?.
|
Rather, the software is now (often) carrying the hardware. Very hard to
get the latest technology to be any faster than the previous one,
without architecture and software. If the software buys a speed grade,
that is all the customer cares about. The silicon gets less expensive
with the shrink to the next generation. Who cares if the performance
came from software, hardware, or both?
| Quote: | Still, it does seem there is indeed a lot of 'fat' in Place & Route SW,
so we can expect further 'double digit improvement' claims.... :)
|
I agree. Until the tools do a better job than a room full of experts,
the tools are just not 'there'. Reminds me of compilers for high level
languages many years ago: there was a time I could write assembly code
that was faster, better, smaller, than any compiled high level language
(anyone recall PLM86?). Then after a while, the compilers got better
and better. Until finally, I had to agree that all that work was not
worth it: often the compiler yielded a better solution than my
hand-written assembly code. |
|
|
 |
Austin Lesea
Guest
|
Posted:
Thu Dec 22, 2005 5:15 pm Post subject:
Re: Place and Route Algorithms: where's the fat? |
|
|
Ray,
Some comments,
Austin
-snip-
| Quote: | What is missing is geographic relationships between parts of the
circuit. Perhaps the biggest piece missing in the current tools is
utilization of the hierarchy in a design.
|
As I said, there is a lot of room for improvement here. You are
assuming that the hierarchy is well done, and that the results from
working on each piece alone will do better. Just don't know if that is
true. Good area for work, I would agree.
| Quote: | Now, I do disagree with your assertion that each generation of the tools
improves both run time and quality of results.
|
I have to differ here. I understand your issues, but if we deal with
the ever expanding "standard suite" of test designs with better
performance, and better run time, I have to assert that it is better.
Is everything better? Of course not.
| Quote: | One of the biggest steps backward came from eliminating delay based
clean-up (IIRC that happened in 5.2).
|
I happen to agree with you here, my personal opinion is that the tools
should allow you to choose to go to the extra effort to find the best
paths, and not stop as soon as the aggregate requirements are met (or
stop and give up if it can't meet the requirements). I think you will
appreciate that what was done did provide for a much faster time to get
the design.
We do make the parts bigger every generation, and you may have noticed,
processor power is not keeping up anymore. |
|
|
 |
Ray Andraka
Guest
|
Posted:
Fri Dec 23, 2005 12:16 am Post subject:
Re: Place and Route Algorithms: where's the fat? |
|
|
Jim Granville wrote:
| Quote: |
Yikes!
One wonders how _CAN_ SW make a carefully floorplanned design go
backwards ? By how much ?
Is that the lazy routing, being so bad, it actually finds a longer
path, than earlier SW ?
|
Enough that a design that passed timing with the earlier tools
will not pass timing no matter what you do with the newer tools short of
hand routing it. About a 10% average loss in performance with each major
revision. There was a huge hit going to 5.2. 7.1 seems to have a much
smaller degradation from 6.3.
Yes, the routing got lazy so that it actually finds a longer path than
it did with earlier software. Quite often, it will not find the direct
connection to a neighboring cell, and instead routes it all over the
place, which adds delay, increases power consumption, and congests the
routing resources so that other nets also get a circuitous route so that
the overall timing is even further degraded.
| Quote: | I did wonder how Altera suddenly found power savings in SOFTWARE -
perhaps they now do exactly this, clean up messy, but timing legal,
routes ? Anyone in Altera comment ?
|
From what I understand, Altera is moving toward more delay based
clean-up. Xilinx has moved away from it, and is instead pursuing
capacitance based clean-up to reduce the power...which not only may miss
the mark, but also requires toggle rate information for each net. |
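To illustrate why a capacitance based clean-up needs toggle rates: the usual first-order model for switching power on a net is 0.5 * (toggles per clock) * C * Vdd^2 * f. The Python sketch below uses that model with made-up net names, capacitances, voltage and clock (none of these numbers come from the tools or from a real design) to show that trimming the same capacitance pays off very differently on a busy net and on a quiet one.

```python
def dynamic_power_watts(toggle_rate, capacitance_f, vdd, f_clk_hz):
    """First-order switching power: 0.5 * toggles per clock * C * Vdd^2 * f."""
    return 0.5 * toggle_rate * capacitance_f * vdd ** 2 * f_clk_hz

# Hypothetical nets: (toggles per clock, routed capacitance in farads).
nets = {
    "busy_data_bit": (0.5, 2.0e-12),
    "quiet_config_bit": (0.001, 2.0e-12),
}

VDD = 1.2       # volts, assumed core voltage
F_CLK = 200e6   # hertz, assumed clock frequency

def saving_from_trim(name, trimmed_capacitance_f):
    """Power saved on one net if its route loses the given capacitance."""
    toggle_rate, capacitance_f = nets[name]
    before = dynamic_power_watts(toggle_rate, capacitance_f, VDD, F_CLK)
    after = dynamic_power_watts(
        toggle_rate, capacitance_f - trimmed_capacitance_f, VDD, F_CLK
    )
    return before - after

for name in nets:
    print(name, saving_from_trim(name, 0.5e-12))
```

Without per-net toggle rates the tool cannot tell these two nets apart, which is the extra input the capacitance based approach depends on.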
|
|
 |
Ray Andraka
Guest
|
Posted:
Fri Dec 23, 2005 12:29 am Post subject:
Re: Place and Route Algorithms: where's the fat? |
|
|
Austin,
From what I have seen, folks who use hierarchy generally do a decent
job of it. You really have to work hard at making a hierarchical design
worse than a flat design. Hierarchy puts organization in the design, and
because crossing levels of hierarchy is a little bit painful, it forces
the designer to think in terms of components and to group related stuff
together. Even in a poor example of hierarchy, there is at least a
little bit of grouping done, and therefore information the tools can
use. I and others have been asking for hierarchical tools from Xilinx
for close to 15 years. I honestly don't think Xilinx understands why
using hierarchy is a good thing.
Austin Lesea wrote:
| Quote: | I have to differ here. I understand your issues, but if we deal with
the ever expanding "standard suite" of test designs with better
performance, and better run time, I have to assert that it is better. Is
everything better? Of course not.
|
Fine, but improvements shouldn't break existing designs. Nearly every
single one of my designs over the past 5 years gets better results with
the tool that was current at the start of the project than it does with
later versions of the tools. I could accept a low rate of recidivism,
but close to 100% is criminal. I know I am not the only "power user"
running into this, in fact it regularly comes up as a subject here at
each major release of the tools.
| Quote: | stop and give up if it can't meet the requirements). I think you will
appreciate that what was done did provide for a much faster time to get
the design.
|
Ummm, well no. The tools give you faster time to completion for a run
through the tools, but that doesn't help if the design does not meet the
timing requirements. It actually takes longer to complete a design
because you need to iterate on the place and route much more than when
there was a predictable routing solution for a good placement. Faster
completion in the tools does not equal faster time to get the design done.
| Quote: | We do make the parts bigger every generation, and you may have noticed,
processor power is not keeping up anymore.
|
Yup, and hierarchy can help you tremendously here. Routing complexity
(and therefore effort needed) goes up with roughly the square of the
device size k, measured in LUTs, primitives, cells etc. By breaking it
down into N hierarchical sub-assemblies, you end up with N smaller
problems of size k/N each. Each of those costs roughly (k/N)^2, so the
total effort is about k^2/N, which is smaller than k^2. |
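The scaling argument above can be checked with a few lines of Python. This is only a sketch under the post's own assumption that effort grows with the square of problem size; the design size `k` is a hypothetical number, and the cost of stitching the blocks together at the top level is ignored.

```python
def flat_effort(k):
    """Effort for one flat problem of k cells, assuming effort ~ k^2."""
    return k ** 2

def hierarchical_effort(k, n):
    """Effort for n independent blocks of k/n cells each, same assumption."""
    block = k / n
    return n * block ** 2  # algebraically k^2 / n

k = 40_000  # hypothetical design size in LUTs
for n in (1, 4, 16):
    ratio = hierarchical_effort(k, n) / flat_effort(k)
    print(f"{n} blocks: {ratio:.4f} of the flat effort")
```

Under these assumptions the ratio is simply 1/N, so the saving grows with the number of blocks until the ignored top-level routing starts to dominate.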
|
|
 |
Jim Granville
Guest
|
Posted:
Fri Dec 23, 2005 1:16 am Post subject:
Re: Place and Route Algorithms: where's the fat? |
|
|
Austin Lesea wrote:
| Quote: | Jim,
Some comments,
Austin
-snip-
Austin, perhaps if you used engineering measurements for SW results,
rather than the words like "wizards" and "magic", then the SW might
have a chance to really improve with each release ?
The software group has a very rigorous quality of results metrics
(measurement) system for evaluating their work. I get to use the
superlatives, they do not.
|
Maybe they could ask Ray for some examples, so they can _find_ the
cases where the tools go backwards - instead of burying that in the
nonsense of statistical averages ?
Heck, they might even find that ALL designs benefit from the code
cleanup ?!
-jg |
|
|
 |
Jim Granville
Guest
|
Posted:
Fri Dec 23, 2005 1:16 am Post subject:
Re: Place and Route Algorithms: where's the fat? |
|
|
Ray Andraka wrote:
| Quote: | Jim Granville wrote:
Yikes!
One wonders how _CAN_ SW make a carefully floorplanned design go
backwards ? By how much ?
Is that the lazy routing, being so bad, it actually finds a longer
path, than earlier SW ?
Enough that a design that passed timing with the earlier tools
will not pass timing no matter what you do with the newer tools short of
hand routing it. About a 10% average loss in performance with each major
revision. There was a huge hit going to 5.2. 7.1 seems to have a much
smaller degradation from 6.3.
Yes, the routing got lazy so that it actually finds a longer path than
it did with earlier software. Quite often, it will not find the direct
connection to a neighboring cell, and instead routes it all over the
place, which adds delay, increases power consumption, and congests the
routing resources so that other nets also get a circuitous route so that
the overall timing is even further degraded.
|
Is the direct connection lost, because something else uses that path,
or could something as trivial as a 'length optimise' pass fix this ?
It does not sound like floor-planned nets, are being given first-bite
at the resource, either....
This does not sound like rocket science to fix, and if I were SW
quality manager, I would place this issue at the top of the "very rigorous
quality of results metrics" Xilinx supposedly uses....
Seems the Xilinx mindset thinks if things 'on average' are better,
those cases where it goes backwards, don't really matter...
| Quote: | I did wonder how Altera suddenly found power savings in SOFTWARE -
perhaps they now do exactly this, clean up messy, but timing legal,
routes ? Anyone in Altera comment ?
From what I understand, Altera is moving toward more delay based
clean-up. Xilinx has moved away from it, and is instead pursuing
capacitance based clean-up to reduce the power...which not only may miss
the mark, but also requires toggle rate information for each net.
|
Once the tools have met timing, wouldn't a simple length reduction
(which the place tools DO know) be a fast and efficient way to clean up
the lazy nets ? Length should correlate pretty well with delay and
capacitance...
Users would tolerate if this power task took longer, even a weekend run
'shaker algorithm' - when the code is only nearly working, power is less
of a concern :)
-jg |
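A length reduction pass of the kind suggested here could look roughly like the Python sketch below. It is a toy under loud assumptions: a uniform grid stands in for the routing fabric, every net has two pins, each grid cell can carry only one net, and timing is not re-checked after a reroute (a real pass would have to confirm the design still meets timing). The function names are hypothetical and say nothing about how the Xilinx or Altera routers work internally.

```python
from collections import deque

def shortest_path(start, goal, blocked, width, height):
    """Breadth-first search on a grid; returns a list of cells or None."""
    prev = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = prev[cell]
            return path[::-1]
        x, y = cell
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if (0 <= nxt[0] < width and 0 <= nxt[1] < height
                    and nxt not in blocked and nxt not in prev):
                prev[nxt] = cell
                queue.append(nxt)
    return None

def length_cleanup(routes, width, height):
    """Rip up and reroute each net; keep the new route only if it is shorter."""
    routes = dict(routes)

    def excess(name):
        # Routed length minus the Manhattan distance between the two pins.
        path = routes[name]
        (x0, y0), (x1, y1) = path[0], path[-1]
        return (len(path) - 1) - (abs(x0 - x1) + abs(y0 - y1))

    # Visit the most wasteful nets first.
    for name in sorted(routes, key=excess, reverse=True):
        path = routes[name]
        blocked = {c for other, p in routes.items() if other != name for c in p}
        candidate = shortest_path(path[0], path[-1], blocked, width, height)
        if candidate is not None and len(candidate) < len(path):
            routes[name] = candidate
    return routes

# Net "a" connects two neighbouring cells but was routed the long way round.
routed = {
    "a": [(0, 0), (0, 1), (1, 1), (1, 0)],
    "b": [(0, 2), (1, 2), (2, 2)],
}
cleaned = length_cleanup(routed, width=4, height=4)
print(cleaned["a"])
```

Because a route is only ever replaced by a strictly shorter one, the pass cannot increase total wire length, and it could be left running for as long as the user is willing to wait.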
|
|
 |
Andy Peters
Guest
|
Posted:
Fri Dec 23, 2005 1:16 am Post subject:
Re: Place and Route Algorithms: where's the fat? |
|
|
dp wrote:
| Quote: | The purpose of high level languages (for logic generation or
writing software) is to allow cheaper programming,
|
I thought it was to allow skilled engineers to accomplish more, to
allow re-use and to ease verification. I stand corrected.
| Quote: | the loss
factor I have witnessed has varied between 10 and >1000.
|
You might wish to provide some supporting details.
| Quote: | If one has the resources to do things at a lower level,
this is always the better choice. It does not take longer,
it does not cost more text (except for very low complexity
works where this is a non-issue anyway), it "only" takes
more skills.
|
You are shitting me. Do you think you can implement, say, an ethernet
stack in assembly code faster than someone can do the same job in C?
By "implement," I don't mean "write a bunch of code." Rather, I mean,
"debug and verify."
| Quote: | No translating tool can replace direct access
to the programmed hardware.
|
True, assuming more time is available to do the job.
-a |
|
|
 |
Andy Peters
Guest
|
Posted:
Fri Dec 23, 2005 1:16 am Post subject:
Re: Place and Route Algorithms: where's the fat? |
|
|
Ray Andraka wrote:
| Quote: | From what I have seen, folks who use hierarchy generally do a decent
job of it. You really have to work hard at making a hierarchical design
worse than a flat design. Hierarchy puts organization in the design, and
because crossing levels of hierarchy is a little bit painful, it forces
the designer to think in terms of components and to group related stuff
together. Even in a poor example of hierarchy, there is at least a
little bit of grouping done, and therefore information the tools can
use. I and others have been asking for hierarchical tools from Xilinx
for close to 15 years. I honestly don't think Xilinx understands why
using hierarchy is a good thing.
|
C'mon, they don't understand why a hardware engineer would want to use
revision control or automated building (Makefiles) for designs. If
they did, the tools wouldn't spit files all over the place, and there
wouldn't be lossage like one tool requiring the part type given as
xc2s100e-ft256-6 and another needing the same info as xc2s100e-6-ft256.
Arrrrrgh ...
-a |
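In a scripted build, the part-name mismatch described above is the sort of thing that ends up papered over with a small helper. The Python sketch below is hypothetical: the function name is invented, it assumes the part string always has exactly three dash-separated fields, and which tool expects which ordering is not stated in the post.

```python
def swap_speed_and_package(part):
    """Swap the last two dash-separated fields of a part string.

    'xc2s100e-ft256-6' <-> 'xc2s100e-6-ft256'
    """
    device, first, second = part.split("-")
    return f"{device}-{second}-{first}"

print(swap_speed_and_package("xc2s100e-ft256-6"))
```

Applying the function twice returns the original string, so one helper covers both directions.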
|
|
 |
dp
Guest
|
Posted:
Fri Dec 23, 2005 1:16 am Post subject:
Re: Place and Route Algorithms: where's the fat? |
|
|
| Quote: | You are shitting me. Do you think you can implement, say, an ethernet
stack in assembly code faster than someone can do the same job in C?
By "implement," I don't mean "write a bunch of code." Rather, I mean,
"debug and verify."
|
Faster than C - yes. Assembly - which one do you mean? There are
worlds of differences between various assembly languages.
I personally use VPA (Virtual Processor Assembly) which I have
evolved over the years originating from the 68020 assembly. Today,
it may be more appropriate to call it a compiler - whatever you call
it, it makes me a lot more efficient than those who try to write
the same things in C. About a year ago I did a tcp/ip implementation,
it took me < 6 months to do it, starting with ppp all the way to
tcp through ip, clean uncompromised code - and
including the DNS service, ftp, smtp, various utilities etc.
About 150 source files, somewhat less than 2M
of text - debugged and everything. It does take advantage of the
environment it is running in, of course, which is written using the
same tools or their predecessors (all this on a PPC platform).
Does that answer your question?
| Quote: | The purpose of high level languages (for logic generation or
writing software) is to allow cheaper programming,
I thought it was to allow skilled engineers to accomplish more, to
allow re-use and to ease verification. I stand corrected.
|
This is what most people believe - wrongly.
| Quote: | No translating tool can replace direct access
to the programmed hardware.
True, assuming more time is available to do the job.
|
Not necessarily. It does take less time when I am doing
the job, sometimes it has taken me less time to develop
some tooling and use it to do the job.
Dimiter
------------------------------------------------------
Dimiter Popoff Transgalactic Instruments
http://www.tgi-sci.com
------------------------------------------------------ |
|
|
 |
Jeremy Stringer
Guest
|
Posted:
Fri Dec 23, 2005 1:16 am Post subject:
Re: Place and Route Algorithms: where's the fat? |
|
|
Phil Hays wrote:
| Quote: | On Wed, 21 Dec 2005 22:44:22 GMT, "John_H" <johnhandwork@mail.com>
wrote:
My opinion is that the process of mapping separate from place & route is
archaic (to use kind words) and that spreading the logic out so each slice
has just one LUT is *not* the way to alleviate the problem.
Yes. Xilinx has added "map -timing" to do just that. Mapping logic
is now with placement, and the result works rather better.
|
<Shrug> I've seen -timing break a design's timing badly. (This was a
design that for some reason P&R'd a lot better at an effort level of
'medium' rather than 'high');
Jeremy |
|
|
 |
Marc Randolph
Guest
|
Posted:
Fri Dec 23, 2005 9:15 am Post subject:
Re: Place and Route Algorithms: where's the fat? |
|
|
John_H wrote:
| Quote: | [...]
The tools *can* do so much more; the evolutionary development of the tools
has hampered true progress. The silicon is *amazing* in what can be
accomplished.
|
Agreed completely.
| Quote: | "Pushing the rope" to improve results with synthesis is bad
enough. Having place & route software that can't understand what it takes
to produce good results every time is sad. I can pass with total timing
compliance then lose by 1.5 nanoseconds after changing non-critical logic.
I prefer not to curse my tools.
|
Agreed completely (including Ray's points about newer versions not
being necessarily better than older ones).
To add a few datapoints behind John's comment about losing 1.5
nanoseconds after changing non-critical logic, we have been fighting
this type of thing at least every month (often more frequently), for
three years now. Always using the latest software version available
at the time, in all of our larger and/or higher speed designs (2V3000,
2VP7, 2VP40, and now LX25 designs), we have come to expect that it will
take many runs to stumble upon one that meets timing when we make
truly trivial changes (I'm talking about things that would make the
term "non-critical changes" look like massive changes... stuff like
changing a version ID or fixing an inversion problem going to a LUT).
I am not kidding or exaggerating in any way when I say that it's an
event worthy of commenting on to coworkers and minor celebration when a
change is made to one of the above designs and it meets timing the
first or second try. That just should not be the case.
BTW, the designs were done by different people with different styles.
The only thing in common is that they are all hierarchical, and they
all tend to use up a fair amount of the device (LUT utilization between
75 and 91%).
Marc |
|
|
 |
Piotr Wyderski
Guest
|
Posted:
Tue Dec 27, 2005 11:37 pm Post subject:
Re: Place and Route Algorithms |
|
|
marco wrote:
| Quote: | Does anyone know where I can find information about the place and
route algorithms used for FPGAs
|
Globally optimal placement and routing is an NP-hard problem even for
multilayer PCBs, so all you can expect is a set of approximations if
you want to get a placement in a reasonable time.
Best regards
Piotr Wyderski
--
"If you were plowing a field, which would you rather use?
Two strong oxen or 1024 chickens?" -- Seymour Cray |
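One classic family of such approximations is simulated annealing, which the academic VPR placer uses, for example. The Python sketch below is a toy version under stated assumptions: cells sit on a small grid, all nets have two pins, cost is plain Manhattan wire length, and the cooling schedule and step count are arbitrary choices. It does not describe what any vendor tool actually does.

```python
import math
import random

def wirelength(placement, nets):
    """Sum of Manhattan distances over two-pin nets."""
    total = 0
    for a, b in nets:
        (xa, ya), (xb, yb) = placement[a], placement[b]
        total += abs(xa - xb) + abs(ya - yb)
    return total

def anneal(placement, nets, steps=5000, t_start=5.0, t_end=0.01, seed=0):
    """Swap two cells at a time; accept worse moves with probability exp(-delta/T)."""
    rng = random.Random(seed)
    current = dict(placement)
    cost = wirelength(current, nets)
    best, best_cost = dict(current), cost
    cells = list(current)
    for step in range(steps):
        # Geometric cooling from t_start towards t_end.
        temp = t_start * (t_end / t_start) ** (step / steps)
        a, b = rng.sample(cells, 2)
        current[a], current[b] = current[b], current[a]
        new_cost = wirelength(current, nets)
        delta = new_cost - cost
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cost = new_cost
            if cost < best_cost:
                best, best_cost = dict(current), cost
        else:
            current[a], current[b] = current[b], current[a]  # undo the swap
    return best, best_cost

# Hypothetical 3x3 example with four nets joining cells placed far apart.
cells = [f"c{i}" for i in range(9)]
slots = [(x, y) for x in range(3) for y in range(3)]
start = dict(zip(cells, slots))
nets = [("c0", "c8"), ("c1", "c7"), ("c2", "c6"), ("c3", "c5")]
best, best_cost = anneal(start, nets)
print(wirelength(start, nets), "->", best_cost)
```

The result is not guaranteed to be optimal; the point of the method is to get a good placement in a bounded amount of time, and running longer or with a slower cooling schedule trades run time for quality.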
|