In this blog, I'm starting a mini-series on the place-and-route (P&R) software so critical to creating a working FPGA. Recently, I discussed the history of place-and-route software and algorithms with Sinan Kaptanoglu, a Microsemi fellow and chief FPGA fabric architect.
Sinan has many years of experience defining a variety of FPGA fabric architectures and their associated place-and-route algorithms. We started out talking about the key innovations that have taken us from waiting overnight for place-and-route results to just watching a short YouTube video of a pro StarCraft 2 game (typically around 15 minutes for those of you who aren't up to date on the eSports phenomenon).
Remember low utilization and long place-and-route times?
Sinan and I started out talking about how, in the early days of designing with FPGAs, getting the design to close (successfully route all the signals and meet some simple timing requirements) was a very big challenge -- both for the FPGA manufacturer and the customer. FPGA manufacturers had to trade off device capacity, place-and-route time, and the effort ("tweaking") required of the customer to optimize the design for the target architecture. Customers needed to minimize their design time, keep device utilization high (or be forced to buy a bigger, sometimes much more expensive device to fit their design), and meet their timing constraints.
Even architectures with plentiful routing resources, like anti-fuse devices, had difficulty with I/O placement. (Placing I/O signals at "problem" locations would make it difficult to successfully route signals and meet timing, since many "long lines" would be required to get signals back and forth across the chip. Long lines were a limited resource, since they completely occupied the routing channel.)
As FPGA manufacturers struggled to improve place-and-route algorithms, devices were growing dramatically in capacity. Unfortunately, the time required to successfully place and route a design grew with the capacity of the device. Even more unfortunately, this relationship was not simply linear. As device capacity increased, eventually a place-and-route "mountain" was reached where run times increased dramatically. FPGA manufacturers were in a real bind.
This problem was difficult enough that academic researchers took it up as a challenge, and many improvements to place-and-route algorithms were proposed. One of them, first published in the mid-90s, "Pathfinder: A Negotiation-Based Performance Driven Router for FPGAs," by L. E. McMurchie and Carl Ebeling (reprinted in the ACM International Symposium on Field Programmable Gate Arrays, 1995, page 111), outlined a completely new approach for the place-and-route algorithms used in FPGAs.
As described in this paper's abstract:
This paper presents PathFinder, a router that balances the goals of performance and routability. PathFinder uses an iterative algorithm that converges to a solution in which all signals are routed while achieving close to the optimal performance allowed by the placement. Routability is achieved by forcing signals to negotiate for a resource and thereby determine which signal needs the resource most. Delay is minimized by allowing the more critical signals a greater say in this negotiation.
The key approach used by Pathfinder is to "over assign" signals to routing resources, temporarily letting multiple signals share a single wire. The algorithm then "negotiates" the extra signals off the "congested" resource and eventually places them on other resources (ones with perhaps longer delays).
These negotiations are performed systematically: the cost of using a "congested" resource is gradually increased until the less critical signals find other paths (usually with longer delays) and the "congestion" is eliminated. Because Pathfinder starts with a fully routed network (even if many signals are using a single resource), it offered a much more manageable and predictable approach for resource-constrained FPGA architectures.
Take a look at the image below. Can you route "S1" to "D1," "S2" to "D2," and "S3" to "D3" with a minimal cost? Pathfinder can. (Hint -- start with all signals using node "B" and gradually increase the cost of using "B" on the congested signals to see what happens.)
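If you would rather try the hint in code than on paper, here is a minimal Python sketch of negotiated congestion. The graph is a hypothetical stand-in for the figure (three sources, three shared middle nodes "A", "B", and "C", three sinks), and the base costs, the names `EDGES`, `BASE`, `NETS`, and `route_all`, and the penalty schedule are all assumptions made for illustration. It also leaves out the timing-criticality weighting described in the paper, so treat it as a sketch of the idea rather than the published algorithm. Each node's cost is its base cost plus a history term, multiplied by a penalty for sharing the node with other signals.

```python
import heapq

# Hypothetical routing graph (not the article's figure): directed edges.
EDGES = {
    "S1": ["A", "B"], "S2": ["A", "B", "C"], "S3": ["B", "C"],
    "A": ["D1", "D2"], "B": ["D1", "D2", "D3"], "C": ["D2", "D3"],
    "D1": [], "D2": [], "D3": [],
}
# Assumed base cost (delay) of each node; "B" is the tempting shortcut.
BASE = {"S1": 0, "S2": 0, "S3": 0, "A": 2, "B": 1, "C": 2,
        "D1": 0, "D2": 0, "D3": 0}
NETS = {"sig1": ("S1", "D1"), "sig2": ("S2", "D2"), "sig3": ("S3", "D3")}


def shortest_path(src, dst, node_cost):
    """Dijkstra where the cost of a path is the sum of the nodes entered."""
    best, prev, heap = {src: 0.0}, {}, [(0.0, src)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == dst:
            break
        if dist > best[node]:
            continue  # stale heap entry
        for nxt in EDGES[node]:
            cand = dist + node_cost(nxt)
            if cand < best.get(nxt, float("inf")):
                best[nxt], prev[nxt] = cand, node
                heapq.heappush(heap, (cand, nxt))
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


def occupancy(routes):
    """Count how many routed signals use each node."""
    counts = {}
    for path in routes.values():
        for node in path:
            counts[node] = counts.get(node, 0) + 1
    return counts


def route_all(nets, max_iters=20, hist_step=1.0, pres_step=0.5):
    history = {node: 0.0 for node in EDGES}
    routes = {}
    for iteration in range(1, max_iters + 1):
        # No sharing penalty on the first pass, so every signal grabs its
        # cheapest path; the penalty then grows with each iteration.
        pres_fac = pres_step * (iteration - 1)
        for name, (src, dst) in nets.items():
            routes.pop(name, None)      # rip up this signal only
            used = occupancy(routes)    # nodes held by the other signals

            def cost(node):
                sharing = 1.0 + pres_fac * used.get(node, 0)
                return (BASE[node] + history[node]) * sharing

            routes[name] = shortest_path(src, dst, cost)
        overused = [n for n, c in occupancy(routes).items() if c > 1]
        if not overused:
            return routes, iteration    # every node has a single user
        for node in overused:
            history[node] += hist_step  # remember past congestion
    raise RuntimeError("no legal routing found")


if __name__ == "__main__":
    routes, iterations = route_all(NETS)
    print(iterations, routes)
```

On this toy graph, all three signals pile onto "B" in the first pass. In the second pass, the rising cost of "B" pushes "S1" onto "A" and "S2" onto "C", which leaves "B" free for "S3". A different graph, signal order, or penalty schedule could settle differently or take more passes.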
FPGA manufacturers discovered that this algorithm works much better than the algorithms used previously -- ones that tried to route incomplete signal networks by "ripping up" a signal and "replacing" it with another. These trial and error approaches were notorious for working for a long time without finding a solution. Pathfinder provided a much more systematic approach and, by 2000, the Pathfinder technique of "negotiated congestion" was being used by virtually all FPGA place-and-route algorithms.
Sinan explained that with a good algorithm in hand, the FPGA architecture people could now "tune" the FPGA fabric more easily for timing-driven place-and-route. Previously, delay time could swing wildly based on the routing resources used. Some signals required multiple pass gates, and if the signal's delay was too long, a buffer could be inserted to try and improve matters. This type of architecture was silicon efficient, but large delay variations would dramatically reduce utilization.
A new, more silicon-intensive approach that used fully buffered signals started being explored. In this fabric, every wire has only one signal source. This is less flexible and costs more silicon per wire, but it made timing-driven place-and-route much more predictable and dramatically increased utilization. Because so much more of each device could actually be used, the resulting architecture turned out to be more silicon-efficient overall, and it met timing much more predictably.
So, the next time you run an FPGA compile chain from a high-level language through synthesis and the "back-end" FPGA tools, give a nod to the place-and-route algorithm that is doing much of the "heavy lifting." Now if we could just find a way to use negotiated congestion to solve our evening highway commute problems (perhaps via mass transit?), life would be even better.
Do you have a favorite FPGA place-and-route horror story you want to share? Do you have any tricks you've used to successfully route your designs (like borrowing all the computers in the office and running your design over the weekend to try and get it to close)? Please post your comments below. And don't forget to tune in next week when we move on to consider today's challenges for FPGA place-and-route.