In this article, we welcome our first guest writer. Tryggve Mathiesen, CTO at InformAsic, writes about the challenges in the market and how the Xilinx Zynq-7000 fits into the future of high-performance electronics.
Embedded Designers are asking for more
- More than a processor delivers…
- More than an ASIC or ASSP delivers…
- More than an FPGA delivers…
The programmable imperative is great, but the challenge in an FPGA for volume applications is to deliver the capacity at the power and cost of the ASIC and ASSP.
So, what markets and applications does the new architecture target? While customers can and will find uses for this new architecture in most markets and many applications,
we are focusing initially on a set of top and secondary markets.
The top markets and applications are:
- Automotive systems:
The top application within automotive systems is driver assistance.
This application is a showcase for the new architecture. It needs performance in both the CPU and the programmable logic for real-time video processing, it faces form-factor and system power constraints, its algorithms are still maturing and so require hardware upgradability, OEMs need to differentiate, and the platform must scale from low-end to high-end cars.
These are just some of the attributes that this architecture addresses.
- Industrial applications:
These include imaging (smart surveillance and machine vision) and industrial command/control and automation.
Industrial imaging often has similar characteristics to driver assistance. Command/control and automation need a variety of networking standards, bridging, custom functions with real-time requirements, etc.
- Wireless infrastructure:
The top applications are enterprise femto, remote radio heads and baseband processing.
These are looking for an integrated solution that lowers cost and system power and increases performance.
The secondary markets and applications are:
- Aerospace and defense, where applications include targeting, guidance and navigation, surveillance, and secure communications.
- Audio-video broadcasting: prosumer, professional and studio cameras, routers and switchers.
- Consumer, where we anticipate a variety of higher-end applications. This includes business-class multi-function printers, and we’re getting lots of good acceptance from these companies.
- Medical: imaging and various diagnostic, monitoring and therapy devices.
- Wired communications, where applications such as edge cards have shown good interest in this architecture.
For both the top and the secondary markets, however, we need to sort out how these applications are mapped onto the hardware platform:
Why Use Processors
Microcontrollers (µC) and Microprocessors (µP) Provide a Higher Level of Design Abstraction. Most µC functions can be implemented using VHDL or Verilog – the downsides are parallelism and complexity. Using C/C++ abstraction and serial execution makes certain functions much easier to implement in a µC.
A Microprocessor (µP) is just one component of many in a complex system of digital and analog I/O.
Most simple system components are contained completely within a Microcontroller (µC)
Discrete µCs are Inexpensive and Widely Used
µCs have years of momentum and software designers have vast experience using them
Hardware is Measured in MHz and Mbps. While higher MHz is good, bus bandwidth (Mbps) is typically a more important system characteristic
Processors are Measured in MIPS, DMIPS, etc. Millions of instructions per second (MIPS) or Dhrystone MIPS (DMIPS) are commonly used to specify software performance. The DMIPS of a processor core depends on the core’s efficiency (DMIPS/MHz) and its clock frequency. MCUs are typically < 1.0 DMIPS/MHz while µPs are typically > 1.0 DMIPS/MHz.
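As a rough sketch of how these two figures of merit are computed: peak bus bandwidth is bus width times clock rate, and aggregate DMIPS is DMIPS/MHz times clock frequency times core count. The function names, the 64-bit/100 MHz bus and the 800 MHz clock below are illustrative assumptions, not figures from this article; the 2.5 DMIPS/MHz value is the Cortex-A9 rating quoted later in the text. Multiplying by the core count gives only an upper bound, since Dhrystone itself is a single-threaded benchmark.

```python
def peak_bus_mbps(width_bits: int, clock_mhz: float, transfers_per_clock: int = 1) -> float:
    """Peak bus bandwidth in Mbps: bits per transfer x million transfers per second."""
    return width_bits * clock_mhz * transfers_per_clock


def total_dmips(dmips_per_mhz: float, clock_mhz: float, cores: int = 1) -> float:
    """Upper-bound Dhrystone MIPS: efficiency (DMIPS/MHz) x clock (MHz) x core count."""
    return dmips_per_mhz * clock_mhz * cores


if __name__ == "__main__":
    # Hypothetical 64-bit bus clocked at 100 MHz, one transfer per clock.
    print("Peak bus bandwidth (Mbps):", peak_bus_mbps(64, 100))

    # Dual-core processor at 2.5 DMIPS/MHz per core; the 800 MHz clock is assumed.
    print("Aggregate DMIPS:", total_dmips(2.5, 800, cores=2))
```

These are peak numbers; sustained bus throughput and real application performance will be lower once arbitration, memory latency and cache behavior are taken into account.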
Bus infrastructure
Bus infrastructure ranges from logic efficient to maximum bandwidth.
Bi-directional Bus
Single channel of communication, least amount of logic required and moderate routing requirement.
Centrally Arbitrated
Full duplex communication, moderate amount of logic required and minimal routing requirement.
Slave-side Arbitration
N-channels of communication, most expensive logic requirement and extensive routing required.
Processor Use Models
State Machine
Lowest Cost, No Peripherals, No RTOS & No Bus Structures, Ex. VGA & LCD Controllers with Low/High Performance
Micro Controller
Medium Cost, Some Peripherals, Possible RTOS & Bus Structures, Ex. Control & Instrumentation with Moderate Performance
Custom Embedded
Highest Integration, Extensive Peripheral, RTOS & Bus Structures, Ex. Networking & Wireless with High Performance
Next-Generation Embedded System
What’s Needed is a New Class of Product!
Selections Equal Compromise
Conflicting Demands are Not Served.
Advanced SoCs to the Masses
With a new architecture, Xilinx brings together the ASIC/ASSP world with programmable logic in some unique ways.
What do we mean by this? First, we start with the ARM CPU we have chosen: the Cortex-A9 MPCore. The ARM Cortex-A9 MPCore is currently ARM’s highest performance application processor. It is a 32-bit, multi-issue, superscalar processor rated at 2.5 DMIPS per megahertz per core. We can call it a processor complex, as it includes two of these cores in a dual-core implementation. It also includes dual NEON SIMD engines with multimedia and DSP instructions, and single- and double-precision floating-point units.
There are additional elements such as L1 and L2 caches, coherency support, etc. The key is that this is a fast, capable dual-core processor using the latest technology from the industry leader. Next, we have the high-bandwidth AMBA AXI interfaces and interconnect, based upon the latest AMBA standard, which ARM released in March of 2010.
What is important to know is that the new architecture uses AXI not only within the hardwired SoC portion; it is also how the customer extends the SoC with the programmable logic. We have multiple AXI interfaces between the hardwired SoC and the programmable logic to address control, data, I/O and memory. This is a key aspect of what makes this architecture unique.
Third, within the hardwired SoC there is a set of common peripherals that are used in a wide variety of embedded applications. These common peripherals include tri-mode Ethernet, USB 2.0, CAN, SDIO, I2C, SPI, UARTs, GPIOs, and other functions.
Fourth, there are hardwired memory controllers for both flash and dynamic (DDR-type) memories. These portions make up what we call the Processor System. As has been mentioned, the Processor System is hardwired. In other words, it is a fixed implementation built like an ASIC. The Processor System in the new architecture is lower-power and lower-cost as well as higher-performance than if it were implemented using a combination of hardened blocks and FPGA logic.
By implementing a Processor System which is fully usable in a standalone manner, the architecture behaves in the way customers expect and are familiar with. Software developers can start writing application code and testing on real hardware immediately, without needing a logic designer. The Processor System is on its own power plane and configures the programmable logic portion.
The programmable logic is how the customer creates their own custom SoC or ASSP. Using the programmable logic the customer adds their own mixture of accelerators, peripherals, data-path processors, I/O interfacing needs, whatever their specific application requires.
As it gets more expensive to develop devices at 28 nanometers, there are fewer and fewer ASSPs and ASICs on advanced technology nodes. However, customer needs for more performance, lower power, lower cost and the ability to differentiate are only increasing. The new architecture addresses this by enabling customers to develop their own custom device using advanced process technology without the cost and risks.
ASIC gate count can vary depending on the functions utilized: a Xilinx LUT6 can range from 2 ASIC gates (when used as an inverter) to over 100 ASIC gate equivalents when used as distributed memory (LUTRAM). Most designs show an average mapping of 10-20 ASIC gate equivalents per Logic Cell. Note: DSP performance is based on 600 MHz DSP48 block operation with the pre-adder for symmetric operation, giving 2 MACs per clock cycle.
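The rules of thumb above can be turned into a back-of-the-envelope estimate. The sketch below assumes a hypothetical device with 100,000 logic cells and 200 DSP slices (not the figures of any specific part) and uses hypothetical function names. It applies the 10-20 gates per logic cell range and the 600 MHz, 2 MACs per clock DSP48 figures quoted in the text.

```python
def asic_gate_range(logic_cells: int, low: int = 10, high: int = 20) -> tuple:
    """Estimated ASIC gate equivalents as a (low, high) range, using gates per logic cell."""
    return logic_cells * low, logic_cells * high


def peak_gmacs(dsp_slices: int, clock_mhz: float = 600.0, macs_per_clock: int = 2) -> float:
    """Peak DSP throughput in GMACs: slices x MHz x MACs per clock, scaled from mega to giga."""
    return dsp_slices * clock_mhz * macs_per_clock / 1000.0


if __name__ == "__main__":
    # Hypothetical device size, chosen only to illustrate the arithmetic.
    low, high = asic_gate_range(100_000)
    print(f"ASIC gate equivalents: {low:,} to {high:,}")
    print("Peak DSP throughput (GMACs):", peak_gmacs(200))
```

The 2 MACs per clock figure only applies when the pre-adder can be used, for example in symmetric filters. For other workloads, setting `macs_per_clock=1` is the more conservative estimate.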
Zynq EPP SW Development Environment
Widely Used ARM Development Environment, so it is easy to migrate code already developed for ARM-based systems. ARM Ecosystem Support: tools from ARM, the Xilinx Software Development Kit (SDK) and tools from other third parties exist today.
Widely Available SW and Libraries, both open source and commercially available.
An Application Example
A Backhaul Linecard with Zynq?
A traditionally built linecard
A new revolutionary linecard
And here we thank Tryggve for his insights. The interested reader can hear Tryggve speak at FPGA World 2011. He can also be reached through InformAsic, where he works as CTO.