

# CHAPTER

## 1

## Zynq-7000 Adaptive SoC Fundamentals

<snip>

.... This chapter has so far provided a very general introduction to the Zynq-7000 Adaptive SoC. The next section takes a different approach: it chronicles the history of Xilinx FPGA technology to show how this programmable logic device with a powerful dual-core processor emerged. The narrative is quite long, starting with the release of the first Xilinx FPGA, the XC2064, in 1985, and ending with the release of Versal ACAP technology in 2019. Note, though, that it is not an essential section, and readers with no interest in the history can safely skip ahead to Section 1.2. (As an aside, Xilinx also has a long history of producing CPLDs and Configuration Memory devices, but the upcoming discussion will concentrate solely on FPGA technology.)

## 1.1 The Evolution of the Xilinx Zynq-7000 Adaptive SoC

To provide a useful starting point, Figure 1.1 shows the XC2064 [4] and the XC7Z020 side-by-side. The XC2064 was released in 1985, two and a half decades before the Zynq-7000, so there are of course considerable differences between the two. Even just in terms of device geometry, the XC2064 was manufactured in 2.5 $\mu$m technology using about 85,000 transistors, while the Zynq XC7Z020 is based on Xilinx 7-Series 28nm technology and uses about 1.3 million (ASIC-equivalent) gates. The XC2064 does, however, possess the fundamental characteristic of Xilinx FPGAs, Configurable Logic Blocks (CLBs), which are the primary logic design element. The first device has just 64 CLBs, though, compared to 6650 in the XC7Z020, or 34,675 in the highest specification Zynq (the XC7Z100).

Another major difference can be seen in the quantity of I/O pins: the XC2064 has 58 pins, while the XC7Z020 has up to 328 pins, and the XC7Z100 has up to 526 pins.

**Figure 1.1. Layout comparison of the Xilinx XC2064, which was the first Xilinx FPGA, and the Xilinx XC7Z020, a mid-range Zynq device.**

Like the Zynq-7000, the XC2064 also includes programmable interconnect, another fundamental feature of FPGAs, but the similarities really stop there. Since the XC2064 was introduced, Xilinx devices have evolved to include a vast array of powerful features, as will be seen in this section.

The history of FPGA technology has been expertly characterized by Stephen M. Trimberger in terms of three ages: the Age of Invention (1984 to 1991), the Age of Expansion (1992 to 1999), and the Age of Accumulation (2000 to 2007) [7]. The history of Xilinx embedded solutions can also be charted using three different eras, as follows. First of all, we have the Pre-Embedded era from 1984-1999, when fundamental FPGA characteristics evolved and matured (see Figure 1.2). Next comes the Platform FPGA era, from 2000 to 2009, when the IBM PowerPC was integrated in Virtex FPGAs, and softcore processors such as the MicroBlaze and PicoBlaze were introduced (Figure 1.5). From 2010 to the current time, then, we have the Zynq era (Figure 1.8). During this period, the Zynq-7000 was introduced (2011), followed by the UltraScale+ MPSoC in 2015, and the UltraScale+ RFSoC in 2017. In the last few years, Xilinx has been moving firmly into the world of highly-accelerated, reprogrammable computing with the release of Alveo data center cards and Versal ACAP devices.

**Figure 1.2. Time-line showing the major advancements in Xilinx technology in the pre-embedded era (1984-1999). (Only a small subset of Xilinx FPGAs are shown.)**

The discussion starts with the pre-embedded era (Figure 1.2), which necessarily goes right back to the foundation of Xilinx.

### 1.1.1 The Pre-Embedded Era (1984-1999)

Xilinx was established in 1984 by three former Zilog employees, Ross Freeman, Jim Barnett, and Bernie Vonderschmitt [8]. Ross Freeman had been a Director of Engineering at Zilog, leaving in the early 1980s with the view of developing a new type of programmable logic device. A fellow engineer, Jim Barnett, saw enough potential in the idea that he also left Zilog to work with Freeman. The third founder, Bernie Vonderschmitt, was a veteran of the American electronics industry, having spent three decades with the 20th century giant, RCA Corporation, including serving time as GM and VP of the Solid State Division. After briefly working at Zilog in the early 1980s, Vonderschmitt joined with Freeman and Barnett, and Xilinx was accordingly founded in February 1984. Vonderschmitt brought a wealth of industry experience to complement the technical abilities of Freeman and Barnett, and they led a small team to design the first Field Programmable Gate Array (FPGA), the Xilinx XC2064.

**Figure 1.3. Basic layout and main features of the Xilinx XC2064.**

The main designer of this first FPGA was Bill Carter (another former Zilog engineer), who was tasked with keeping the device as simple as possible. Even with tight engineering restrictions, the XC2064 still possessed the three fundamental elements of FPGAs, as follows (see also Figure 1.3 [9]):

- **Configurable Logic Block (CLB):** The CLB was the main programmable feature in the device, containing a combinational block (sometimes called a function generator or look-up table) and a single flip-flop. This allowed the user to implement logic functions with four inputs (or combine CLBs to implement different functions), along with a D-type flip-flop or level-sensitive latch in the output path. The XC2064 had 64 CLBs, and a companion device (the XC2018) had 100 CLBs.
- **Programmable I/O blocks:** These were used to route signals between the device pins and the internal logic, and included a flip-flop in the input path to store the incoming data. The output levels were 5V TTL/CMOS, and the drive capability was 4mA.
- **Programmable Interconnect:** The third fundamental feature was programmable interconnect, which was used to connect the CLBs and I/O blocks together to implement the digital design.
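To make the look-up table idea more concrete, the following Python sketch models a CLB as a 16-entry truth table (one bit per combination of four inputs) feeding a single D-type flip-flop. It is a simplified illustration only: the class `Clb`, the helper `make_truth_table`, and the bit ordering of the inputs are assumptions made for this example, and do not reflect the actual XC2064 configuration format.

```python
class Clb:
    """Toy model of a CLB: one 4-input look-up table plus one D flip-flop."""

    def __init__(self, truth_table: int):
        # 16 configuration bits, one per combination of the four inputs.
        self.truth_table = truth_table & 0xFFFF
        self.q = 0  # current flip-flop state

    def lut(self, a: int, b: int, c: int, d: int) -> int:
        # The four inputs form an index into the configuration bits
        # (input ordering is an assumption of this sketch).
        index = (d << 3) | (c << 2) | (b << 1) | a
        return (self.truth_table >> index) & 1

    def clock(self, a: int, b: int, c: int, d: int) -> int:
        # On a clock edge, the flip-flop captures the LUT output.
        self.q = self.lut(a, b, c, d)
        return self.q


def make_truth_table(func) -> int:
    """Build the 16 configuration bits for any 4-input Boolean function."""
    table = 0
    for index in range(16):
        a = (index >> 0) & 1
        b = (index >> 1) & 1
        c = (index >> 2) & 1
        d = (index >> 3) & 1
        if func(a, b, c, d):
            table |= 1 << index
    return table


# Example: configure the CLB as a 4-input parity (XOR) function.
parity_bits = make_truth_table(lambda a, b, c, d: a ^ b ^ c ^ d)
parity_clb = Clb(parity_bits)
```

The point of the sketch is that "programming" the logic amounts to loading different bits into `truth_table`; the surrounding structure never changes, which is what makes the block configurable.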

There were some problems when the first XC2064 silicon wafers returned to the lab (many devices suffered from fatal shorts), but one wafer was rescued, allowing the product to be evaluated. A second wafer run was more successful, and the XC2064 was officially released on November 1st, 1985.

**Figure 1.4. Xilinx XC6200, released in 1995. Arguably the first Xilinx FPGA specifically targeting embedded platforms.**

The XC3000 family [10] was the follow-up to the XC2064 in 1987 — this was effectively a more mature version of the first product, with up to 484 CLBs and 176 I/O pins. It was really the XC4000 family [11], introduced in 1991, that was the most significant range in the early days of Xilinx. The largest device, the XC4085XL, contained 3136 CLBs and 448 I/O pins, and it also introduced some other important features — for example, the CLBs could be programmed as SelectRAM memory (providing distributed RAM in the FPGA for the first time), and dedicated carry chains provided major enhancements in arithmetic functionality. The I/O blocks could be programmed with variable output slew rates and internal pull-up/pull-down resistors, and they were also PCI compliant.

XC4000 devices were also used as coprocessors in CPU-based designs, which showed the early potential of FPGAs in embedded systems. However, the XC4000 still wasn't specialised enough for this purpose, and in 1995 Xilinx introduced the XC6200 family, marketed as "*An SRAM-based FPGA architecture specifically designed for implementing reconfigurable coprocessors in embedded system applications*" [12] (see Figure 1.4). This included a dedicated CPU interface called FastMAP, high-capacity distributed memory (up to 256KB), and ultra-fast reconfiguration (which was very beneficial for coprocessor functionality). This was in the days before SoC techniques had really gained prominence, meaning the XC6200 was placed as a separate component on the PCB along with the CPU, memory and I/O devices; still, it can be argued that this was the first Xilinx embedded product.

In 1998, Xilinx introduced the Spartan and Virtex families [13]; the Spartan range was still based on XC4000 technology, but the new Virtex was a much more evolutionary product. Hardened Block RAM structures were included in this FPGA, dramatically improving on-chip memory support. Dedicated multipliers were available in the CLBs, meaning that it was much easier to implement DSP algorithms. The I/O blocks were now called SelectIO, and they supported memory standards such as GTL+ and SSTL3, along with regular standards like PCI, LVCMOS and LVTTL. Four delay-locked loop (DLL) blocks and dedicated global interconnects were available for much improved device clocking. As well as this, an I/O bank scheme provided flexibility in choosing the new I/O standards, and separate core (VCCINT) and I/O (VCCO) power supplies were introduced.

A year later, the Virtex-E was released [14]; this device was notable for its support of 622 MHz serial differential I/O technology. Serial I/O techniques had improved significantly in the 1990s, and were poised to start replacing the parallel I/O techniques that had been standard up to that point. (Parallel I/O was limited in speed, used up lots of pins, and was noisy and power-hungry; serial I/O effectively solved all of these problems.) The Virtex-EM supported LVDS, Bus LVDS, and LVPECL, and it could be used to implement 10 Gbps OC-192.

The features in the Virtex and Virtex-EM FPGAs were not specifically related to embedded solutions, but it was clear that Xilinx devices were becoming powerful enough to support such architectures. This would happen in the Platform FPGA era (Figure 1.5, discussed next).

### 1.1.2 Platform FPGA Era (2000-2009)

The first indication that Xilinx FPGAs were about to become more processor-friendly came in 2000, when a partnership between Xilinx and IBM was announced [15]. The main initiative was to embed hardened versions of the IBM PowerPC processor in upcoming Virtex FPGAs, along with support for the related IBM CoreConnect bus architecture. Xilinx released the Virtex-II in 2001 [16], although this first FPGA did not include the PowerPC. It did, however, support a new Xilinx softcore processor called MicroBlaze [17], which was compatible with the IBM CoreConnect bus protocol. The MicroBlaze was a 32-bit Harvard-style RISC processor, capable of running at 150 MHz/100 DMIPS. The Virtex-II also included some other interesting additions: embedded 18x18 multipliers for enhanced DSP support, digitally-controlled impedance (DCI) for I/Os (which reduced the need for external termination resistors), Digital Clock Managers (DCM) for improved global clocking, and QDR memory support.

**Figure 1.5. The Platform FPGA Era (2000-2009), when PowerPC processors were embedded in Xilinx FPGAs, and softcore processors were introduced.**

**Figure 1.6. Block diagram of the PowerPC 405 system implemented in the Virtex-II Pro FPGA, released in 2002.**

It was in 2002 that hardened embedded processors were seen for the first time in Xilinx FPGAs, with (up to) two PowerPC 405 processors available in the Virtex-II Pro FPGA [18] (Figure 1.6). This IBM core was a 32-bit 300 MHz Harvard RISC processor with a five-stage pipeline. The instruction and data caches were 16 KB in size, and a memory management unit (MMU) was included for virtual memory support. As well as the dual-embedded processors, the Virtex-II Pro also featured Multi-Gigabit Transceiver tiles called RocketIO; these circuits included essential serial I/O features such as CDR (Clock and Data Recovery) and 8B/10B encoding/decoding, making it much easier to implement serial standards like SATA and 10 Gb Ethernet.

In 2003, Xilinx introduced another softcore processor, the PicoBlaze [19]. This was a very efficient 8-bit microcontroller, originally designed by Xilinx engineer Ken Chapman as a Constant (K) Coded Programmable State Machine (KCPSM). While it was less powerful than the 32-bit MicroBlaze introduced two years earlier, it had the advantage of taking up very little space in the FPGA, offering an efficient alternative when the higher performance of the MicroBlaze (or PowerPC 405) was not required.

**Figure 1.7. Block diagram of the PowerPC 440 system implemented in the Virtex-5 FXT FPGA, released in 2007.**

In terms of product naming, Xilinx did not release a Virtex-3, but instead went straight to a Virtex-4. This was introduced in 2004 in three different flavours [20]: LX for high-performance logic; SX for signal processing; and FX for embedded processing/high-speed serial connectivity. The PowerPC 405 was again the processor of choice, and its operating frequency had increased from 300 MHz to 450 MHz. It also included a new Auxiliary Processor Unit (APU) which could connect to suitable coprocessors in the FPGA fabric (such as a floating point unit). In terms of other improvements, the high-speed serial I/O rate increased to 6.5 Gbps, and hardened Ethernet MACs were also included. In addition, the DSP slice in Virtex-4 was enhanced by the inclusion of a 48-bit accumulator, and it was now named DSP48, a forerunner to the DSP slices used in the 7-Series.

The next major advancement in the Platform FPGA era came in late 2006 with the arrival of the Virtex-5 family [21]. Like the Virtex-4, the new FPGA was available in different flavours, with the FXT range [22] targeted at embedded solutions (Figure 1.7) (although this version was not released until 2008). The IBM PowerPC 440 was now the processor of choice; this core could run at 550 MHz and featured wider buses (32-bit instruction, 36-bit address, and 128-bit data), along with caches that had doubled in size. The core architecture was an out-of-order seven-stage pipeline, and it could handle two instructions per cycle. (These terms will become clearer in Chapter 2.) The interface to the FPGA fabric was also overhauled with the inclusion of a ‘Crossbar’ switch; this addressed an issue that hampered the earlier PowerPC 405 implementations. (To elaborate, timing closure on the Virtex-II Pro and Virtex-4 was dependent on the FPGA circuit, and would be inconsistent from design to design. The hardened Crossbar interface in the Virtex-5 FXT significantly improved this issue.)

The Virtex-5 family was also interesting in that the programmable logic features were very similar to what would eventually appear in the Zynq-7000. For example, the Block RAMs were 36 KB in size (i.e. 2x compared to Virtex-4, and equal to the Zynq BRAM size), and the DSP slice now featured a 25x18 multiplier as well as a 48-bit accumulator. Clocking was provided by Clock Management Tiles (CMT), which contained Digital Clock Management (DCM) and analog PLL logic. The high-speed serial I/O functionality was now served by two different methods: a low-power GTP option running at 3.125 Gbps, and a high-speed GTX option running at 6.125 Gbps. Also, a system monitoring block was introduced; this used a 10-bit 200 kSPS ADC to measure external analog signals, and collect FPGA voltage and temperature information.
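As a rough illustration of what a 25x18 multiplier feeding a 48-bit accumulator does, the sketch below models a single multiply-accumulate (MAC) step in Python. The function names `wrap_signed`, `mac` and `dot_product` are hypothetical, and the choice of signed two's-complement operands with wrap-around on overflow is an assumption of this example rather than a description of the actual slice behaviour.

```python
ACC_BITS = 48  # accumulator width quoted in the text
A_BITS = 25    # first multiplier operand width
B_BITS = 18    # second multiplier operand width


def wrap_signed(value: int, bits: int) -> int:
    """Wrap an integer into a signed two's-complement range of 'bits' bits."""
    value &= (1 << bits) - 1
    if value >= 1 << (bits - 1):
        value -= 1 << bits
    return value


def mac(acc: int, a: int, b: int) -> int:
    """One multiply-accumulate step: acc + a * b, kept to 48 bits."""
    if not -(1 << (A_BITS - 1)) <= a < (1 << (A_BITS - 1)):
        raise ValueError("a does not fit in 25 signed bits")
    if not -(1 << (B_BITS - 1)) <= b < (1 << (B_BITS - 1)):
        raise ValueError("b does not fit in 18 signed bits")
    return wrap_signed(acc + a * b, ACC_BITS)


def dot_product(xs, ys) -> int:
    """Accumulate a sum of products, as an FIR filter tap loop would."""
    acc = 0
    for x, y in zip(xs, ys):
        acc = mac(acc, x, y)
    return acc


result = dot_product([1, 2, 3], [4, 5, 6])  # 1*4 + 2*5 + 3*6
```

A 25x18 product needs at most 43 bits, so the 48-bit accumulator leaves a few guard bits, which is why many products can be summed before overflow becomes a concern.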

As it turned out, the Virtex-5 FXT with embedded PowerPC 440 processors was the pinnacle of the Platform FPGA era. In 2009, the Virtex-6 family was released [23], offering incremental changes over the Virtex-5 range: PCIe Gen 2 was supported; serial I/O rates increased to 11 Gbps; a Mixed-Mode Clock Manager (MMCM) was used for improved clocking; and the DSP slice (called DSP48E1) supported Single Instruction Multiple Data (SIMD) functionality.

Significantly, though, Virtex-6 FPGAs did not include a hardened processor of any kind, indicating that a change in direction was coming.

### 1.1.3 The Zynq Era (2010-)

This change in direction became apparent in 2010 when Xilinx announced a deal with Arm Ltd. to use the Cortex-A9 in a new embedded solution, known at the time as the Extensible Processing Platform (EPP) [24]. One of the most notable aspects of the proposed approach was that, unlike the Virtex embedded solutions, the processor was now considered the central component, with the programmable logic available for high-performance tasks where necessary.

The EPP solution was based on the advanced Xilinx 7-Series technology that was also announced in 2010 [25]. Up to that point, the (low-cost) Spartan and (high-performance) Virtex families had been built on different technologies, making it difficult for customers to scale a solution from Spartan to Virtex. The 7-Series eliminated this restriction by building three distinct products on the same 28nm technology. The three families were as follows:

- **Artix-7:** Lowest cost and power for simple solutions, effectively replacing Spartan-6. (As an aside, a true Spartan-7 device would eventually be released in 2017 [26], also marketed as a cost-effective device.)
- **Kintex-7:** A new price/performance-optimised FPGA for mid-level solutions.
- **Virtex-7:** The highest-performance/capacity devices, in keeping with the long-established Virtex brand.

The 7-Series FPGAs and the Zynq-7000 were both released in 2011 [27]. As mentioned earlier, the original Zynq-7000 devices all featured a dual-core Arm Cortex-A9 MPCore, with single-core devices added to the range in 2016 [28]. The Cortex-A9 was based on the ARMv7-A architecture, and came with a range of optional extensions; the Zynq-7000 implementation included NEON SIMD and floating-point support, TrustZone security extensions, and ARM CoreSight Debug and Trace functionality. The Level 1 instruction and data caches were 32 KB each, and the Level 2 cache was 512 KB. DDR2 and DDR3 dynamic memory was supported, along with NAND, NOR and QSPI static memory options. The range of I/O peripherals included USB2, Ethernet and SDIO, along with low-speed interfaces such as I2C, SPI, CAN, UART, and, of course, GPIO.



**Figure 1.8. The Zynq Era (2010-), from the arrival of the Zynq-7000 Adaptive SoC (based on Xilinx 7-Series technology) to the ACAP era.**

The Zynq-7000 range was split into two distinct brackets: a cost-optimised range based on the Artix-7, and a higher-performance range based on the Kintex-7. (In terms of current Xilinx embedded FPGA offerings, the Kintex-7-based Zynq is deemed a mid-range device.) The 7-Series continued the evolution of Xilinx FPGAs by including serial transceivers with speeds of up to 28.5 Gbps, PCIe Gen3 support, enhanced DSP48E1 blocks, a dual 12-bit 1 MSPS XADC, and two different types of I/O bank (1.2V-1.8V high-performance banks, and 1.2V-3.3V high-range banks).

The Zynq-7000 will be covered in more detail in the next section (and, indeed, in the rest of this book), but, for completeness, the remainder of this section will be devoted to the advancements in Xilinx technology during the Zynq era, eventually leading to the arrival of ACAP technology at the end of the decade. To begin with, two new major technologies were announced in 2013: UltraScale (20nm), and UltraScale+ (14nm/16nm) [29]. Xilinx even released the first 20nm device that year, the Kintex UltraScale XCKU040. In terms of programmable logic evolution, the UltraScale range added serial I/O rates of up to 30.5 Gbps, 100G Ethernet, and 150G Interlaken (a very high-bandwidth packet protocol developed by Cisco Systems and Cortina Systems [30]).

The first UltraScale+ devices shipped in 2015 [31], featuring 58 Gbps high-speed serial I/O and PCIe Gen4 support. The Virtex devices in the UltraScale+ range also included a new type of programmable logic memory called UltraRAM (in addition to the long-established Distributed RAM and BlockRAM memory structures). Compared to BlockRAM, UltraRAM is a higher-capacity, space-efficient memory, providing up to 360 Mb of on-chip storage. (For comparison, the individual capacity of Block RAM is 36 Kb, while the capacity of UltraRAM is 288 Kb.) High-Bandwidth Memory (HBM) was also introduced in UltraScale+ devices, where DRAM was integrated in the same package as the FPGA, providing up to 16 GB capacity and 460 GB/s bandwidth. (By comparison, DDR4 technology has a bandwidth of 21.3 GB/s.)
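The figures quoted above can be related to each other with some simple arithmetic, sketched here in Python. The variable names are illustrative, and the calculation assumes binary prefixes (1 Mb = 1024 Kb), which the text does not state explicitly.

```python
# Per-block capacities quoted in the text, in Kb.
BLOCK_RAM_KB = 36
ULTRARAM_KB = 288

# Maximum on-chip UltraRAM storage quoted in the text, in Mb.
ULTRARAM_TOTAL_MB = 360

# Bandwidth figures quoted in the text, in GB/s.
HBM_BANDWIDTH = 460.0
DDR4_BANDWIDTH = 21.3

# One UltraRAM block holds as much as this many Block RAMs.
capacity_ratio = ULTRARAM_KB // BLOCK_RAM_KB

# Number of UltraRAM blocks implied by the maximum total,
# assuming 1 Mb = 1024 Kb.
ultraram_blocks = (ULTRARAM_TOTAL_MB * 1024) // ULTRARAM_KB

# How many times faster HBM is than the quoted DDR4 figure.
bandwidth_ratio = HBM_BANDWIDTH / DDR4_BANDWIDTH

print(f"UltraRAM block = {capacity_ratio} x Block RAM")
print(f"Blocks needed for {ULTRARAM_TOTAL_MB} Mb: {ultraram_blocks}")
print(f"HBM vs DDR4 bandwidth: {bandwidth_ratio:.1f}x")
```

Under these assumptions, each UltraRAM block is equivalent to eight Block RAMs, and the HBM bandwidth is a little over twenty times the DDR4 figure.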

Also in 2015, Xilinx released the first Zynq UltraScale+ MPSoC device, the XCZU9EG [32]. In addition to the advanced programmable logic features just discussed, this new device was equipped with a powerful heterogeneous processing system, offering application, real-time, and graphic processor options. Depending on the chosen device, the following configurations were possible: Dual- or Quad-Core ARM Cortex-A53 for application-level processing, Dual-Core ARM Cortex-R5 for real-time processing, and ARM Mali-400 GPU for graphical processing. In 2017, the UltraScale+ RFSoC was released [33], adding high-speed 12-/14-bit RF-ADCs and 12-bit RF-DACs to the high-end Zynq range.

As mentioned earlier, the line-up of the Zynq-7000 cost-optimised range was enhanced in 2016 with the addition of three single-core devices (the XC7Z007S, XC7Z012S, and XC7Z014S). These devices still contained the Arm Cortex-A9 MPCore, but only one core was enabled in the cluster. There was also an interesting development in the soft processor range in 2018, when Arm Cortex-M1 and Cortex-M3 microcontrollers were made freely available to Xilinx developers. This was through an initiative called Arm DesignStart FPGA [34], where release-quality M1 and M3 IP cores could be added to an FPGA design, just like the native MicroBlaze processor.

Since then, Xilinx has been steadily moving into the world of highly-accelerated reprogrammable computing, and some of the main advancements are summarised below. First of all, the Alveo range of high-performance accelerator cards for data centres was released in 2018 [35], and then, in the same year, the Adaptive Compute Acceleration Platform (ACAP) was announced [36]. The first ACAP product was the Versal AI Core, released in 2019 [37]; the Prime Series and Premium Series have since been added to the range. Built on 7nm TSMC technology, these devices feature several traits already evident in the Zynq high-end range, such as dual-core application and real-time processors (Arm Cortex-A72 and Cortex-R5F in this case), and super-high-speed serial transceiver technology. A large amount of on-chip memory is also available (up to 994 Mb), and the programmable logic section is now composed of Adaptive Engines, AI Engines, and DSP Engines. A Network-on-Chip (NoC) interface connects everything together on the device, providing an aggregate bandwidth of 1 Tb/s [38]. Finally, the software development environment was also radically overhauled, with the release of the Vitis unified software platform in 2019; the discussion of this new software framework continues in Section 1.2.1.

## 1.2 Zynq-7000 Development Tools

<snip>

### 1.2.1 A Broad History of Xilinx Development Tools

The release of the XC2064 in 1985 was accompanied by a development system called Xilinx Advanced CAD Technology (XACT) — this provided features such as schematic entry, place-and-route, PROM programming and timing/logic simulation. While the tool could be used for all steps of the design flow, it was also classed as an open system [50], as a wide variety of design entry and verification tools were available from third-party vendors (or syndicate partners, as they were better known at the time). Xilinx, though, always preferred to keep the place-and-route part of the flow under their control, as this was one area where they could really push device performance using superior in-house knowledge. Indeed, this model still stands today, where third-party tools are available for several stages (design-entry, synthesis, and simulation), but implementation is carried out in a Xilinx IDE (currently Vivado HLx).

By the mid-1990s, the PC had become a more user-friendly machine for electronic circuit design, and in 1996 Xilinx announced a new PC software package called Xilinx Foundation Series [51]. This featured a fully-integrated GUI for FPGA design, and, while the implementation section of the tool was still XACT-based, it also included much-improved synthesis and simulation options. The use of hardware description languages was also becoming very common for digital design (and indeed was almost mandatory for large circuits), and there was an increased focus on VHDL design in the Foundation Series. As was the case in the original XACT development flow, the user could still use third-party tools, but this time they had to license an alternate version of the software, the Xilinx Alliance Series.



**Figure 1.9. A broad summary of Xilinx development tools from 1985 to 2019. (Note that only a small subset of ISE releases is shown here.)**

By the time the third version of the Foundation software arrived in 2000, it had morphed into the ISE Foundation Series (ISE 3.1i) [52]. (ISE stands for Integrated Synthesis Environment.) ISE would go through eleven major version iterations before its final release in 2013 (as ISE 14.7), and the software is still in use today for older Xilinx devices.

The life-span of ISE coincided with the Platform FPGA era (Section 1.1.2), and support was added for embedded system design as the software evolved. For example, the release of ISE 5.1i in 2003 saw the addition of the Embedded Development Kit (EDK), which included Xilinx Platform Studio (XPS) [53] [54]. This solution provided a HW/SW co-design environment in which PowerPC and MicroBlaze systems could be developed. A GCC compiler and GDB debugger were central to the software development flow, and third-party RTOS and Linux support was also available. By the time ISE 12 was released at the end of the Platform FPGA era (2010), a dedicated Embedded Edition was available which included the following tools: Xilinx Platform Studio and Base System Builder (BSB); a Core Generator package for adding Xilinx/third-party/custom IP; an Eclipse-based CDT SDK (C/C++ Development Tooling Software Development Kit); and a programming tool called iMPACT. (See [55] for details on using ISE 12 to design PowerPC 440 systems on the Virtex-5 FXT.)

However, after being around for almost a decade and a half, ISE was not really equipped to deal with the 20nm (and lower) FPGAs in the Xilinx roadmap, and Xilinx unveiled the Vivado Design Suite in 2012 [56]. At its core, this new tool featured a vastly superior engine for all parts of the FPGA design flow, ensuring that it could handle the multi-million (ASIC-equivalent) gate designs that Xilinx was targeting. A wide range of design entry options was available (C/C++, SystemC, VHDL, Verilog, SystemVerilog, Matlab/Simulink), and High-Level Synthesis (HLS) was more tightly integrated in the tool, making that approach more attractive. It was also easier to carry out FPGA design at a system level with the inclusion of a new block-diagram feature called IP Integrator, along with a superior IP Catalogue based on the IP-XACT XML format. Both of these features were especially suited to processor-based system development.

Vivado also had a clear advantage over the earlier embedded editions of ISE. To elaborate, when using ISE, the designer typically had to use five different sub-tools: XPS, BSB, Core Generator, SDK and iMPACT. When Vivado was released, the developer just needed IP Integrator (for processor and logic design), and SDK (for software design). The HW/SW co-simulation and co-debug features were also markedly improved in Vivado.

Two other important milestones must also be mentioned in the Xilinx software time-line. The first of these occurred in 2015, when Xilinx announced the SDx development environment, consisting of three highly-abstracted tools [57]. The first tool was SDNet, which in fact had been released a year earlier; this provided a high-level environment where network engineers could create packet processing systems for software-defined network architectures [58]. The second tool was SDAccel, which allowed developers to use C, C++ or OpenCL to create highly-accelerated applications for Xilinx FPGAs in data center and cloud computing systems. The third tool, and the one of most interest to embedded engineers, was SDSoC, which allowed software engineers to create systems in C or C++ for Zynq-7000 and UltraScale+ MPSoC platforms. While this might sound like the HLS option already integrated in Vivado, SDSoC offered a major benefit over HLS: it was easier for software developers to implement the complete flow in the new tool without having to understand the hardware (FPGA) details, or get assistance from a hardware engineer who did.

The second major milestone was the release of the Vitis unified software platform in 2019; this effectively incorporated SDK, SDSoC, SDAccel, and Vivado HLS in a single package, providing three principal workflows. By far the most powerful flow is the Vitis Environment for Acceleration [59], which replaced SDAccel, and also absorbed the SDSoC tool. This environment provides a wide range of highly-accelerated libraries that operate in conjunction with the core Vitis engine, a Xilinx Runtime Engine (XRT), and a Xilinx target platform. Applications can be developed using languages such as C, C++ or Python, or even using higher level frameworks such as Caffe (deep learning) or TensorFlow (machine learning).

Vitis HLS [60] is next, providing an improved high-level synthesis flow. (It can also be configured to run in Vivado HLS mode [61], although use of the newer tool is recommended.)

Of most relevance to the reader of this text, though, is the Vitis Embedded Software Development flow [62], which provides a new embedded environment to develop bare-metal, RTOS, and Linux applications. Just like SDK, the Vitis IDE is based on Eclipse CDT (and includes similar features), but it is easier to target complex, heterogeneous Xilinx devices like the UltraScale+ MPSoC using the new tool. The fundamental hardware/BSP model in SDK is replaced with a platform/domain model, allowing related applications to run simultaneously as part of a system project. Running bare-metal applications for the Zynq-7000 is also a simple process once the new concepts are understood. More details on the Vitis software platform are given in Section 5.4.2 in Chapter 5.

## 1.3 References

All internet links accessed January 2021.

- [4] The Programmable Gate Array Data Book 1992, Xilinx.
- [5] Three Ages of FPGAs: A Retrospective on the First Thirty Years of FPGA Technology, Stephen M. Trimberger, March 2015.
- [6] Fabless: The Transformation of The Semiconductor Industry. 2013. Nenni, Daniel; McLellan, Paul. (A SemiWiki.com project)
- [7] The Programmable Data Book 1996, Xilinx.
- [8] XCELL The Quarterly Journal for Xilinx Programmable Logic Users, Issue 18, Third Quarter 1995.
- [9] XCELL The Quarterly Journal for Xilinx Programmable Logic Users, Issue 27, First Quarter 1998.
- [10] XCELL The Quarterly Journal for Xilinx Programmable Logic Users, Issue 34, Fourth Quarter 1999.
- [11] XCELL The Authoritative Journal for Xilinx Programmable Logic Users, Issue 37, Fall 2000.
- [12] XCELL The Authoritative Journal for Xilinx Programmable Logic Users, Issue 40, Summer 2001.
- [13] XCELL The Authoritative Journal for Xilinx Programmable Logic Users, Issue 42, Spring 2002.
- [14] XCELL The Authoritative Journal for Xilinx Programmable Logic Users, Issue 45, Spring 2003.
- [15] XCELL The Authoritative Journal for Xilinx Programmable Logic Users, Issue 51, Winter 2004.
- [16] XCELL The Authoritative Journal for Xilinx Programmable Logic Users, Issue 59, Fourth Quarter 2006.
- [17] XCELL The Authoritative Journal for Xilinx Programmable Logic Users, Issue 64, Second Quarter 2008.
- [18] XCELL Solutions for a Programmable World, Issue 68, Third Quarter 2009.
- [19] XCELL Solutions for a Programmable World, Issue 71, Second Quarter 2010.
- [20] XCELL Solutions for a Programmable World, Issue 72, Third Quarter 2010.
- [21] Xilinx Spartan-7 FPGAs Now in Production, Xilinx Press Release, May 09 2017.

- [22] XCELL Solutions for a Programmable World, Issue 75, Second Quarter 2011.
- [23] Xilinx Extends its Cost-Optimized Portfolio, Xilinx Press Release, Sep 27 2016.
- [24] XCELL Solutions for a Programmable World, Issue 85, Fourth Quarter 2013.
- [25] Interlaken Protocol Definition - A Joint Specification of Cortina Systems and Cisco Systems, Revision 1.2, October 7, 2008.
- [26] Xilinx Stays a Generation Ahead at 16nm, Xilinx Press Release, Feb 23 2015.
- [27] Xilinx Ships Industry's First 16nm All Programmable MPSoC Ahead of Schedule, Xilinx Press Release, Sep 30 2015.
- [28] Xilinx Delivers Zynq UltraScale+ RFSoC Family, Xilinx Press Release, Oct 03 2017.
- [29] Arm DesignStart FPGA, [www.arm.com](http://www.arm.com).
- [30] Xilinx Launches the World's Fastest Data Center and AI Accelerator Cards, Xilinx Press Release, Oct 02 2018.
- [31] Xilinx Unveils Revolutionary Adaptable Computing Product Category, Xilinx Press Release, Mar 9 2018.
- [32] Xilinx Hits Milestone with First Customer Shipments of Versal ACAP, Xilinx Press Release, June 18 2019.
- [33] Versal: The First Adaptive Compute Acceleration Platform (ACAP), Xilinx White Paper WP505 (v1.1.1), September 29 2020.
- [34] Cortex-A9 MPCore Technical Reference Manual (r3p0), ARM DDI0407G.
- [35] ARM Generic Interrupt Controller Architecture Specification (v1.0), ARM IHI0048A.
- [36] ARM Architecture Reference Manual ARMv7-A and ARMv7-R edition, ARM DDI0406C.
- [37] ARM Cortex-A Series Programmer's Guide, ARM DEN0013D.
- [38] CoreLink Level 2 Cache Controller (L2C-310) Technical Reference Manual (r3p3), ARM DDI0246H.
- [50] Vitis High-Level Synthesis User Guide, Xilinx UG1399 (v2020.1), June 3, 2020.
- [51] Vitis HLS Migration Guide, Xilinx UG1391 (v2020.2) November 24, 2020.
- [52] Vitis Unified Software Platform Documentation - Embedded Software Development, Xilinx UG1400 (v2020.1) June 3, 2020.
- [53] UltraFast Embedded Design Methodology Guide, UG1046 (v2.3) April 20, 2018.

- [54] Zynq-7000 SoC (Z-7007S, Z-7012S, Z-7014S, Z-7010, Z-7015, and Z-7020): DC and AC Switching Characteristics, Xilinx DS187 (v1.21) December 1, 2020.
- [55] Zynq-7000 SoC (Z-7030, Z-7035, Z-7045, and Z-7100): DC and AC Switching Characteristics, Xilinx DS191 (v1.18.1) July 2, 2018.
- [56] Zynq-7000 SoC Packaging and Pinout Product Specification, UG865 (v1.8.1) June 22, 2018.
- [57] Zynq-7000 SoC PCB Design Guide, Xilinx UG933 (v1.13.1) March 14, 2019.
- [58] Zynq-7000 All Programmable SoC Software Developers Guide, Xilinx UG821 (v12.0), Sept. 30 2015.
- [59] Embedded System Tools Reference Manual, UG1043 (v2019.2) October 30, 2019.
- [60] Xilinx Standalone Library Documentation - OS and Libraries Document Collection, Xilinx UG643 (v2020.2) November 24, 2020.
- [61] Zynq-7000 SoC: Embedded Design Tutorial - A Hands-On Guide to Effective Embedded System Design, Xilinx UG1165 (v2020.1) June 10, 2020.
- [62] Vivado Design Suite Tutorial Embedded Processor Hardware Design, Xilinx UG940 (v2020.1), December 14, 2020.

