

# **OpenCPI**

# **Component Development Guide**

**DRAFT**

**Authors:**

Michael Pepe, Mercury Federal Systems, Inc. ([mpepe@mercfed.com](mailto:mpepe@mercfed.com))

Jim Kulp, Parera Information Services, Inc. ([jkulp@parera.com](mailto:jkulp@parera.com))

### *Revision History*

| <b>Revision</b> | <b>Description of Change</b>                                                             | <b>Date</b> |
|-----------------|------------------------------------------------------------------------------------------|-------------|
| 1.01            | Creation                                                                                 | 2010-06-21  |
| 1.02            | Add ocpiasca information and HDL application information                                 | 2010-08-05  |
| 1.03            | Add more detail to HDL building, and general editorial improvements                      | 2011-03-28  |
| 1.1             | Editing, platform and device aspects of ocpiigen, HDL details                            | 2011-08-01  |
| 1.2             | Update for latest HDL details                                                            | 2013-03-12  |
| 1.3             | Rename document, add VHDL coding details and ocpihdl utility                             | 2013-06-15  |
| 1.4             | Add new and complete assembly and container info, and add two new ocpihdl functions      | 2013-11-06  |
| 1.5             | HDL coding practices, more HDL platform information                                      | 2013-12-10  |
| 1.6             | Convert to ODT format, add more clarity for parameter properties and readonly properties | 2014-03-31  |
| 1.7             | Use standard template., minor updates, add generic OWD content.                          | 2015-02-27  |
| 1.8             | Incorporate the authoring model reference content into this document                     | 2015-10-31  |

## Table of Contents

|              |                                                                                      |           |
|--------------|--------------------------------------------------------------------------------------|-----------|
| <b>1</b>     | <b>References.....</b>                                                               | <b>5</b>  |
| <b>2</b>     | <b>Overview.....</b>                                                                 | <b>6</b>  |
| <b>3</b>     | <b>HDL Workers.....</b>                                                              | <b>8</b>  |
| <b>3.1</b>   | <b>Building the worker.....</b>                                                      | <b>9</b>  |
| <b>3.2</b>   | <b>Worker Makefile.....</b>                                                          | <b>10</b> |
| <b>3.3</b>   | <b>HDL Worker Description File: the OWD XML file.....</b>                            | <b>11</b> |
| <b>3.3.1</b> | <b>HDLWorker Top Level Attributes.....</b>                                           | <b>11</b> |
| <b>3.3.2</b> | <b>Attributes of Worker Interfaces.....</b>                                          | <b>13</b> |
| <b>3.3.3</b> | <b>Clocking Elements and Attributes.....</b>                                         | <b>14</b> |
| <b>3.4</b>   | <b>The Authored Worker: the VHDL “architecture” or Verilog “module body”.....</b>    | <b>15</b> |
| <b>3.5</b>   | <b>Signal naming conventions and data types.....</b>                                 | <b>17</b> |
| <b>3.6</b>   | <b>The Control Interface to the HDL worker.....</b>                                  | <b>18</b> |
| <b>3.6.1</b> | <b>Reset behavior and initializations in OpenCPI workers and infrastructure.....</b> | <b>18</b> |
| <b>3.6.2</b> | <b>Clock and reset input signals in the control interface.....</b>                   | <b>18</b> |
| <b>3.6.3</b> | <b>Lifecycle/control operation signals in the control interface.....</b>             | <b>19</b> |
| <b>3.6.4</b> | <b>Property accesses.....</b>                                                        | <b>23</b> |
| <b>3.6.5</b> | <b>Property Data Types.....</b>                                                      | <b>24</b> |
| <b>3.6.6</b> | <b>Raw access to properties.....</b>                                                 | <b>24</b> |
| <b>3.6.7</b> | <b>Summary of HDL Worker Control Interface.....</b>                                  | <b>26</b> |
| <b>3.7</b>   | <b>HDL Worker Data Interfaces for OCS Data Ports.....</b>                            | <b>27</b> |
| <b>3.7.1</b> | <b>Message Payloads vs. Physical Data Width on Data Interfaces.....</b>              | <b>27</b> |
| <b>3.7.2</b> | <b>Byte Enables on Data Interfaces.....</b>                                          | <b>28</b> |
| <b>3.7.3</b> | <b>Streaming Data Interfaces to/from the HDL Worker.....</b>                         | <b>29</b> |
| <b>3.7.4</b> | <b>Message Data Interfaces.....</b>                                                  | <b>36</b> |
| <b>3.7.5</b> | <b>HDL Worker Data Interface Summary.....</b>                                        | <b>36</b> |
| <b>3.8</b>   | <b>Time Service Interfaces.....</b>                                                  | <b>37</b> |
| <b>3.9</b>   | <b>MemoryService Interfaces.....</b>                                                 | <b>38</b> |
| <b>4</b>     | <b>Building HDL Assets.....</b>                                                      | <b>39</b> |
| <b>4.1</b>   | <b>HDL Build Targets.....</b>                                                        | <b>39</b> |
| <b>4.2</b>   | <b>The HDL Build Hierarchy.....</b>                                                  | <b>41</b> |
| <b>4.3</b>   | <b>HDL Directory Structure.....</b>                                                  | <b>45</b> |
| <b>4.4</b>   | <b>HDL Search Paths when Building.....</b>                                           | <b>47</b> |
| <b>4.4.1</b> | <b>Searching for HDL Primitives.....</b>                                             | <b>47</b> |
| <b>4.4.2</b> | <b>Searching for XML files (OCS, OPS) when building workers.....</b>                 | <b>48</b> |
| <b>4.4.3</b> | <b>Searching for Workers in Component Libraries.....</b>                             | <b>48</b> |
| <b>5</b>     | <b>HDL Primitives [From here on this document is not updated].....</b>               | <b>50</b> |
| <b>5.1</b>   | <b>HDL Primitive Libraries.....</b>                                                  | <b>51</b> |
| <b>5.2</b>   | <b>HDL Primitive Cores.....</b>                                                      | <b>52</b> |
| <b>6</b>     | <b>HDL Assemblies for Creating Bitstreams/Executables.....</b>                       | <b>53</b> |
| <b>6.1</b>   | <b>The HdlAssembly XML file that describes the assembly.....</b>                     | <b>54</b> |
| <b>6.2</b>   | <b>The Makefile for building an assembly.....</b>                                    | <b>57</b> |
| <b>6.3</b>   | <b>Hdl Containers in an XML file.....</b>                                            | <b>59</b> |
| <b>6.3.1</b> | <b>Service connections in a container.....</b>                                       | <b>59</b> |

|          |                                                                                      |           |
|----------|--------------------------------------------------------------------------------------|-----------|
| 6.3.2    | Preparing the OpenCPI runtime metadata for the bitstream.....                        | 60        |
| 6.3.3    | HDL device naming.....                                                               | 61        |
| <b>7</b> | <b>HDL Simulation Platforms.....</b>                                                 | <b>63</b> |
| <b>8</b> | <b>The ocpihdl command-line utility for HDL development.....</b>                     | <b>64</b> |
| 8.1      | admin command – Print the device’s administrative information.....                   | 66        |
| 8.2      | bram command – Create a configuration BRAM file from an XML file.....                | 67        |
| 8.3      | deltatime command – Perform time synchronization test on device.....                 | 68        |
| 8.4      | dump command – unimplemented.....                                                    | 69        |
| 8.5      | emulate command – emulate a network-based device (admin space only).....             | 70        |
| 8.6      | ethers command – display available (up and connected) network interfaces.....        | 71        |
| 8.7      | probe command – test existence and availability of device.....                       | 72        |
| 8.8      | load command – load bitstream.....                                                   | 73        |
| 8.9      | getxml command – retrieve XML metadata from device.....                              | 74        |
| 8.10     | radmin command – read a specific address in device’s admin space.....                | 75        |
| 8.11     | reset command – perform soft reset on device.....                                    | 76        |
| 8.12     | rmeta command – read specific addresses in the metadata space of the device.....     | 77        |
| 8.13     | search command – search for all available HDL devices.....                           | 78        |
| 8.14     | settime command – set device’s time from system time.....                            | 79        |
| 8.15     | simulate command – run a simulation server for a specific simulator “platform” ..... | 80        |
| 8.16     | unbram command – create an XML file from a config BRAM file.....                     | 82        |
| 8.17     | wadmin command – write specific addresses in the device’s admin space.....           | 83        |
| 8.18     | <b>Worker Commands: commands that operating on individual workers.....</b>           | <b>84</b> |
| 8.18.1   | wclear command – clear a worker’s error registers.....                               | 84        |
| 8.18.2   | wdump command – dump a worker’s status registers (not properties).....               | 84        |
| 8.18.3   | wop command – perform a control operation on a worker.....                           | 84        |
| 8.18.4   | wread command – read a worker’s configuration property space.....                    | 84        |
| 8.18.5   | wreset command – assert reset for a worker.....                                      | 85        |
| 8.18.6   | wunreset command – deassert reset for a worker.....                                  | 85        |
| 8.18.7   | wwctl command – write a worker’s control register.....                               | 85        |
| 8.18.8   | wwpage command – write a worker’s window/page register.....                          | 85        |
| 8.18.9   | wwrite command – write a worker’s configuration property space.....                  | 85        |
| <b>9</b> | <b>HDL Platform and Device development.....</b>                                      | <b>86</b> |
| 9.1      | Outputs from ocpigen.....                                                            | 87        |
| 9.2      | How ocpigen is used.....                                                             | 88        |

## 1 References

This document depends on several others. Primarily, it depends on the ***OpenCPI Component Development Guide (CDG)***, which describes concepts and definitions common to all OpenCPI authoring models.

*Table 1: References to Related Documents*

| Title                               | Published By            | Link                                                                                                                                                                                             |
|-------------------------------------|-------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| OpenCPI Technical Summary           | <a href="#">OpenCPI</a> | Public URL:<br><a href="http://www.opencpi.org/doc/ts.pdf">http://www.opencpi.org/doc/ts.pdf</a>                                                                                                 |
| OpenCPI Component Development Guide | <a href="#">OpenCPI</a> | Public URL:<br><a href="http://www.opencpi.org/doc/pdf/OpenCPI_Component_Development.pdf">http://www.opencpi.org/doc/pdf/OpenCPI_Component_Development.pdf</a>                                   |
| OpenCPI RCC Development Guide       | <a href="#">OpenCPI</a> | Public URL:<br><a href="https://github.com/opencpi/opencpi/raw/master/doc/pdf/OpenCPI_RCC_Development.pdf">https://github.com/opencpi/opencpi/raw/master/doc/pdf/OpenCPI_RCC_Development.pdf</a> |

## 2 Overview

This document describes how to develop component implementations, known as **HDL workers**, for FPGAs, using the **Hardware Description Language (HDL) Authoring Model**. Workers are typically created in a component library, so that they are available for OpenCPI application developers and users.

Some knowledge of FPGA terminology is assumed here, but this document is also useful for non-FPGA developers in understanding the OpenCPI FPGA development process.

This document builds on the information provided in the **OpenCPI Component Development Guide (CDG)**, which introduces concepts and processes used for OpenCPI component development in general.

In addition to describing how to develop **HDL workers**, this document also describes:

- how to create **HDL primitive libraries**, which are libraries of smaller/simpler reusable code modules sometimes used to build HDL workers
- how to create **HDL primitive cores**, which are prebuilt and possibly pre-synthesized modules sometimes incorporated into HDL workers
- how to assemble a group of connected HDL workers to form an **HDL assembly**, which is realized in a complete FPGA bitstream.

An HDL assembly provides a **subassembly** of workers that is available to implement a subset of an overall component-based application.

For HDL development, the OpenCPI CDK also relies on technology-specific FPGA synthesis and simulation tools (e.g. Xilinx XST and Isim, Altera Quartus, Modelsim etc.).

Several key concepts are described in the following sections, followed by the development process for creating HDL workers, primitives, assemblies and bitstreams to support the execution of parts of component-based applications on FPGAs. All these terms are prefixed with HDL here to avoid confusion when they are used elsewhere.

**HDL Authoring Model:** the OpenCPI authoring model targeting Hardware Description Languages that are appropriate for FPGA development, currently using VHDL, with some support for Verilog. System-Verilog is another suitable HDL, although note currently supported.

**HDL Worker:** an HDL implementation of a component specification, with the source code written according to HDL authoring model. It is common and recommended to have an RCC worker that also implements the same spec.

**HDL Primitive:** an HDL assets that is lower level than workers, that is used as a building block for workers. HdI primitives can either be **libraries** or **cores**.

**HDL Primitive Library:** a collection of low level modules compiled from source code that can be references in HDL worker code. An HDL worker declares which HDL primitive libraries it draws modules from.

**HDL Primitive Core:** a single low level module that may be built and/or synthesized from source code, or imported as presynthesized and possibly encrypted, or generated by tools like Xilinx CoreGen or Altera MegaWizard. An HDL worker declares which primitive cores it requires (and instantiates).

**HDL Assembly:** a composition of connected HDL workers that are built into a complete FPGA configuration **bitstream**, acting as an OpenCPI artifact. An HDL assembly is used to implement a subset of the components in an OpenCPI application — i.e. as a *subassembly* of one or more application assemblies.

**HDL Platform:** an OpenCPI platform based on an FPGA that is enabled to host OpenCPI HDL workers. Simulators are also considered HDL platforms.

**HDL Platform Configuration:** a prebuilt (presynthesized) assembly of device-level HDL workers that represent a particular configuration of device support modules for a given HDL platform.

**HDL Container:** a complete design for an entire FPGA, which includes an HDL assembly and an HDL platform configuration combined in a specified fashion.

**HDL Target:** a particular type of FPGA device, usually what is considered a *part family*, that is the target of compilation or synthesis, where the result can be used for any architecturally similar device. Examples are “virtex6”, or “stratix4”, or “zynq”. Simulators are also HDL targets.

HDL development in OpenCPI includes both *application workers* in a component library, which perform functions independent of any specific hardware attached to the FPGA, as well as *device workers* that are designed to support specific external hardware such as ADCs, flash memories, I/O devices, etc. HDL device workers are developed as part of enabling an HDL platform (an FPGA on a particular board) for OpenCPI. The development of device workers as well as other aspects of enabling new HDL platforms for OpenCPI is described in a separate **OpenCPI Platform Development** document.

The sections below for HDL/FPGA development are:

- Developing application workers in a component library
- The HDL Build Process: building for different target devices and platforms
- The HDL Build Hierarchy: how whole device “bitstreams” are created.
- Developing assemblies of workers on FPGAs.

### 3 HDL Workers

This section describes how to write an HDL worker, and defines what distinguishes HDL worker development from developing workers using other authoring models. It builds on the worker development section in the Component Development Guide (CDG) that describes what is common to building workers for *all* authoring models.

HDL workers can be single standalone source code modules. They can also reference and instantiate lower level models: HDL primitive libraries and HDL primitive cores. In either case a worker is compiled and (when appropriate) synthesized for a given **HDL target**. HDL targets are described in detail in the HDL Build Process section below, but are generally FPGA chip families.

The process of writing a new HDL worker (after the OCS exists), starts with using the `ocpidev create worker` command, as described in the CDG, to create the worker's directory and its initial content, usually as the subdirectory of a component library. This command can be executed in a project's directory, a component library's directory or even a completely separate directory not part of any library or project. The name of an HDL worker always has the `.hdl` suffix, and the language must be specified as `VHDL` or `Verilog`. Languages are case insensitive. The default is `vhdl`. Here are some examples of using ocpidev to create HDL workers:

```
ocpidev create worker xyz.hdl -L vhdl
ocpidev -v -l dsplib create worker
ocpidev -S fft2d-spec create worker fft.hdl
```

As described in the CDG, this will create an initial worker description XML file (OWD), `<worker>.xml`, in the worker directory, which is subsequently edited for worker-specific attributes. Similarly, a Makefile is also created in the worker's directory, which may also be edited to enter worker-specific non-default build options. Creating an HDL worker also generates various files in the worker's gen subdirectory, including the code skeleton file which is initially copied to the `<worker>.language-suffix` in the worker's directory. The first ocpidev command above would result in the following directory tree for the worker:

```
xyz.hdl/
    Makefile          # Makefile for this worker
    xyz.xml          # OWD for this worker
    xyz.vhd          # editable source code for this worker
    gen/xyz-skel.vhd # initial skeleton for this worker
    xyz-defs.vhd    # definitions enabling instantiation
    xyz-impl.vhd    # the generated shell of the worker
```

None of the files in the `gen` subdirectory should be edited, and all are deleted when the `make clean` command is issued here or at the project or library level.

### 3.1 Building the worker

The above command establishes the worker's directory (`xyz.hdl`), its `Makefile`, its Worker description file (`xyz.xml`), and its initial source code file (`xyz.vhd`). After creating the worker, before writing any code, it is recommended to compile the skeleton worker that is automatically generated. This can be done by issuing the

```
make
```

command in the worker's directory, or the

```
make xyz.hdl
```

in the library directory containing the worker's directory., assuming the worker was created in a library. These commands will compile (or synthesize) the worker for the currently active set of **HDL targets**, such as `virtex6`, `stratix4`, or `modelsim`. HDL targets are described in detail in the [Building HDL Assets](#) section.

There should be no errors unless the specs file (OCS) somehow used an unsupported feature of the tools. As with any type of worker, compilation output is placed in the `target-TTT` subdirectory of the worker, for each target in the currently active set. An example explicitly specifying the targets is:

```
make HdlTargets="modelsim virtex6"
```

This would compile (and for non-simulation targets, synthesize) the worker for two targets.

For HDL workers using VHDL, the initial worker code file (`xyz.vhd`) that you *do* edit contains only the `architecture` of the worker. The `entity` declaration of the worker was automatically generated for you, and is found in the generated file `gen/xyz-impl.vhd`. So at this point, after building the initial generated worker code file, the directory representing the new worker looks like this:

```
xyz.hdl/
    Makefile          # Makefile for this worker
    xyz.xml          # OWD for this worker
    xyz.vhd          # editable source code for this worker
    gen/xyz-skel.vhd # initial skeleton for this worker
        xyz-defs.vhd # definitions enabling instantiation
        xyz-impl.vhd # the generated shell of the worker
    target-virtex6/
        xyz.ngc       # virtex6 synthesized core file
        xyz-xst.out   # log of tool output
    target-modelsim/
        xyz/*         # modelsim compilation result files
        xyz-modelsim.out # log of tool output
```

The `gen` directory and all the `target-*` subdirectories are generated, and then removed when the `make clean` command is issued. They should not be edited. More details about these files are described below.

### 3.2 Worker Makefile

The worker Makefile is usually not hand-edited, but there are cases where editing is necessary. Several variables are available for all authoring models, as described in the CDG. If your HDL worker will consist of multiple source files, you can add VHDL or Verilog source files (beyond the main file whose name is the worker name) by specifying the `SourceFiles` variable in the Makefile. Even if the primary language of the worker is VHDL, other Verilog files can be used, and vice versa. In addition to the `make` variables mentioned in the CDG for all authoring models, the HDL worker `Makefile` may also include these additional ones:

*Table 2: HDL Worker Makefile Variables*

| Variable Name<br>in HDL Worker Makefile | As default in<br>library Makefile? |                                                                                       |
|-----------------------------------------|------------------------------------|---------------------------------------------------------------------------------------|
| <code>Cores</code>                      | N                                  | A list of HDL primitive cores built elsewhere                                         |
| <code>VerilogIncludeDirs</code>         | Y                                  | Searchable directories for Verilog include files, in addition to the worker directory |

For HDL workers, the `Libraries` variable described in the CDG refers to HDL primitive libraries, described in the [\*HDL Primitive Libraries\*](#) section.

When using the `Cores` and `Libraries` variables, if a name in the list has no slashes, it is found by searching the path for HDL primitives as described in the [\*HDL Search Paths\*](#) section. If the name *does* contain slashes it is the specific pathname of the core or library, usually a relative pathname within the same project.

The variables with Y in the table above are those than can be specified in the component library's `Makefile` to provide a default for all workers in the library.

### **3.3 HDL Worker Description File: the OWD XML file.**

This file specifies characteristics for this implementation of the spec file (OCS). In many cases it can be empty or left as it is. A default version is generated when the worker is created. Aspects of the OWD that are common to all authoring models are described in the CDG. Aspects specific to the HDL authoring model are described here.

The primary reasons to customize this OWD XML file for an HDL worker are:

- **Add implementation-specific properties**

You can add additional worker properties for the implementation, beyond what is in the spec file. The `property` elements accomplish this, with the same xml format as in the spec file.

- **Add more accessibility to spec properties**

You can add more access capabilities for existing properties, via the `specproperty` element. E.g. make a property that is write-only in the spec to be also readable in the implementation for debug purposes. The only attributes are `name` and the access properties: `readable`, `writable`, `volatile`, `initial`.

- **Specify interface style and implementation attributes for data ports**

You can specify whether the port uses a stream or message interface, and provide additional details for those interfaces (e.g. data path width, or whether the port supports aborting messages).

- **Specify which control operations that this worker will implement**

You can specify which control operations are in the implementation: none are required.

#### **3.3.1 HDLWorker Top Level Attributes**

The top level of the HDL OWD is the XML `HdLWorker` element, which can have the XML attributes in the table below (beyond those defined for all OWDs in the CDG). All are optional, and are only specified when the default behavior must be overridden.

*Table 3: HdIWorker Attributes*

| HdIWorker Attribute Name         | Data Type       | Description                                                                                                                                                                                                                                                                                                                                                                                                                           |
|----------------------------------|-----------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| <b>Pattern</b>                   | string          | An external signal naming pattern (described below this table) for all signals of the worker. The default is "%s_”, which indicates a prefix of the port name followed by underscore. <b>External</b> signals are those defined using the OCPIP interface standard, not the inner <b>worker</b> signals.                                                                                                                              |
| <b>PortPattern</b>               | string          | A port naming pattern used when port names and signal (not data) direction are used in the generated code. I.e. for each worker port, a naming pattern is defined both for input signals and output signals of the port. The default is "%s_%n”, which indicates a prefix of the port name followed by underscore, and then <b>in</b> or <b>out</b> for signals that are <i>input</i> to the worker or <i>output</i> from the worker. |
| <b>DataWidth</b>                 | unsigned        | The default physical width of data ports for this worker. Any individual port can override this. The default value when this attribute is not specified is based on the protocol (messages) defined for the por in the OCSt.                                                                                                                                                                                                          |
| <b>SizeOfConfig-Space</b>        | unsigned 64-bit | Overrides the size of the configuration space in bytes. The default is based on the actual properties.                                                                                                                                                                                                                                                                                                                                |
| <b>Sub32BitConfig-Properties</b> | boolean         | Whether this worker needs to address items smaller than 32 bits (and thus requires byte enables in its interface). The default is based on the actual properties.                                                                                                                                                                                                                                                                     |
| <b>RawProperties</b>             | boolean         | A boolean value indicating whether the implementation will use the <b>raw</b> property interface for all properties. The default is <b>false</b> , unless the <b>FirstRawProperty</b> attribute is set. The <b>raw</b> interface is described below under <a href="#">Raw access to properties</a> and is typically only used for device workers.                                                                                     |
| <b>FirstRawProperty</b>          | string          | A string value indicating the name of the first property that requires the <b>raw</b> property interface. Properties before this use the normal property interface.                                                                                                                                                                                                                                                                   |
| <b>Outer</b>                     | boolean         | Whether the worker implements the outer interface, used in special circumstances.                                                                                                                                                                                                                                                                                                                                                     |

### 3.3.2 *Attributes of Worker Interfaces*

Other aspects of the OWD for HDL workers are in the sections describing each interface: the control interfaces, data interfaces (stream and message), and service interfaces (time and memory). They are summarized here:

Any data interface may have these attributes to override protocol-determined values and to define clocks associated with them:

- MyClock
- Clock
- DataValueSize
- DataValueGranularity
- NumberOfOpCodes
- MaxMessageValues
- ZeroLengthMessages

The **streaminterface** element has these attributes:

- Abortable
- PreciseBurst

The **timeinterface** element has these attributes:

- SecondsWidth
- FractionWidth
- AllowUnavailable

### 3.3.3 Clocking Elements and Attributes

The default clocking configuration of HDL workers is that all the interfaces use the clock that is provided in the control interface. The control interface clock is always present.

All other interfaces can have a **MyClock** boolean attribute set to true, which indicates that the interface has its own clock which has no relationship with the control clock. I.e., its clock is in a different clock domain. In this case a **clk** input signal is included in its interface.

Interfaces may also have a **clock** string attribute which identifies a specific clock to use for this interface, which may be shared with other interfaces. The string value of this attribute is either the name of a OCS port or the name of a separately defined clock (described just below). In the former case, the meaning is “I'll have what she's having”: the clock for this interface should be the same as the clock used at the other, identified interface.

A clock may be defined separate from any interface, and then referred to using the **clock** attribute of those interfaces. A separately defined clock may also be used independent of any interface. These separately defined clocks are needed when:

- A clock does not really belong to any interface, but may still be used by them.
- A clock has non-default attributes.

Separate clocks are defined with the **clock** XML element, as a child element of the top level **HdLWorker** element. The attributes of Clock elements are:

*Table 4: HDL Worker Clock Element Attributes*

| Attribute Name | Value Type | Description                                                                                                                                                         |
|----------------|------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| <b>Name</b>    | string     | The name used to identify this clock in the <b>clock</b> attribute of other interfaces. Also used in forming the input signal name for this clock into this worker. |
| <b>Signal</b>  | string     | The actual signal name that should be used for this clock. The default is the clock name, in the <b>name</b> attribute.                                             |

### 3.4 The Authored Worker: the VHDL “architecture” or Verilog “module body”

The “functional code” or “business logic” of the worker is in the architecture section (VHDL), in the `xyz.vhd` file (and possibly subsidiary source files). In Verilog, it is in the body of the module, and the file is `foo.v`. The entity/module being implemented at this level starts life as the generated skeleton of the **inner** worker, meaning it is surrounded by an automatically generated logic **shell** to provide robust and composable interfaces compliant with the Open Core Protocol interfaces defined for the **outside** of all workers. Thus an **outer** shell is generated around this **inner** worker. This shell and the entity declaration for the inner worker are found in the generated file: `gen/xyz-impl.vhd` file for VHDL and `gen/xyz-impl.vh` for Verilog. The skeleton file consisting of an empty inner worker, becomes the authored worker when the functional logic is written/edited into that file.



Figure 1: Worker Code and Files

All OpenCPI HDL workers are characterized by their properties, their ports and their clocks, and usually the clocks are simply associated with ports, or even more simply, a single clock is commonly used with all ports. The job of implementing the inner worker is the job of processing the various ports’ inputs to the worker to produce the various ports’ outputs of the worker. For each port of the worker (including the control port) there are input signals (into the worker) and output signals (out of the worker).

In VHDL, these groups of inputs and outputs are in a record type. Thus for each port (whether control, data producing, data consuming or other service), there is an ***input signal*** record and an ***output signal*** record, named <port>\_in and <port>\_out, respectively. In Verilog there are no record types, so individual signals simply have the <port>\_in\_ and <port>\_out\_ prefixes.

E.g., with a “filter” worker that has a “sensor” input port, and a “result” output port, the VHDL entity declaration would be:

```
entity filter_worker is
  port(
    ctl_in          : in  worker_ctl_in_t;
    ctl_out         : out worker_ctl_out_t;
    sensor_in       : in  worker_sensor_in_t;
    sensor_out      : out worker_sensor_out_t;
    result_in       : in  worker_result_in_t;
    result_out      : out worker_result_out_t);
end entity filter_worker;
```

The actual individual signals in each record depend on the contents of the spec XML file (OCS) and the implementation XML file (OWD). These signals will be described below. Note that the name of the “control port” defaults to “ctl”. An example skeleton file for this worker would be:

```
library ieee; use ieee.std_logic_1164.all, ieee.numeric_std.all;
library ocpi; use ocpi.types.all;
architecture rtl of filter_worker is
begin
  -- put the logic for this worker here
end entity filter_worker;
```

Note that while the overall worker has the name “filter”, the entity being implemented in the architecture here is “filter\_worker”, the *inner* worker.

### 3.5 Signal naming conventions and data types

Other than the property access signals described below, the signals in these interfaces are mostly a combination of `std_logic_vector` and a boolean type, `bool_t` that is used for various boolean indicator signals.

This VHDL type `bool_t`, acts as much like the VHDL type `BOOLEAN` as possible (with various operator overloading functions), while still being based on `std_logic`. The `to_boolean` and `to_bool` functions explicitly converts to and from the VHDL `BOOLEAN` type, respectively. The `its` function is a convenient synonym for the `to_boolean` function, enabling code like:

```
if its(ready) then
  ...
end if;
```

There are also two constants for this type, `btrue` and `bfalse`. All VHDL types defined are in the `types` package of the `ocpi` library.

All signals into and out of the authored worker are in the `in` and `out` records of each port.

All data types created by OpenCPI use the `_t` suffix. All enumeration values defined by OpenCPI use the `_e` suffix.

OpenCPI uses the term **port** to mean a high level data flow interface in and out of all types of workers. This conflicts with the use of the term in VHDL and Verilog, which means the individual signals (of any type) that are the inputs and outputs of an entity (VHDL) or module (Verilog).

In this section on HDL workers, this document uses the term **interface** to be the HDL worker's set of input and output port signals that corresponds to the high level OpenCPI ports as defined in the OCS and OWD for the HDL worker. We also use the term **interface** for the *implicit* control port of all workers. An HDL worker has a control interface (for the implicit control port), data interfaces (for the explicit data ports defined in the OCS), and service interfaces (for server ports as defined in the HDL worker's OWD).

### **3.6 The Control Interface to the HDL worker.**

Every HDL worker has a control interface that performs three functions:

- Provide a *control* clock and associated reset
- Convey life cycle *control operations* like `initialize`, `start` and `stop`.
- Access the worker's *configuration properties* as specified in the OCS and OWD.

In VHDL, when the default name of the control interface is used (`ctl`), the input signals are prefixed with `ctl_in`. and the output signals are prefixed with `ctl_out`. i.e. the input signals are in the `ctl_in` record port, and the output signals are in the `ctl_out` record port.

#### *3.6.1 Reset behavior and initializations in OpenCPI workers and infrastructure*

Per (at least) Xilinx recommended practice, OpenCPI uses and generates resets synchronously. Thus resets are asserted and deasserted synchronously. There are several reasons for this policy, but one is that less logic is typically needed to implement the resetting of register state.

At power up or reconfiguration, resets are asserted, so they will be asserted on the *first* clock edge. Per the OCPIP specification, resets will always be asserted for at least 16 clock cycles.

If registers (state) truly need an initial value (e.g. for simulation cleanliness or glitch-free initialization or sim-vs-synth consistency), it is preferred to set an initial *default expression* value in VHDL or Verilog, rather than using asynchronous reset. Note that current Xilinx (XST, and ISIM), Altera (Quartus), and Mentor (Modelsim) support such initialization without using any resources.

In general, applying resets to register state should serve a functional purpose, and not simply be considered the default practice for all state.

#### *3.6.2 Clock and reset input signals in the control interface*

The signal `clk` is the clock for all other control port signals as well as the *default* clock for all other data or service ports of the worker. The `reset` signal (asserted high), is asserted and deasserted synchronously with this clock. The `reset` is guaranteed to be asserted for 16 clock cycles. When 16 clocks are not enough to perform initialization, the worker should implement the `initialize` control operation (see below). The control reset, like all other resets generated by OpenCPI infrastructure, is initially asserted.

If the worker (in its OWD) declares that other data or service ports have clocks that are different than this control clock (i.e. those interfaces operate in *different clock domains*), the worker implementation code has responsibility for the appropriate synchronizations between this control clock (and its associated signals), and any other signals related to the data or service interfaces. In particular, it is the worker's responsibility to propagate this control reset to the reset outputs associated with each other interfaces, in their clock domain.

### 3.6.3 Lifecycle/control operation signals in the control interface

Other than the control `reset` signal, the lifecycle of all workers is managed by life cycle control operations, according to the diagram below. However, in simple and common cases, when the worker has no need to implement any of these operations, there is a single input signal indicating when the worker should *operate*, called `is_operating`. I.e., after `reset` is deasserted, the worker should *operate* only when this `is_operating` signal is asserted. Many HDL workers use this signal (as well as `clk` and `reset`) and no others in the control interface. The `is_operating` signal is a convenience indicating that the worker has been *started* and should now “do its thing”. A *worker must not perform any data transactions at its data ports unless `is_operating` is true*. An example of a typical and simple HDL worker would be:

```
architecture rtl of filter_worker is
    signal mystate_r : std_logic_vector(7 downto 0); -- some state
begin
    process (ctl_in.clk) is
    begin
        if rising_edge(ctl_in.clk) then
            if its(ctl_in.reset) then
                mystate <= "01010101";
            elsif its(ctl_in.is_operating) then
                -- do the clocked functions of this worker
            end if;
        end if;
    end process;
end rtl;
```

This allows the worker to be suspended and resumed since nothing happens during suspension, when `is_operating` is false.

If the `initialize` control operation is not implemented, then when the `reset` signal is deasserted, the worker is considered to be *initialized*. If `initialize` is implemented, the worker is considered in the `exists` state after `reset` is deasserted. The worker's OWD specifies whether `initialize` is implemented by this worker.



Figure 2: Control Operations and States

When a worker needs to explicitly support other control operations, there are 2 input and 2 output signals it may use. The `control_op` signal is a VHDL enumeration value that conveys which control operation is in progress. When there is no operation in progress, it has a value of `no_op_e`. Otherwise the choices are: `INITIALIZE_e`, `START_e`, `STOP_e`, `RELEASE_e`, `BEFORE_QUERY_e`, and `AFTER_CONFIG_e`. The operation is terminated by the worker asserting the `done` or `error` output signals, after which the control operation is considered accomplished successfully (if `done`) or not (if `error`). Note that the `done` signal is driven to a default value of `btrue` in the entity declaration (and the `error` signal defaults to `bfalse`). The worker does not need to drive these at all if it will always perform the control operations in a single cycle and will never need to assert `error`.

A common example of a control operation might be when the worker needs multiple clock cycles to accomplish something like `initialize` or `start`. In that case it notices when the `control_op` signal changes from `no_op_e`, and then performs the operation, asserting `done` (or `error`) when the operation has completed. An example where initialization takes 20 clock cycles, would be:

```

architecture rtl of example_worker is
    signal init_count_r : unsigned(4 downto 0);
begin
    process (ctl_in.clk) is
    begin
        if rising_edge(ctl_in.clk) then
            if its(ctl_in.reset) then
                init_count <= (others => '0');
            elsif ctl_in.control_op = INITIALIZE_e then
                init_count <= init_count + 1;
            elsif its(ctl_in.is_operating) then
                -- do normal functions
            end if;
        end if;
    end process;
    ctl_out.done <= -- initialize takes 20 clocks, all others take 1
        '0' when ctl_in.control_op = INITIALIZE_e and init_count < 20
        else '1';
end rtl;

```

Another convenience input signal, `state`, indicates which lifecycle state the worker is in. It changes when control operations succeed. It is a VHDL enumeration value: `EXISTS_e`, `INITIALIZED_e`, `OPERATING_e`, `SUSPENDED_e`, `FINISHED_e`, and `UNUSABLE_e`.

Finally, there are two control output signals that the worker can use to indicate two other conditions. The first, `attention`, allows the worker to indicate an interrupt or other condition to control software. There is no particular definition of what this means, and its usage generally requires software to monitor it, and perhaps access some volatile property values to determine what caused it. When or how it is deasserted is also implementation-dependent, although it should be deasserted on `reset`.

The second control output signal is `finished`. The worker uses this to indicate it has entered the `finished` state, and will perform no further work. This enables the worker to tell control software that its work is finished and perhaps that the application the worker is part of can be considered finished. This signal should be deasserted upon `reset`. Asserting `finished` will cause `is_operating` to become false, and `state` to become `FINISHED_e`.

Here is a summary of the control interface signals in the `ctl_in` record:

*Table 5: Control Input Signals*

| Signal                        | Type                      | Description                                                                                                                                                                                                                                                                  |
|-------------------------------|---------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| <code>clk</code>              | <code>std_logic</code>    | The clock for the control interface and the default clock for all other interfaces and ports.                                                                                                                                                                                |
| <code>reset</code>            | <code>bool_t</code>       | Asserted high and synchronously, for the control interface. Initially asserted.                                                                                                                                                                                              |
| <code>control_op</code>       | <code>control_op_t</code> | An enumeration type specifying the currently active control operation, with the value <code>no_op_e</code> when there is no active control operation. Control operations end when <code>done</code> or <code>error</code> signal in the <code>ctl_out</code> record is true. |
| <code>state</code>            | <code>state_t</code>      | An enumeration type indicating the worker's current control state.                                                                                                                                                                                                           |
| <code>is_operating</code>     | <code>bool_t</code>       | An indication that the worker has been started and is thus in an operating state, e.g. when <code>state = operating_e</code>                                                                                                                                                 |
| <code>abort_control_op</code> | <code>bool_t</code>       | A command indicating that a long-duration control operation is being forcibly aborted.                                                                                                                                                                                       |
| <code>is_big_endian</code>    | <code>bool_t</code>       | For dynamic endian workers                                                                                                                                                                                                                                                   |

Here is a summary of the control interface signals in the `ctl_out` record:

*Table 6: Control Output Signals*

| Signal                 | Type                | Description                                                                                                                                                              |
|------------------------|---------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| <code>done</code>      | <code>bool_t</code> | The signal indicating the successful end of a control operation. The default value is true indicating that all control operations complete in the same cycle they start. |
| <code>error</code>     | <code>bool_t</code> | The signal indicating the unsuccessful end of a control operation. Default value is false.                                                                               |
| <code>finished</code>  | <code>bool_t</code> | A persistent indication, not deasserted after being asserted, until reset, that the worker has entered the <code>finished</code> state. Default is false.                |
| <code>attention</code> | <code>bool_t</code> | A persistent indication of an attention condition, deasserted by reset or some worker-specific action, typically a write to a property.                                  |

### 3.6.4 Property accesses

A worker's configuration properties are accessed via two additional record port signals, called `props_in` and `props_out` (separate from the `ctl_in` and `ctl_out` records for the control interface). The individual signals within these records depend on what types of properties have been declared in the OCS and OWD. In the tables below, for a property called `foo`, the signals will be present as described. The signals possibly present in the `props_in` record are in the following table:

*Table 7: HDL Worker Property Input Signals*

| Signal in <code>props_in</code> | Included when:                             | Type                 | Signal Description                                                                                                      |
|---------------------------------|--------------------------------------------|----------------------|-------------------------------------------------------------------------------------------------------------------------|
| <code>foo</code>                | Writable or Initial                        | *                    | The registered value last written by control software. The type is dependent on the property type.                      |
| <code>foo_length</code>         | Writable or Initial: and sequence          | <code>ulong_t</code> | The registered 32 bit unsigned ( <code>ulong_t</code> ) number of elements in the sequence when property is a sequence. |
| <code>foo_written</code>        | Writable                                   | <code>bool_t</code>  | Indication that the entire value has been written.                                                                      |
| <code>foo_any_written</code>    | Writable and (array or sequence or string) | <code>bool_t</code>  | Indication that any part of the value is being written.                                                                 |
| <code>foo_read</code>           | Volatile or (readable and not writable)    | <code>bool_t</code>  | Indication that the property is being read.                                                                             |

The *indication* signals are valid during the access operation (until `done` or `error`). Any `writable` property is registered in the worker's `shell` when written, even when the property is `volatile` and the worker is supplying a volatile value for reading in the `props_out` record. The signals possibly present in the `props_out` record are:

*Table 8: HDL Worker Property Output Signals*

| Signals in <code>props_out</code> | Included when property is:              | Description                                                                                                    |
|-----------------------------------|-----------------------------------------|----------------------------------------------------------------------------------------------------------------|
| <code>foo</code>                  | Volatile or (readable and not writable) | The worker-supplied value of the property, with the type dependent on the property declaration.                |
| <code>foo_length</code>           | Volatile or (readable and not writable) | The worker-supplied ( <code>ulong_t</code> ) length (number of elements in the sequence) when a sequence type. |

### 3.6.5 Property Data Types

The VHDL data type of the property value signals (in `props_in` or `props_out`) is the type as specified in the OCS XML definition of the property with the suffix `_t`. All signed and unsigned integer types are based on the VHDL IEEE `numeric` package `signed` and `unsigned` types. I.e. `uchar_t` is the IEEE unsigned type of 8 bits. Also, a conversion function is defined to convert the VHDL `integer` and `natural` types into the numeric property types. I.e. `to_uchar` converts a VHDL `natural` to a `uchar_t`. These types are `char_t`, `uchar_t`, `short_t`, `ushort_t`, `long_t`, `ulong_t`, `longlong_t`, `ulonglong_t`. For the `char_t` type, which is an 8 bit IEEE `signed` type, there are also conversion functions (`to_char`) defined to convert from the VHDL `CHARACTER` type to `char_t`, and `to_character` to convert from `char_t` to the VHDL `CHARACTER` type.

The `float_t` and `double_t` types are just the appropriately sized `std_logic_vector`. There are conversion functions available to convert between these types and VHDL `REAL`. The `to_float` function converts from VHDL `REAL` to `float_t`, and the `from_float` function does the reverse, and similar functions exist for `double_t` although VHDL `REAL` cannot necessarily represent all `double_t` values.

The `string_t` type is a null-terminated array of `char_t` types. The `to_string` conversion function can convert from a VHDL `STRING` type to a `string_t`.

When a property type is an array or sequence, the value is a VHDL array of the appropriate type, named `<basetype>_array_t`, with a range of (0 to size-1).

For each property type there is a constant declared for the minimum and maximum values, named `<basetype>_min` and `<basetype>_max`, respectively.

All these types, conversion functions and constants are in the `ocpi.types` package (in the `ocpi` library). This package is available in all workers.

### 3.6.6 Raw access to properties

There is an alternative method for workers to support reading and writing of properties that is usually only used for device workers for devices with built-in hardware registers. Raw access is enabled by setting the `rawProperties` attribute or the `firstRawProperty` attribute in the OWD. If `rawProperties` is true, then all properties are `raw`. If `firstRawProperty` names a property, then properties before that one are accessed using the mechanism described above, but the named property and later properties are accessed using the `raw` interface.

The input signals (in the `props_in` record) for the raw property access interface (when required) are:

*Table 9: Raw Property Input Signals*

| Signal<br>in <code>props_in</code> | Signal included<br>when:                | Signal Description                                                                              |
|------------------------------------|-----------------------------------------|-------------------------------------------------------------------------------------------------|
| <code>raw.address</code>           | Always                                  | The byte offset from the first raw property, of the property being accessed.                    |
| <code>raw.byte_enable</code>       | Some raw property is less than 32 bits. | The (4) byte enables for reading/writing bytes within the 32 bit data of the control interface. |
| <code>raw.is_read</code>           | Some raw property is readable           | An access operation is reading a raw property.                                                  |
| <code>raw.is_write</code>          | Some raw property is writable           | An access operation is writing a raw property.                                                  |
| <code>raw.data</code>              | Some raw property is writable           | The data being written to the raw property, on the appropriate byte lanes for the offset.       |

The output signals (in the `props_out` record) for the raw property access interface are:

*Table 10: Raw Property Output Signals*

| Signal<br>in <code>props_out</code> | Signal included<br>when:      | Signal Description                                                                                                                                |
|-------------------------------------|-------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------|
| <code>raw.data</code>               | Some raw property is readable | The data value for the raw property being read, with values smaller than 32 bits (e.g. 8 or 16 bit values) aligned in the appropriate byte lanes. |
| <code>raw.done</code>               | Any raw properties            | Indicates when the access cycle has completed successfully.                                                                                       |
| <code>raw.error</code>              | Any raw properties            | Indicates when the access cycle has completed unsuccessfully.                                                                                     |

When raw properties are being accessed, the `props_out.raw.done` and `props_out.raw.error` control interface signals are used indicate when the access is complete (and for reading, when the `props_out.raw.data` signal is valid). These are analogous to the `done` and `error` signals in the `ctl_out` record port signal.

### 3.6.7 Summary of HDL Worker Control Interface

- Interface inputs are in the `ctl_in` signal port record.
- Interface outputs are in the `ctl_out` signal port record.
- Clock is `ctl_in.clk`.
- Reset is `ctl_in.reset`, asserted high, *synchronously*, for at least 16 cycles.
- Do no work unless `ctl_in.is_operating` is asserted, or a control operation is in progress.
- Optionally use `ctl_out.done` and `ctl_out.error` when control operations or property accesses take more than one cycle.
- Optionally set `ctl_out.finished`, if the worker has some semantic of being *finished*.
- Property inputs (written values, and access indicators) are in `props_in`.
- Property outputs (volatile values) are in `props_out`.

### 3.7 HDL Worker Data Interfaces for OCS Data Ports

Data interfaces convey messages with an associated **opcode**, which is an ordinal indicating the message type, among those defined in the protocol. When there is only one message type in the protocol, no opcode is used. Data interfaces also convey boundaries between messages and thus messages have a well defined length.

Data interfaces also implement flow control, such that an output cannot be produced unless permission is granted. HDL workers explicitly accept data at input interfaces and respect permission to provide data at output interfaces.

The fundamental nature of data flow in and out of HDL workers is messages with opcodes, under flow control.

- Message oriented, with explicit boundaries between messages
- Messages are tagged with an opcode, indicating the type of message
- Input data is offered to and accepted by the worker
- Output data is provided only when permission is granted.

A data interface can be zero width, when all messages in the protocol have no arguments and thus are all zero length. This means that message opcodes are all that is conveyed. If there is only one message in the protocol, and it has no arguments, there is no data, no opcode, but still an indication of a message being conveyed. This is essentially an “event pipe” or “pulse” interface.

#### 3.7.1 Message Payloads vs. Physical Data Width on Data Interfaces

A component port, as defined in the OCS, is normally associated with a protocol (OPS), which defines a set of message payloads that will be consumed or produced by all workers implementing that spec. Each message payload has a serialized format as a sequence of bytes that, when used in software, are laid out in byte-addressed memory. For an example, if the operation element in a protocol contains:

```
<argument name='a1' type='uchar' />
<argument name='a2' type='ushort' arraylength='2' />
<argument name='a3' type='ulonglong' />
```

And the values of this payload are:

```
a1: 1, a2: {0x2345,0x6789}, a3: 0xfedcba9876543210
```

Then the byte sequence (with proper alignment, and encoded little-endian), would be:

| Position       | 0  | 1 | 2      | 3      | 4  | 5  | 6 | 7 | 8  | 9  | 10 | 11 | 12 | 13                 | 14 | 15 |
|----------------|----|---|--------|--------|----|----|---|---|----|----|----|----|----|--------------------|----|----|
| Contents (hex) | 01 | x | 45     | 23     | 89 | 67 | x | x | 10 | 32 | 54 | 76 | 98 | ba                 | dc | fe |
| Argument       | a1 |   | a2[0]  | a2[1]  |    |    |   |   |    |    |    |    |    | a3                 |    |    |
| Value          | 1  |   | 0x2345 | 0x6789 |    |    |   |   |    |    |    |    |    | 0xfedcba9876543210 |    |    |

This layout and these values is the same for all types of workers in all (little endian) environments and over all data paths. The **x** values are padding for alignment.

HDL Worker data interfaces, associated with OCS ports, each have a *physical* width, as specified by the HDL-specific **DataWidth** attribute. This indicates the number of wires over which the message payloads will be conveyed. This width is constrained to be a multiple of the smallest data value in the protocol. In the example above this would be 8 bits. If the **DataWidth** was 8, then the sequence of content bytes shown above would be how this payload would appear on that byte-wide data interface. If the **DataWidth** was 16, the message would appear as:

| <b>Position</b> | <b>0</b> | <b>1</b> | <b>2</b> | <b>3</b> | <b>4</b> | <b>5</b> | <b>6</b> | <b>7</b> |
|-----------------|----------|----------|----------|----------|----------|----------|----------|----------|
| 15 downto 8     | x        | 23       | 67       | x        | 32       | 76       | ba       | fe       |
| 7 downto 0      | 01       | 45       | 89       | x        | 10       | 54       | 98       | dc       |

If the **DataWidth** was 32, the message would appear as:

| <b>Position</b> | <b>0</b> | <b>1</b> | <b>2</b> | <b>3</b> |
|-----------------|----------|----------|----------|----------|
| 31 downto 24    | 23       | x        | 76       | fe       |
| 23 downto 16    | 45       | x        | 54       | dc       |
| 15 downto 8     | x        | 67       | 32       | ba       |
| 7 downto 0      | 01       | 89       | 10       | 98       |

And if the **DataWidth** was 64, the message would appear as:

| <b>Position</b> | <b>0</b> | <b>1</b> |
|-----------------|----------|----------|
| 63 downto 56    | x        | fe       |
| 55 downto 48    | x        | dc       |
| 47 downto 40    | 67       | ba       |
| 39 downto 32    | 89       | 98       |
| 31 downto 24    | 23       | 76       |
| 23 downto 16    | 45       | 54       |
| 15 downto 8     | x        | 32       |
| 7 downto 0      | 01       | 10       |

The byte sequence remains the same regardless of **DataWidth**.

### 3.7.2 Byte Enables on Data Interfaces

Byte enables on data interfaces are only present when needed, and their presence is determined by a combination of the protocol and the **DataWidth** of the interface.

Two values are inferred from the protocol:

- **DataValueWidth**: the smallest data value in the protocol.
- **DataValueGranularity**: the least common multiple of data values among all messages in the protocol; all message lengths are a multiple of this number of data values.

The physical data width of the interface, **DataWidth**, must be a multiple of **DataValueWidth**. When **DataWidth** is greater than **DataValueWidth** \* **DataValueGranularity**, byte enables are in the interface. In this context, a byte is a data value, meaning bytes may not be 8 bits. Here are some examples:

- Message is a sequence of short (16 bit) values, **DataWidth** is 16:  
 DataValueWidth = 16  
 DataValueGranularity = 1  
 No byte enables required.
- Message is a sequence of short (16 bit) values, **DataWidth** is 32:  
 DataValueWidth = 16  
 DataValueGranularity = 1  
 Byte enables (2) are required since sequences might be an odd number of shorts.
- Message is a sequence of pairs of short (16 bit) values, **DataWidth** is 32:  
 DataValueWidth = 16  
 DataValueGranularity = 2  
 Byte enables not required since sequences are always a multiple of 2 shorts.

The exact naming and use of byte enable signals in data interfaces is described below.

### 3.7.3 Streaming Data Interfaces to/from the HDL Worker

Worker data ports (as specified in the OCS) can be implemented in two different styles. One is *stream* and the other is *message*. Stream interfaces are basically FIFOs with some in-band metadata. Message interfaces are based on addressable message buffers, and are described in the next section.

The default style of data interface is the stream interface, and, if all attributes of the implementation use defaults, no indication of this style choice is needed in the OWD. When data port attributes are required, this style is indicated by including a **<StreamInterface>** element in the OWD. Recall that the **DataWidth** attribute at the top level of the OWD specifies the default physical width of *all* data interfaces. If most interfaces have the same data width, it may be most convenient to specify this width as an attribute at the top level **HdlWorker** element.

An example of the per-data-port XML element is:

```
<StreamInterface name="sensor"
                 datawidth="64"
                 preciseBurst="true"/>
```

This would require that the OCS declare a data port named **sensor**.

Other than DataWidth, there are two options that can be specified for a stream data interface: *precise bursts* and *aborts*. None are used in most HDL workers, but they have important use cases.

**Precise bursts** are messages where the length of the message is available at the *start* of the message. Since the message is being streamed through a fixed width interface, the default mode is to terminate the message with a marker at the *end*. Producing precise bursts has value in that the receiver can take advantage of the early knowledge of the message length, but precise bursts may have a cost in latency or storage when data must be buffered to determine the length of the message.

Declaring precise bursts at an input interface means that the worker requires precise bursts; it must know the length of a message at the start of the message. Declaring precise bursts at an output interface means that the worker will produce precise bursts. If a worker requiring precise bursts on input is connected to a worker that does not produce precise bursts on output, an adapting worker must be inserted which converts imprecise bursts to precise bursts.

Precise bursts are an optimization that allow use of message lengths that are available early, such as when a message is arriving from a network with a header that specifies its length.

**Aborts** are used when a worker producing a message finds that it must interrupt the message it is producing in such a way that the part that was already sent may be invalid. Declaring aborts at an input interface means that the worker can tolerate aborted messages and act accordingly. Declaring aborts at an output interface means that the worker will indeed abort messages after starting but before ending the message.

Aborts are a latency optimization such that a consumer can start processing a message before knowing if it will complete without aborts. This is used in network processing to reduce average latencies. An example is when a CRC check at the end of a network message will determine whether the message is valid or not. Waiting until the CRC is checked before processing a message adds latency in the common case where the CRC is good.

If a worker that produces aborts is connected to a worker that can not tolerate aborts, a buffering adapter worker must be inserted which buffers the message and discards it if it is aborted.

The possible XML attributes of the **streaminterface** element are in the following table:

Table 11: XML Attributes of StreamInterface Elements

| Attribute in StreamInterface | Attribute data type | Attribute Description                                                                                                                                                                                                                                                                                             |
|------------------------------|---------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| <b>name</b>                  | string              | The name of the port in the OCS. Required.                                                                                                                                                                                                                                                                        |
| <b>datawidth</b>             | unsigned            | The width of the data path for this interface. The default is the smallest element in the message protocol indicated in the OCS, unless overridden by a default <b>datawidth</b> attribute at the top level of this OWD ( <b>Hd1Worker</b> )                                                                      |
| <b>impreciseBurst</b>        | boolean             | Whether this interface tolerates (for input) or produces (for output) imprecise bursts. Default is <i>true</i> unless <b>preciseBurst</b> is <i>true</i> , when the default is <i>false</i> . For input, it accepts imprecise bursts. For output, it may produce imprecise bursts. Most interfaces use imprecise. |
| <b>preciseBurst</b>          | boolean             | Whether this interface requires (on input) or may produce (on output) precise bursts. The default is <i>false</i> . If a worker can easily (without extra overhead) support precise bursts on output, it should.                                                                                                  |
| <b>abortable</b>             | boolean             | On input, that aborted messages are tolerated. On output, that the worker may abort messages.                                                                                                                                                                                                                     |
| <b>pattern</b>               | string              | Signal naming pattern. This applies to external signals, and not the inner worker signals. The default is "%s_ ", which is simply the port name as prefix with an underscore.                                                                                                                                     |
| <b>clock</b>                 | string              | The name of the clock to be used with this port. The name is either the name of another port with a "myclock" attribute of <i>true</i> , or the name of a clock declared for the worker as a whole, that is not the control clock.                                                                                |
| <b>myclock</b>               | boolean             | An indication that this port operates in its own clock domain, and will have its own <b>clk</b> signal as input.                                                                                                                                                                                                  |

Although a number of attributes of data interfaces are inferred/derived from the message protocol specified in the OCS/OPS, the OCS may not have a message protocol or certain attributes may need to be overridden. Below are those attributes that can be specified in a StreamInterface element. The indicated defaults apply when there is no message protocol in the OCS.

*Table 12: Stream Interface Attributes for Overriding Protocol Defaults*

| Attribute in StreamInterface      | Attribute data type | Attribute Description                                                                                                                      |
|-----------------------------------|---------------------|--------------------------------------------------------------------------------------------------------------------------------------------|
| <code>numberOfOpCodes</code>      | unsigned            | The number of distinct message types (opcodes) for the interface. The default is 1, the maximum is 256.                                    |
| <code>maxMessageValues</code>     | unsigned            | Largest number of data values in a message. Default is 64K.                                                                                |
| <code>dataValueWidth</code>       | unsigned            | The size of the atomic units of data on this interface (size of a byte). The default is the value of the <code>datawidth</code> attribute. |
| <code>dataValueGranularity</code> | unsigned            | The smallest multiple of <code>number of data values</code> that will actually appear in any message.                                      |
| <code>zeroLengthMessages</code>   | boolean             | Whether the interface will support zero length messages. Default is false.                                                                 |

Specifying the StreamInterface attributes determines the signals present in the input and output records for the interface.

Input and output stream interfaces are essentially FIFOs where the `ready` signal, that is *always input to the worker*, is indicating that data can move. On input interfaces it indicates that the FIFO is not empty. On output, it indicates that the FIFO is not full.

The contents of the FIFOs are **words**, which are a combination of data and metadata. The metadata associated with every word in the FIFO includes:

- SOM: start of message: indicates that this word is the first in a message, *whether or not there is data present also*.
- EOM: end of message: indicated that this word is the last in a message, *whether or not there is data present*.
- Valid: whether the data part of this word is valid
- Abort: (optional) whether this word is indicating the end of an aborted message
- Byte\_enable: (optional) indicates, *when valid is true*, which bytes in the data are valid.

The three metadata bits, `SOM`, `EOM`, and `valid`, can be in any combination, with the following meanings:

*Table 13: Metadata in Stream Interfaces*

| <b>SOM</b> | <b>Valid</b> | <b>EOM</b> | <b>Signal Description</b>                              |
|------------|--------------|------------|--------------------------------------------------------|
| 1          | 0            | 0          | The start of a message, without any associated data.   |
| 1          | 1            | 0          | The start of a message, coincident with data.          |
| 1          | 0            | 1          | A zero length message, with no data, in a single word. |
| 1          | 1            | 1          | A single word message.                                 |
| 0          | 0            | 0          | Reserved                                               |
| 0          | 0            | 1          | A trailing end of message, with no data.               |
| 0          | 1            | 0          | A data value in the middle of a message                |
| 0          | 1            | 1          | A data value, coincident with the end of the message   |

The signals for stream interfaces (input or output) are in the following table.

Table 14: Stream Interface Input and Output Signals

| Signal                   | When Included               | Direction      | Signal Description                                                                                                                                                                                                                                   |
|--------------------------|-----------------------------|----------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| <code>clk</code>         | When myClock or Clock       | Input          | The clock for this interface, only when the interface uses a specific clock. Otherwise the <code>ctl_in.clk</code> signal should be used.                                                                                                            |
| <code>ready</code>       | Always                      | Input          | Word is present. On input: some combination of <code>valid</code> , <code>som</code> , <code>eom</code> , or <code>abort</code> are asserted. Worker can assert <code>take</code> (input) or <code>give</code> (output)                              |
| <code>data</code>        | If datawidth > 0            | Data direction | The input or output data, when <code>valid</code> is true.                                                                                                                                                                                           |
| <code>som</code>         | Always                      | Data direction | The start-of-message indication if present. For output ports, if previous <code>give</code> indicated <code>eom</code> , then this <code>som</code> indication is assumed true.                                                                      |
| <code>eom</code>         | Always                      | Data direction | The end-of-message indication is present                                                                                                                                                                                                             |
| <code>valid</code>       | If datawidth > 0            | Data direction | The data signals hold message data.                                                                                                                                                                                                                  |
| <code>abort</code>       | If Abortable                | Data direction | The message is being aborted                                                                                                                                                                                                                         |
| <code>byte_enable</code> | datavalue-width < datawidth | Data direction | Which data bytes are valid, usable when valid is true.                                                                                                                                                                                               |
| <code>opcode</code>      | numberOf OpCodes > 1        | Data Direction | The opcode for the current message. Valid from start of message to end of message on input. Must be valid with <code>som</code> on output.                                                                                                           |
| <code>give</code>        | Port is output              | Output         | Indicates word is given to output port, allowed when <code>ready</code> is true. Like an enqueue signal to a FIFO. When asserted, at least one of <code>som</code> , <code>eom</code> , <code>valid</code> , or <code>abort</code> must be asserted. |
| <code>take</code>        | Port is input               | Output         | Indicates that word is being taken by worker, only permissible when <code>ready</code> is true. Worker is saying "I take". Like a "dequeue" signal to a FIFO.                                                                                        |

The `data` and `byte_enable` signals are `std_logic_vector`. The opcode signal is also `std_logic_vector` when there is no protocol in the OCS, otherwise it is an

enumeration of type `<protocol>_OpCode_t`, with each operation having an enumeration constant `<protocol>_<op>_op_e`. The other signals are all `bool_t`.

Here is a set of examples, showing the use of these signals. The first is a complete worker that is purely combinatorial, and simply adds a constant to every data value from input to output. No process or clocking or even reset since `ctl_in.is_operating` reflects the reset condition, and the computation takes place in a single clock cycle. No opcode or byte\_enable is used since the protocol has a single operation

```
architecture rtl of example_worker is
    signal doit : bool_t;
begin
    doit      <= ctl_in.is_operating and in_in.ready and out_in.ready;
    in_out.take <= doit;
    out_out.give <= doit;
    out_out.data <= std_logic_vector(unsigned(in_in.data) + 3);
    out_out.som <= in_in.som;
    out_out.eom <= in_in.eom;
    out_out.valid <= in_in.valid;
end rtl;
```

The next example shows a worker that inserts a special single-word all-ones message, with operation `xy`, every 8 messages it passes through. The OCS protocol is `myprot`.

```
architecture rtl of example_worker is
    signal count  : unsigned(3 downto 0);
    signal ready  : bool_t := ctl_in.is_operating and out_in.ready;
    signal insert : bool_t := count = 8;
begin
    process (ctl_in.clk) is
    begin
        if rising_edge(ctl_in.clk) then
            if its(ctl_in.reset) then
                count <= (others => '0');
            elsif its(ready) then
                if its(inserting) then
                    count <= (others => '0');
                elsif in_in.ready and in_in.eom then
                    count <= count + 1;
                end if;
            end if;
        end if;
    end process;
    in_out.take      <= ready and in_in.ready and not inserting;
    out_out.give    <= ready and (count = 9 or in_in.ready);
    out_out.data    <= (others => '1') when inserting else in_in.data;
    out_out.som     <= inserting or in_in.som;
    out_out.eom     <= inserting or in_in.eom;
    out_out.valid   <= inserting or in_in.valid;
    out_out.opcode <= myprot_xy_op_e when inserting else in_in.opcode;
end rtl;
```

### 3.7.4 Message Data Interfaces

This interface is used when an OWD specifies a MessageInterface element associated with a data port in the OCS. It provides an alternative mechanism to consume or produce data that uses an addressable message buffer interface. This enables the worker to produce or consume message data out of order, or to only access parts of a message. The signals provided with this interface allow the worker to address specific locations in the *current buffer*, and then signal that it is done with the current buffer.

[unsupported in 1.0]

### 3.7.5 HDL Worker Data Interface Summary

- Only move data when `ctl_in.is_operating` is true.
- Input data is valid only when explicitly indicated at the input interface.
- Output data can only be moved when flow control allows it at the output interface.
- Synchronize control signals if data interface clocks are different than control clock.
- Output message boundaries must be supplied and must respect maximum message sizes.
- Output message boundaries are usually derived from input message boundaries.
- Remember to (in most cases) convey zero length messages from input to output.

### 3.8 Time Service Interfaces

This interface provides “time of day” information to the worker, to the precision requested in the OWD via attributes to the `timeinterface` element. Time of day values are supplied to the worker in the clock domain of this interface, which defaults, like all interfaces, to the control clock.

The signals for the time service interface are in the `time_in` signal port record and are described below.

*Table 15: Time Service Signals*

| Signal<br><code>in time_in</code> | Width                                                                             | Signal Description                                                                                                                                                                                                                                                                                                                                      |
|-----------------------------------|-----------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| <code>seconds</code>              | <code>SecondsWidth</code><br>attribute of<br><code>TimeService</code><br>element  | The whole seconds part of the time-of-day, in GPS time (no leap seconds). If the width is 32 it is absolute time. If width less than 32, it is just a relative time truncated preserving the LSB, to that value, and wraps. The LSB is always 1 second VHDL type is IEEE numeric unsigned. Width may be zero, in which case this signal is not present. |
| <code>fraction</code>             | <code>FractionWidth</code><br>attribute of<br><code>TimeService</code><br>element | The binary fraction of a second, with the radix point to the left of the MSB. If width is 32 bits, the LSB represents $2^{-32}$ seconds, or ~233 ps. If width less than 32, the MSB are preserved, such that the MSB is always $\frac{1}{2}$ second. Width may be zero, in which case this signal is not present.                                       |
| <code>valid</code>                | 1 ( <code>bool_t</code> )                                                         | Indicates when the time of day is valid. Present only when the <code>AllowUnavailable</code> attribute is true.                                                                                                                                                                                                                                         |

### **3.9 *MemoryService Interfaces***

This interface provides access to memory. [Not supported in 1.0]

## 4 Building HDL Assets

Building workers is similar across differing authoring models and languages: typing the `make` command in a worker or library or project directory ends up building workers for a specified list of targets. If the worker is relying on lower level libraries, they are referred to in the worker or library `Makefile` using the `Libraries` variable, and must be built first.

For RCC workers, the list of targets is specified in the make variable is `RccTargets`. For HDL workers it is `Hd1Targets`. The default target for software workers is the development system itself (e.g. `linux-c6-x86_64`). For HDL workers, there is no inherent default, but a default setting of the `Hd1Targets` variable can be set in the worker, library or project `Makefile`, or in the project's `Project.mk` file. As with other authoring models, the variable can also be singular (`Hd1Target`) when there is only one target.

For software workers, this is usually the end of the build process: deployable artifacts for these workers are created and ready for export and/or use in applications.

For HDL workers, it is different. FPGAs are generally not subject to dynamic, partial loading: the whole FPGA must be reloaded with a full “configuration bitstream”. [OpenCPI does not use or depend on *partial reconfiguration* of FPGAs as of this document version]. As with any authoring model, primitives are built first. Then HDL workers are built and, for targets that are real FPGAs rather than simulators, synthesized. Finally there are two more steps to the build process in order to create the final, dynamically loadable configuration bitstream:

- Composing workers into an ***HDL assembly***.
- Finalizing the bitstream as an ***HDL container***.

This final step creates the deployable artifact usable for export and/or use in applications. These steps are defined in sections below.

### 4.1 HDL Build Targets

**Build targets** answer the question: what target device (or family of devices) am I building for? (compiling, processing, synthesizing, place-and-routing, etc.) When building any level of modules for FPGAs, the build targets are specified via the `Hd1Targets` and `Hd1Platforms` variables. The *targets* are chips or chip families, whereas the *platforms* are actual FPGAs on specific boards. HDL Primitives, workers, and assemblies, are built for `Hd1Targets`, and HDL containers (final bitstreams) are built for `Hd1Platforms`. These build targets are defined in a hierarchy with these levels:

**Top level, vendor level:** this level specifies vendors (Xilinx, Altera, Microsemi), as well as vendor-independent simulators (Icarus, Verilator, Modelsim). Thus HDL assets can be “built for Xilinx” or “built for Modelsim”. This implies building for all lower level targets under these top-level labels. The value `all` specifies all top level supported targets.

**Family level:** this level specifies the family of parts under the vendor level. Different part families typically have different on-chip architectures, and may drive tools differently. Building for a family target means generating libraries or cores that are suitable to any member (part) in the family. Examples would be “virtex5” or “spartan6” or “stratix4”. Simulation targets at the top level don’t have families (yet) so these top two levels are the same for simulation.

**Part level:** this level specifies the specific part that the design is targeted at, such as xcv5lx50t. This level does not include package information but may include speed grades.

Anywhere that HDL building takes place, these two Make variables can further filter the targets that are built:

**ExcludeTargets/ExcludePlatforms:** this variable specifies targets to be excluded, usually because they are known not to be buildable for one reason or the other (a tool error, or other incompatibility).

**OnlyTargets/OnlyPlatforms:** this variable specifies targets to be exclusively included, because it is known that only a limited set of targets should be built (e.g. a Xilinx coregen core specific to a particular family or part).

The **HdIPlatforms** variable specifies HDL platforms (like Xilinx ml605, zed), which imply the appropriate family and part. I.e., if you specify to build for a platform, it will build primitives and workers for the appropriate part family. Except for the final bitstream build, the **HdITarget(s)** are implied by the **HdIPlatform(s)**.

In some synthesis cases, tools that target a *part family* in fact target the *smallest* part in the family and try to limit use of on-chip resources to the amount that exist on the smallest part. While this correctly generates a result that can be used on any part in the family, it is not always desirable when the target platform in fact has a larger part. To force the use of a specific part for an HdITarget, in the HDL worker Makefile, insert a line setting the HdIExactPart variable, e.g.:

```
HdIExactPart=\n    $(and $(filter virtex6,$(HdITarget)))
```

If a worker is being built for only one target in any case, then this would work:

```
HdIExactPart=xc6vlx240t
```

## 4.2 The HDL Build Hierarchy

OpenCPI FPGA bitstreams (the files that configure entire FPGAs) are built in several layers. The same layers apply to building executables for simulation. The following diagram shows the build flow (bottom to top) and hierarchy.



Figure 3: OpenCPI HDL Build Flow Layers

At the bottom (built first, used by all other layers), are **primitives**. These are low level, “leaf” libraries and cores used by higher levels. **Primitive libraries** are libraries of modules built from HDL source code that are available to be used higher up the hierarchy; using a primitive library in a higher level module does not imply all the modules in the library are brought into the design, but only pulled in as needed by references in the higher levels of the design.

**Primitive cores** on the other hand are single modules built from source or generated from tools such as Xilinx Coregen, which are also used in higher levels of design. They are explicitly included in workers. Primitives may in fact depend on each other: a core may depend on primitive libraries, and primitive libraries may depend on other primitive libraries. Circular dependencies are not supported.

There are primitive libraries specific to vendors and families that can be used for implementing primitives using vendor-specific elements. More detail on creating such primitive libraries are in the **OpenCPI HDL Platform Development** document.

Above the **primitives** layer there are HDL workers of several types. All types of workers can use primitive libraries or cores as required. Application workers are generally portable and hardware independent. Device workers are workers that also connect to the I/O pins of external hardware, and in some cases can attach to vendor-specific on-chip structures (e.g. ICAP on Xilinx). Adapter workers are used when two connected workers are not connectible in some way due to different interface choices in the OWD (e.g. width, stream-vs-message, clock domains). Platform workers are special device workers that perform platform-wide functions.

The application assembly is automatically generated HDL source code that uses application workers and adapter workers. The application assembly itself is described in metadata (XML) as an assembly of connected application workers. It represents the subset of an overall heterogeneous OpenCPI application: a subassembly that will be executed on a single FPGA.

The platform configuration is automatically generated HDL source code that uses platform workers along with some device workers. It represents a platform configured with built-in support for some devices, and may include various constraints and physical design. For those familiar with Linux kernels, a platform configuration is analogous to a “built/configured kernel with some device drivers built-in”.

The container adapts the application assembly to a platform configuration and provisions any additional required device workers. It connects and adapts the “external I/O ports” of the HDL assembly to the available I/O paths and devices in the platform. When the deployment of the HDL assembly requires device workers that are not in the platform configuration, they are instanced in the container itself. Here the Linux kernel analogy is that these extra device workers are analogous to the “dynamically loaded device drivers used to run the application”.

Thus device workers can either be built into the platform configuration or instanced in the container.

The final design for the entire FPGA is the container logic. This hierarchy (except primitives) is shown in the following diagram.

FPGA Module/Board/*Platform*, w/hardware elements attached to FPGA: e.g. ML605

FPGA Chip/Substrate: e.g. Virtex6 lx240t on ML605 attached to devices



Figure 4: HDL Full FPGA Hierarchy

When tools support it, each layer in the build is actually synthesized or precompiled or elaborated as the tools allow (i.e. “prebuilt”). Thus:

- Each worker in a component library is prebuilt into a core (possibly using primitive libraries)
- Assemblies are prebuilt from generated VHDL or Verilog code using the required worker cores
- Platform configurations are prebuilt from platform workers and device worker cores
- Container top levels are built from platform configurations and HDL assemblies, with any additional device workers, service modules and adapters as required.
- Full bitstreams (or simulation executables) are built from the container modules.

This “layered prebuilding” allows the results to be reused at the next level without recompiling or resynthesizing, all in a vendor independent fashion. E.g. an HDL assembly prebuilt for a Xilinx virtex6 part can be reused to target different virtex6-based platforms. The exact definition of prebuilding varies with different tool chains, and the level of synthesis optimization that happens at each step also varies by tool, and some of this level of “hardening” at each level is controllable for some tools.

At one extreme, prebuilding simply means remembering which source files must be provided to the next level (for tools that have no precompilation of any kind). At the other extreme are tools that can incrementally synthesize to relocatable physically mapped blocks on a family of FPGA parts.

Simulators are considered HDL platforms that act as test benches for assemblies. This is described in more detail below in the simulation section.

### **4.3 HDL Directory Structure**

In OpenCPI projects, any HDL workers are in the **components** directory with other non-HDL workers. There can also be multiple separately named component libraries *under* the **components** directory. There is also a top-level **hdl** directory under which are the following directories:

**primitives:** This directory contains subdirectories for each primitive library or core.

**assemblies:** This directory contains subdirectories for each HDL assembly of application workers, and is where containers deploying these assemblies on platforms are built into bitstreams and simulation executables. **devices:** This directory is a component library containing subdirectories for each HDL device worker for devices that are potentially usable on different platforms. Software emulators for some of the devices are also in this component library.

**devices:** This directory contains a component library consisting of HDL devices workers and their proxies, which may be used in multiple platforms.

**platforms:** This directory contains subdirectories for each platform implemented in the project. Platforms are either a specific FPGA chip part on a circuit board with attached devices, or simulators. This is where platform-specific worker code exists, and where platform configurations are specified and built. There may also be a subdirectory for each platform containing a library of HDL device workers specific to that platform.

Development for HDL devices and platforms is described in the ***OpenCPI Platform Development Guide***.

Application workers (for all authoring models) are found in component libraries, which are either standalone outside of projects, in projects, or part of the ***OpenCPI CDK***.

This directory hierarchy (at the time of the writing of this document) is shown in the following diagram. All directories are optional and are created as needed by the **ocpidev** tool described in the CDG..



Figure 5: OpenCPI HDL Directory Structure

## 4.4 HDL Search Paths when Building

When building at any level of the HDL build hierarchy, the asset being built can depend on other assets, usually at lower levels of the hierarchy. HDL primitive libraries and cores can depend on other HDL primitive libraries. Workers can depend on primitives, specs, and protocols. HDL assemblies always depend on workers in libraries.

In each case of a dependency, the underlying asset is found using search rules. The built-in search rules automatically find assets that are in the component library, when building workers. They also automatically find assets that are in the current project, when building HDL primitives, workers, and assemblies in projects. Special action by the developer is only required when there are dependencies on assets outside the current library or project.

### 4.4.1 Searching for HDL Primitives

When a worker depends on primitive libraries, it specifies this by putting the library name in a list in the **Libraries** variable in its **Makefile**. Similarly, when an HDL worker depends on primitive cores, it specifies this by putting the core name in a list in the **Cores** variable. When the worker is being built, these lists are used in the following way.

If the library or core name has slashes in it, it is treated as a pathname (absolute or relative), to the specific directory of the primitive library or core. Examples are:

```
Libraries=../../myprims /home/colleague/hisprims/funprims  
Cores=../../ourcores/fft
```

If a name in the primitive library or core list does *not* have a slash, it is found by searching in the following places, in order:

- The HDL primitive libraries/cores in the current project (if the worker is in a project).
- The HDL primitive libraries/cores in the directories listed in the environment variable **OCPI\_HDL\_PRIMITIVE\_PATH**, which are colon separated and searched in order.
- The HDL primitive libraries/cores in the projects listed in the environment variable **OCPI\_PROJECT\_PATH**, which are colon separated and searched in order.
- The HDL primitive libraries/cores in the **OpenCPI CDK**, identified in the environment variable **OCPI\_CDK\_DIR**.

When it is convenient to put a list of primitive libraries in the library Makefile (making them available to all workers in the library), the name of the variable is specific to the authoring model. Thus to make a list of libraries available to all the HDL workers in a library, you would put the following line in the *library's Makefile*:

```
Hdllibraries=/home/george/primlibs/gprims
```

#### 4.4.2 Searching for XML files (OCS, OPS) when building workers

All workers have an OWD, and all OWDs depend on a component spec, normally found in a separate OCS XML file. Furthermore, OCS files frequently depend on a separate OPS (protocol spec) file. It is also possible that an OWD could include an XML file to incorporate a list of properties defined elsewhere.

When looking for these XML files, when their name has no slashes, it is found by looking in the following places, in order:

- The worker's directory
- The `gen` subdirectory of the worker's directory
- Directories specified in the space-separated list in the make variable `XmlIncludeDirs`
- The component library's export directory (`lib/hdl`) for referencing other HDL workers (e.g. for `slave` and `emulate` attributes)
- The component library's `specs` subdirectory.
- The export directories (`lib/hdl`) of all component libraries in the `ComponentLibraries` list (see below) for referenced HDL workers (e.g. for `slave` and `emulate` attributes)
- The directories listed in the environment variable `OCPI_XML_INCLUDE_PATH`, which are colon separated and searched in order.
- The `specs` subdirectories of all projects in the environment variable `OCPI_PROJECT_PATH`, which are colon-separated and search in order.
- The `specs` directory of the **OpenCPI** CDK, located using the environment variable `OCPI_CDK_DIR`.

#### 4.4.3 Searching for Workers in Component Libraries

The `Makefile` variable `ComponentLibraries` specifies a list of places to look when searching for workers. The most common use case for `ComponentLibraries` is for creating HDL assemblies, where the workers specified in the assembly must be found by searching for them in component libraries. Other uses include device workers, platform workers and platform configurations, as described in [HDL Platform Development](#).

While it might seem counterintuitive for a worker *inside* a component library to depend on other component libraries, there are three cases where this happens:

- A worker's OWD depends on specs (OCS and/or OPS) in another library.
- A worker is a proxy for a worker defined in another library (HDL workers cannot act as proxies).
- A worker is a device emulator for a device worker defined in another library.

In all these cases a worker, or a component library, might define the `ComponentLibraries` variable to specify this dependence.

To find workers, the search first looks in the library that the worker is already a part of. After this, the `ComponentLibraries` variable is used, which holds a list of component libraries to search.

If the component library name has slashes in it, it is treated as a pathname (absolute or relative), to the specific directory of the component library. If a name in the list does *not* have a slash, the component library is found by looking in the following places, in order:

- The component libraries in the directories listed in the environment variable `OCPI_COMPONENT_LIBRARY_PATH`, which are colon separated and searched in order.
- The other component libraries in the same project.
- The component libraries exported by the projects listed in the environment variable `OCPI_PROJECT_PATH`, which are colon separated and searched in order.
- The component library in the *OpenCPI CDK*, identified in the environment variable `OCPI_CDK_DIR`.

In all cases, to be found a worker must have been built. If the worker was found in another project, the libraries in that project must have been built.

## 5 HDL Primitives [From here on this document is not updated]

HDL primitives are organized into libraries of smaller units of modularity than workers (workers are built in OpenCPI component libraries). HDL primitives are essentially either a precompiled group of source files, or a prebuilt/presynthesized core, either generated by other tools (i.e. Xilinx CoreGen), or a core for which source code is unavailable (a 3<sup>rd</sup> party core in edif or ngc format or possibly qxp format for Altera). In both cases the exported/installed library or core that results from building a primitive is something that can be referenced by worker designs simply by including the following lines in the HDL worker makefile:

```
Libraries=myutils
```

If the library name is a simple name (no slashes), then it is assumed to be installed in the CDK installation directory itself. Then the OpenCPI source tree is present, this directory is actually the “ocpi” subdirectory of the source tree. If the name has slashes, it is used as a specific directory elsewhere where the primitive has been built and installed (e.g. your own library of primitives).

So, if the worker source code instantiates a module from the primitive library, no further action need be taken other than including the line above in the HDL worker’s Makefile (or once for a component library in the component library’s Makefile). In particular, no other “black box” module or VHDL component declaration need be created or referenced by the worker. The CDK itself includes several primitive libraries, some only have source code, and others have prebuilt cores. Several HDL primitive libraries in OpenCPI are always included even when the Libraries variable is not set.

## 5.1 HDL Primitive Libraries

To create an HDL primitive library from source code, create a directory containing the source files to be precompiled, and include a “Makefile” that includes this line:

```
include $(OCPI_CDK_DIR)/include/hdl/hdl-lib.mk
```

With the “Makefile” above, just running “make”, will create the primitive library for the targets indicated in the HdlTargets “make” variable.

When you need to export the results of building the library somewhere other than where the sources are (in the target-\* subdirectory), you set the “HdllibInstallDir” variable to indicate where this and other libraries or cores should be “installed”, and then running “make install”. The default value of “HdllibInstallDir” for the primitive library is “lib” in the library’s directory, which is not particularly useful. In the OpenCPI core directory tree, all primitive libraries are installed under hdl/primitives/lib.

A primitive library from source code is built for a variety of targets in per-target directories such that any concrete target that a *worker* is built for will use appropriate files resulting from the build of the primitive libraries for that same target.

Primitive libraries may be written in VHDL or Verilog, but a variety of issues arise when workers written in one language access primitives written in the other language. The constraints vary widely with tools and their ability to precompile libraries etc. To make a VHDL primitive library work across most tools and be callable/instantiable from Verilog or VHDL, the library must include a VHDL file \*\_pkg.vhd containing component declarations for entities in the library. Even if the library is Verilog source code, the \*\_pkg.vhd VHDL file with component declarations should be present if the Verilog modules in the library will be used from VHDL. The package name in the file should normally be the same as the library’s name, but doesn’t have to be. There can be multiple \*\_pkg.vhd files in each library if multiple packages are required. Finally, if the packages have package bodies in separate files, those files should be named “\*-body.vhd”.

When a VHDL worker is written to use a primitive in a library, it must include a line to access the library. For a primitive library “foo”, and package named “bar”, the worker would contain the lines:

```
library foo; use foo.bar.all;
```

## 5.2 HDL Primitive Cores

Making a prebuilt/presynthesized core available for use by workers is similar to creating a primitive library from source, except that the make file contains the lines:

```
include $(OCPI_CDK_DIR)/include/hdl/hdl-core.mk
```

The difference is that, in addition to source files that are just precompiled for use by workers, the core is also synthesized for specific targets and made available for building workers as a prebuilt/presynthesized core. The files used to build the core can be a mix of source and prebuilt files. Such a primitive may have no source code at all (except the black-box module definition), in this case the Makefile might look like this (in a directory called “mycore”):

```
PreBuiltCore=mycore.ngc # suppress building from source files
OnlyTargets=xvc6lx240t # this core is only good for this part
include $(OCPI_CDK_DIR)/include/hdl/hdl-core.mk
```

In this case the scripts simply expect the core file to already exist, with only the black-box file in source form: mycore\_bb.v.

## 6 HDL Assemblies for Creating Bitstreams/Executables

An HDL assembly is simply the subset of a heterogeneous OpenCPI application that will execute on one FPGA, and thus is the part of some OpenCPI applications that will ultimately be put into an FPGA bitstream. This section describes how to define and build (synthesize) these modules and to ultimately turn them into bitstreams.

This process is essentially *deploying the assembly onto the platform*. More specifically, it is deploying the assembly onto a *platform configuration* that has been defined, prebuilt, and possibly preplaced/constrained in advance. We call the result of this “deployment” a “container” – the outer module that contains the application as well as the platform support modules in the platform configuration. Note that a platform configuration may be a simulation platform or a real platform.

Given that the platform configurations already exist, the steps taken to go from the assembly (described in XML) to a bitstream are:

1. Describe the assembly in XML
2. Select the platform configuration that the assembly will be deployed on
3. Specify how the assembly’s external ports connect to the platform (i.e. to an interconnect like PCIE for off-platform connections, or to local devices).  
This is called “defining the container”.
4. Run “make”, to generate the bitstream.

The actual steps taken by the OpenCPI scripts and tools are:

1. Generate the Verilog/VHDL code that structurally implements the assembly
2. Build/synthesize the assembly module, that has some “external ports”.
3. Generate the Verilog/VHDL container code that structurally combines the assembly and the platform configuration, as well as any necessary adapters.
4. Build/synthesize the container code, incorporating the assembly and platform configuration. This is the top level module.
5. Run the final tool steps to build the bitstream (map, place, route, etc.).

## 6.1 The HdlAssembly XML file that describes the assembly

The assembly is described in an XML file containing an HdlAssembly top-level element, which contains **worker instances** and **connections** and **external ports**. It is similar to the Application XML file that describes the whole OpenCPI heterogeneous application (as documented in the OpenCPI Application Guide).

The worker instances (“instance” subelements of the HdlAssembly) simply reference HDL workers in a component library, and optionally assign names to each instance. The “worker” attribute is the worker’s OWD name (without directory or model), and the optional “name” attribute is the instance name. When not specified, instance names are either the same as the worker name (when there is only one instance in the assembly), or the worker name followed directly by a zero-based decimal ordinal (when there is more than one instance of the same worker).

Connections among workers in the assembly use “connection” XML elements. They define connections among worker data interfaces. A trivial example would be:

```
<HdlAssembly>
  <Instance Worker="generate"/>
  <Instance Worker="capture"/>
  <Connection>
    <port name='out' instance='generate' />
    <port name='in' instance='capture' />
  </Connection>
</HdlAssembly>
```

For convenience, internal connections between the output of one instance to the input of another, can simply be expressed using the “connect” attribute of the instance, indicating that the instance’s only output should be connected to the only input of other instance whose name is the value of the “connect” attribute. Like this:

```
<HdlAssembly>
  <Instance Worker="generate" connect='capture' />
  <Instance Worker="capture"/>
</HdlAssembly>
```

Furthermore, when this shortcut is used, you can in fact specify the “from” port using the “from” attribute, and the “to” port using the “to” attribute. If these instance had multiple other input and output ports, you can also specify it this way:

```
<HdlAssembly>
  <Instance Worker="generate" connect='capture' from='out' to='in' />
  <Instance Worker="capture"/>
</HdlAssembly>
```

To specify external ports, the “external” element is used, which allows the name of the external port to be different from the worker port it is connected to:

```
<HdlAssembly>
  <Instance Worker="generate" connect='process' from='out' to='in' />
  <Instance Worker="process"/>
  <external name='procout' instance='process' port='out' />
</HdlAssembly>
```

But there is also a shortcut when the external port name is the same as the worker's port name, by simply using the "external" attribute of the instance:

```
<Hd1Assembly>
  <Instance Worker="generate" connect='process' from='out' to='in' />
    <Instance Worker="process" external='out' />
  </Hd1Assembly>
```

However, this is only usable for one port of the worker. If the assembly is in fact a single worker where both "in" and "out" ports should be external, you would need:

```
<Hd1Assembly>
  <Instance Worker="process" external='out' />
  <external instance='process' port='in' />
</Hd1Assembly>
```

Below is a diagram of a simple assembly, and the corresponding Hd1Assembly XML file. The application has a "switch" worker that accepts data either from its "in0" or "in1" interface, and sends the data to its "out" interface. The "delay" worker sends data from "in" to "out" implementing a delay-line function that requires attaching memory to it. The "split" worker takes data from its "in" interface and replicates it to both its "out0" interface as well as its "out1" interface. The local application has 4 external ports (ADC, SWIN, SWOUT, DAC), as well as a requirement of attached memory (to satisfy the memory requirements of the "delay" worker).



*Figure 4: Example Local Application*

Given that these three workers are already in a component library, the XML description of the above is below. Note that one of the workers (delay) in fact requires memory and thus would have a WMemI interface declared in its OWD, but this is not mentioned in the Hd1Assembly below since no additional information needs to be provided: the container will provide memory to the worker.

```
<HdlAssembly>
  <Instance Worker="switch" connect="delay"/>
  <Instance Worker="delay" connect="split"/>
  <Instance Worker="split"/>
  <External name='adc' instance="switch" port="in0"/>
  <External name='swin' instance="switch" port="in1"/>
  <External name='dac' instance="split" port="out0"/>
  <External name='swout' instance="split" port="out1"/>
</HdlAssembly>
```

*Figure 5: HdlAssembly for the example local application*

## **6.2 The Makefile for building an assembly**

Assemblies are built in their own directory, where the name of the assembly defaults from the name of the directory, and the contents of the assembly is described in the HdlAssembly XML file described above. The name of the XML file is simply the name of the assembly, with the “.xml” file extension. In addition to this description file, the Makefile must indicate which component libraries should be used to find the workers mentioned in the HdlAssembly file. A Makefile for the above application might be:

```
ComponentLibraries=/home/fred/project/components  
include $(OCPI_DIR)/include/hdl/hdl-assembly.mk
```

In this case the assembly will be based on the indicated component library, and the automatically generated assembly Verilog or VHDL module will be synthesized for all specified targets and platforms. If the OnlyTargets=virtex5 was specified, the assembly would only be built for use on virtex5 parts. Alternatively, the “HdIPlatforms” variable could be set to a particular platform (e.g. ml605), in which case the assembly would only be built for the family associated with that platform (virtex6).

The component libraries indicated in the ComponentLibraries variable provide a pathname to the library. If there are no slashes in the name of a component library it is assumed to be in the OpenCPI CDK. The standard libraries in the CDK are always included (last): “components” and “devices”. All component libraries must have unique base names. E.g., since the builtin libraries are called “components” and “devices”, you cannot have private libraries with those same names, regardless of their location.

The assemblies in the OpenCPI tree are built in the hdl/assemblies subdirectory of the OpenCPI source tree, but it could be anywhere as long as the OCPI\_CDK\_DIR variable is set.

When you have a directory full of assemblies like this (a “library of assemblies”), the top level Makefile can simply be:

```
Assemblies=assy1 assy2  
include $(OCPI_DIR)/include/hdl/hdl-assemblies.mk
```

This would build the listed assemblies. The HdIPlatforms variable could be set here also.

- 1 Specifying the container that deploys the application on the platform.

The container is the top-level module and deploys the assembly onto the platform. Separating the “assembly” from the “container” in this way keeps the assembly portable and hardware-independent. It doesn’t know what its input and outputs are connected to, and it would easily be run in a simulation testbench with no hardware.

In an assembly’s directory, containers are specified in two ways. The first, is “default containers”. Default containers are generated by looking at the assembly, and connecting all the external ports to the platform’s interconnect (e.g. PCIE). Thus the “default” container specification is to connect every external port of the assembly such that it connects external to the platform, to connect to workers running on containers on other platforms. [Note that external connections like this are currently not supported in simulation platforms].

The “DefaultContainers” variable is used to list platform configurations for which default containers (and thus bitstreams) should be automatically generated for this assembly. The format of the items in this list is <platform> or <platform>/<configuration>. If this variable is defined as empty:

```
DefaultContainers=
```

Then there are no DefaultContainers for this assembly at all. If the DefaultContainers variable is not set at all (not mentioned in the assembly’s Makefile), then default containers (and bitstreams) will be generated for whatever platform the assembly is built for. In this case the platform configuration is assumed to be the “base” platform configuration for the platform (the one with no device workers at all).

So, if nothing is said at all about containers in the Makefile, default container bitstreams will be built for whatever platforms are mentioned in “HdIPlatforms”. In many cases this simple default case is all that is required (no container variables at all).

Default containers cannot be used to connect the external ports of the assembly to something *other than* the single system interconnect of the platform. I.e. if there are multiple interconnects (say PCIE and Ethernet), or if connections to local devices are required, then a container must be specified in an XML file. The “Containers” variable specifies a list of containers (container xml files) that should be built in addition to those indicated by the “DefaultContainers” variable (or absence thereof). The “Containers” variable does not suppress the building of the default containers. If you want that, you should specify an empty “DefaultContainers” variable.

An example Makefile that relies only on the CDK component library and builds default containers for whatever platform is specified is:

```
include $(OCPI_DIR)/include/hdI/hdI-assembly.mk
```

A Makefile that builds a default container on the ml605 base platform and the “lime\_adc” configuration of the “alst4” platform, and further generates a specific container called “in\_2\_adc” for connecting some external port to the ADC device on that latter platform configuration might look like:

```
DefaultContainers=ml605 alst4/lime_adc
Containers=in_2_adc
include $(OCPI_DIR)/include/hdI/hdI-assembly.mk
```

### **6.3 Hdl Containers in an XML file**

A container XML file is required to connect to multiple interconnects (or not the first one) or to local devices.

The top-level element of the container XML file is the <HdIContainer> element. Its primary attribute is the “platform” attribute to specify the platform configuration that should be targeted by the container. It is of the form <platform> or <platform>/<configuration>. Beyond specifying the platform configuration in the “platform” attribute, the subelements of the < HdlContainer> top-level element are special connection elements for containers.

They are “<connection>” elements with these attributes:

- external: specify an external port on the assembly
- device: specify a device on the platform
- interconnect: specify an interconnect on the platform
- port: when a device has more than one data port, specify it

Using these attributes you can specify these connections:

- external to interconnect
- external to the only port of a device
- external to a specific port of a device
- a device port to an interconnect (bypassing the assembly altogether) [not yet supported]

So, in a case where the “in” of the assembly connects to a locally attached “adc” device, but the “out” of the assembly is attached to the interconnect for communicating to other FPGA or software containers, you would have:

```
<HdIContainer platform='alst4/alst4_conf1'>
    <connection external='in' device='adc' />
    <connection external='out' interconnect='pcie' />
</HdIContainer>
```

The connection to the “adc” device might be resolved in the platform configuration (if the “adc” device was already included in the “alst4\_conf1” configuration of the “alst4” platform. If not, then the “adc” device logic would be instantiated in the container itself.

#### *6.3.1 Service connections in a container*

OpenCPI HDL workers have three types of ports: control, data, and service. Service ports are connected locally in the container to provide the required services to workers. Service ports are implementation-specific for a given worker and thus not found in “spec” files. I.e., the worker declares what services it needs for its particular implementation. The services currently defined are memory and time (time of day).

If workers in the assembly have service ports, they automatically become external ports of the assembly. When generating a container, all services required by the assembly, as

well as any services required by device workers instanced in the assembly, must be satisfied by instancing the appropriate modules in the generated container code.

Time service requirements are satisfied by 1) instancing a “time client” module for each worker that needs “time of day”, and 2) connecting that “time client” to the timekeeping infrastructure on the platform. Each time client is instanced based on requirements of the associated worker’s time port. These requirements are precision (number of bits in the “seconds” and “fraction of a second” fields) and whether the worker can tolerate time-of-day becoming available after it starts running, or whether it simply assumes it is always available.

Memory service requirements [which are currently not automated as of this writing], are satisfied by instancing either private BRAM modules (private to the worker), or instancing external memory access interfaces, connected to device workers for external memory. Memory access may also be multiplexed to support multiple workers sharing the same memory. Note that such memory services are unrelated to buffering used to connect workers together or connect them to devices and interconnects.



Figure 6: OpenCPI HDL Assembly on Container and Platform

### 6.3.2 Preparing the OpenCPI runtime metadata for the bitstream

When OpenCPI runs an application that includes HDL workers and thus includes an HDL assembly bitstream, it requires a small runtime metadata file that tells it what the bitstream contains (which workers and how they are connected, etc.). This file is automatically generated as part of the same process that builds the bitstream, and is attached to the bitstream file after it is created by the FPGA back-end place-and-route tools. This file is also compressed and placed in a BRAM on chip. Thus software can discover the contents of a bitstream either by looking at the file to be loaded, or, looking

at the file inside the loaded bitstream at runtime. This allows the control software to know what a bitstream can offer before it is loaded, and also know what the currently loaded bitstream can offer, without having access to the file that loaded it.

This runtime XML is generated whenever a bitstream is generated, and placed in a BRAM (compressed) and attached to the file (uncompressed).

The bitstreams are in fact compressed, thus the resulting “.xxx.gz” file has the metadata attached to the file (after the file is compressed, but with the XML not compressed). This compressed bitstream file, with uncompressed metadata attached, is what is provided as the “binary artifact” for HDL assemblies, as installed in an OpenCPI runtime component library as referenced by the OCPI\_LIBRARY\_PATH environment variable. It is the format expected by the internal mechanisms to load bitstreams onto platforms at runtime.

### 6.3.3 *HDL device naming*

HDL devices that can support OpenCPI containers have unique names within an OpenCPI system context. Each name starts with a prefix indicating how the device is discovered and controlled by OpenCPI software. Note that “device” in this context is an instance of an HDL platform in an OpenCPI system, rather than a device inside of an HDL platform attached to the FPGA chip.

The mechanisms currently supported are:

#### **PCI**

FPGA devices/boards accessible by PCI express

#### **Ether**

FPGA devices accessible via link-layer Ethernet

#### **Sim**

FPGA devices that accessible via UDP that are in fact simulation servers (see “simulation devices” below)

#### **UDP**

FPGA devices that are accessible via IP/UDP

#### **PL**

FPGA device in a Zynq SoC accessible via the on-chip AXI interconnect.

The full device name is of the form:

`<control-scheme>:<address-for-control-scheme>`

As a convenience, if there is no known prefix in the name, then if there are 5 colons in the device name, it is assumed to be an **Ether** device name, otherwise it is assumed to be a **PCI** device name.

#### 6.3.3.1 *PCI-based HDL devices*

HDL devices on the PCI express bus/fabric are identified by the syntax common to many PCI utilities such as “lspci” on Linux, namely:

`<domain>:<bus>:<slot>.<function>`

An example is:

`0000:05:00:0`

Since it is common to have “domain”, “slot”, and “function” all being zero, if the address field in the device is simply a number with no colons, it is assumed to be the bus number with the other fields being zero. Thus “`pci:5`” implies “`pci:0000:05:00.0`”. Since an identifier with no prefix and not having 5 colons is assumed to be a PCI device, the identifier “4”, is assumed to be “`pci:0000:04:00.0`”.

A common example of a PCI device is the Xilinx ML605 development board. Another is the Altera Stratix4 development board (called `alst4` in OpenCPI).

#### 6.3.3.2 *Ethernet-based HDL devices*

HDL devices that are attached to Ethernet and operate at the link (or MAC) layer (OSI layer 2) use the “ether” prefix. This prefix implies access without using any routing or transport protocols such as IP/UDP or IP/UDP. It is the fastest and lowest latency way to use a network, with the drawback that it cannot be “routed” through IP routers, but can only be “switched” by L2 Ethernet bridged and switches. The syntax for naming such devices is:

`Ether:[<interface>/]<mac-address>`

The MAC-address is the typical 6 hexadecimal bytes separated by colons, such as:

`c8:2a:14:28:61:86`

The optional `<interface>` value is the name of a network interface on the computer accessing the device. Typical examples are “`en0`” or “`eth0`”. It is optional when there is only one such device. On systems with multiple interfaces it indicates which one should be used to reach the device. This is needed since there is no routing at this level of the network stack: you must use the right network interface to reach the addressed device.

The available network interfaces can usually be identified by the “`ifconfig`” linux command. There is a more special purpose command of the **`ocpihdl`** utility described below that lists only the network interfaces available and usable for OpenCPI (the “`ethers`” command to **`ocpihdl`**).

#### 6.3.3.3 *Simulator device naming*

OpenCPI runs HDL simulators in a “server” that makes them look like any other device to software. When they are available (the server is running), they are discoverable like any other device. They are accessed using the IP/UDP network protocol, and thus use IP addresses. They are named according to this syntax:

`Sim:<IP-address>:<IP-port>`

The IP-Address is the typical “dotted” notation (e.g. “`10.0.1.100`”), and the IP port is a decimal number less than 65536.

## 7 HDL Simulation Platforms

## 8 The *ocpihdl* command-line utility for HDL development

The *ocpihdl* utility program performs a variety of useful functions for OpenCPI HDL development. These include:

- Searching for available FPGA devices (via PCI, Ethernet, UDP, etc.)
- Testing the existence of a specific FPGA device
- Reading and writing specific registers in an FPGA device
- Running a simulator server that executes simulators acting as FPGA devices
- Loading bitstreams on a device
- Extracting the XML metadata from a running device

The general syntax of *ocpihdl* is:

```
ocpihdl [<options>] <command> [<options>] [<command arguments>]
```

Options are the typical hyphen-letter options, some with arguments after the hyphen-letter argument. The following sections describe each command and its associated options and arguments. Here are the options that apply to many commands:

- v be verbose – put progress messages on standard output
- d <device>  
specify the HDL device for the command. See “HDL device naming” above.
- p <platform>  
specify the HDL platform type (e.g. ml605) for the command
- l <log-level>  
specify the OpenCPI logging level that should apply during execution
- P produce “parseable” output for some commands that read registers
- i <network-interface>  
the network interface (e.g. “en0”) to use for the command

The commands provided by the *ocpihdl* utility are summarized in the following table.

| Name      | Device | Worker | Interface | Description                                      |
|-----------|--------|--------|-----------|--------------------------------------------------|
| admin     | ✓      |        |           | Dump HDL device admin information                |
| bram      |        |        |           | Create BRAM file from input (XML) file           |
| deltatime | ✓      |        |           | Measure round-trip time for time synchronization |
| dump      |        |        |           |                                                  |
| emulate   |        |        | ✓         | Emulate a UDP or Ethernet device (admin space)   |

| Name     | Device | Worker | Interface | Description                                                 |
|----------|--------|--------|-----------|-------------------------------------------------------------|
| ethers   |        |        | ✓         | List available, up, and connected Ethernet interfaces       |
| probe    | ✓      |        |           | See if a specific device exists and responds                |
| load     | ✓      |        |           | Load a bitstream onto the device                            |
| getxml   | ✓      |        |           | Extract XML metadata from running device                    |
| radmin   | ✓      |        |           | Read specific addresses in HDL device admin space           |
| reset    | ✓      |        |           | Reset device (via control plane access)                     |
| rmeta    | ✓      |        |           | Read metadata space from device                             |
| search   |        |        | ✓         | List all discovered and responding HDL devices              |
| settime  | ✓      |        |           | Set the device's GPS time to the system time                |
| simulate |        |        |           | Run a simulator server                                      |
| unbram   |        |        |           | Create an XML file from a BRAM file                         |
| wadmin   | ✓      |        |           | Write admin space                                           |
| wclear   | ✓      | ✓      |           | Clear worker status errors                                  |
| wdump    | ✓      | ✓      |           | Dump/print worker control/status registers (not properties) |
| wop      | ✓      | ✓      |           | Perform control operation on worker (e.g. start)            |
| wread    | ✓      | ✓      |           | Read worker configuration property space                    |
| wreset   | ✓      | ✓      |           | Assert reset for worker                                     |
| wunreset | ✓      | ✓      |           | De-assert reset for worker                                  |
| wwctl    | ✓      | ✓      |           | Write worker control registers                              |
| wwpage   | ✓      | ✓      |           | Write worker page register (to reach full 32 bit space)     |
| wwrite   | ✓      | ✓      |           | Write worker configuration property space                   |

Table 1: The *ocpihdl* Commands

### **8.1 admin command – Print the device’s administrative information**

This command is used to dump all the information and state in the devices “admin” space, which is the information for the device and bitstream as a whole. This information is specified in the OpenCPI HDL Infrastructure Specification document. A device flag must be specified. There are no command arguments.

## **8.2 bram command – Create a configuration BRAM file from an XML file**

This command converts an XML file that is expected to contain the “artifact description XML” into a format that will be processed into a read-only BRAM inside the FPGA bitstream. This allows software to know what is inside a bitstream by only looking at the device, without needing a separate bitstream file. This command is used by the scripts that create bitstreams. The output format is an ASCII format that the tools can use to initialize memories. The two command arguments are <input-file> and <output-file>, e.g.:

```
ocpihdl bram my.xml mybram.bin
```

### **8.3 *deltatime command – Perform time synchronization test on device.***

This command, which requires a device option, uses special time-difference hardware in the OpenCPI FPGA bitstream to measure the round-trip time for accessing the FPGA in order to reduce the time-skew between the system time-of-day and the FPGAs time-of-day. It takes 100 samples, averages them, eliminates 10% outliers, and then retests the time-skew after applying the correction.

#### ***8.4 dump command – unimplemented***

This command, which requires a device option, uses the configuration BRAM to dump the state of all workers and their properties.

## **8.5 *emulate* command – emulate a network-based device (admin space only)**

This command acts as an Ethernet or UDP-based HDL device and responds to discovery and admin-space accesses. It is used to test software controls and network connectivity. It can optionally be supplied with a network interface option (-i) to specify on which network interface should the emulated device appear. With no network interface specified, it uses the first available that it finds (that is “up” and “connected”). A special case is the network interface whose name is “udp”, which means the command emulates an OpenCPI HDL device attached to the IP subnet of the host computer implementing discovery and control via UDP, e.g.:

```
ocpihdl -i udp emulate
```

## **8.6   *ethers command – display available (up and connected) network interfaces***

This command is used to list all the available Ethernet network interfaces on the system and whether they are “up” and “connected”. It also shows what the “default” interface is (the first listed that is up and connected), for commands that could take an interface (-i) option.

## **8.7 probe command – test existence and availability of device**

The probe command takes a device (as option or argument) and tries to contact it and see if it is responding. This should work whether it is running an application or not.

### **8.8 *load command – load bitstream***

The load command takes a device (as option or argument) and a bitstream file name argument and loads the bitstream onto the device.

## **8.9 *getxml command – retrieve XML metadata from device***

The getxml command takes a device (as option or argument) and a filename to write the XML data to. It retrieves and uncompresses the XML data stored in the running device and writes it to a file.

### **8.10 radmin command – read a specific address in device’s admin space**

This command reads an individual word of data from the admin space of the specified device. The command argument is the hexadecimal (starting with 0x) or decimal offset in the admin space. If the offset ends in “/n” where n is 4 or 8, then that specifies the size of the access in bytes. If there is no “/n”, then the access is a 32 bit access.

If the parseable option (-P) is specified the output is just the value returned in hexadecimal format. Otherwise a prettier message with the offset and value is printed. An example that does a 2 byte read of offset 12 would be:

```
ocpihdl -d pci:5 radmin 12/2
```

The layout of the admin space is defined in the OpenCPI HDL Infrastructure document.

### ***8.11 reset command – perform soft reset on device***

This reset command resets the device in a way that does not affect the control path to the device. For example on a PCI-based device, it would not damage or reconfigure the PCI express interface. All application workers and most other (infrastructure) workers are placed in reset and need to be specifically taken out of reset. [unimplemented].

### ***8.12 rmeta command – read specific addresses in the metadata space of the device***

This command works exactly like the **radmin** command except it reads the configuration BRAM space rather than the admin space.

### ***8.13 search command – search for all available HDL devices***

This command searches for all reachable HDL devices and reports what it finds. It uses all the supported control paths (PCI, Ether, UDP/Sim). If an interface (**-i**) option is specified it limits the Ether search to that one interface.

### **8.14 settim command – set device's time from system time**

This command sets the time on the device from the current system time. It requires a device to be specified. The current time of the device is shown in the output of the **admin** command.

## **8.15 simulate command – run a simulation server for a specific simulator “platform”**

This command creates a simulated HDL device that is discoverable (by the “ocpihdl search” command, by the “ocpirun” utility, and via the Application Control Interface (ACI) C++ API). It is thus an available HDL device for running assemblies of workers for such a device, compiled into “bitstreams”. When “ocpihdl simulate” is running, the device is ready to “load and run a bitstream”, where the “bitstream built for the device” is in fact a “simulation executable” for running under a particular FPGA simulator (such as Modelsim or Istim).

This command initially emulates the device (similar to the “emulate” command, except as a “sim” type device) and is thus discoverable before loading any particular simulator executable. The platform option (-p) is required to specify which type of simulator should the simulated device use as its “platform”. The current choices are isim (Xilinx ISE), xsim (Xilinx Vivado), and modelsim (Mentor). [xsim currently non-functional due to limitations reported to Xilinx].

When some OpenCPI application (e.g. executed using the **ocpirun** utility command) decides to run an assembly of workers on this simulator-based device (as it would with any other discovered and available HDL device/FPGA), it would request that the bitstream (executable) be “loaded” and “started” on this device. This would cause this simulated HDL device to run the actual simulator (e.g. modelsim) with that executable. If the application was re-run, it would notice that the “bitstream” was already loaded and continue to use it, which would in fact re-use the device in the same simulation run.

If the simulated device is currently running when requested to “load a new bitstream”, it shuts down the simulator, preserving its results, and starts a new one with the newly requested bitstream, in a newly created subdirectory.

Each time a simulator is actually run under this “server”, it will execute in a new subdirectory created for that simulation run, with the name:

<assembly-name>.<sim-platform>.<date-time>

Thus running the “ocpihdl simulate” command in a directory will result in one or more subdirectories holding simulation results for each simulation run. If this simulated device is never asked to “load and run” a bitstream, no subdirectories will be created, even though the device will still be discoverable while this command is running.

The simulation run starts with the first successful “load and run” of a bitstream. Each simulation will continue until one of the following occurs:

- The code being simulated explicitly asks for the simulation to terminate via the \$finish system task in Verilog or an assertion in VHDL.[check VHDL details].
- The simulation run exceeds the control-plane clock cycle count provided by the ‘-T’ option to “ocpihdl simulate”
- This command receives a control-C.
- The simulated device is asked to load and run a new bitstream.

The results of any simulation run can be viewed by opening the file that records the execution with the viewer associated with the simulator being used. For Xilinx “isim”, the command (in the subdirectory for the simulation run) would be:

```
isimgui -view sim.wdb
```

For modesim, it would be:

```
vsim [fix this]
```

A separate command utility, “ocpiview” is used to conveniently view the results of simulation. If run in the same directory as “**ocpihdl simulate**”, it finds the most recently created simulation subdirectory, and runs the appropriate viewer based on which simulator was used. It can be given an optional argument specifying the directory name of the simulation to view. Thus a typical sequence is to run

```
ocpihdl simulate
```

and after that completes (based on control-C or other action), run

```
ocpiview
```

to look at the results.

### **8.16 unbram command – create an XML file from a config BRAM file**

This command reverses the function of the **bram** command, by converting a file formatted for initializing (during build) the configuration bram in a bistream back to the original XML file. The two command arguments are <input-file> and <output-file>, e.g.:

```
ocpihdl unbram mybram.bin my.xml
```

### **8.17 wadmin command – write specific addresses in the device’s admin space**

This debug command writes a 4-byte/32-bit or 8-byte/64-bit value in the device’s admin space at the specified offset. The size in bytes (4 or 8) is optionally specified in the first argument with a slash. The default size is 4. The syntax is:

```
ocpihdl wadmin -d <hdl-device> <offset>[/<size-in-bytes> <value>]
```

The value can be in hex (preceded by 0x) or decimal. An example, to write an 8 byte value at offset 0x20 with the value 12345, would be

```
ocpihdl wadmin -d PCI:5 0x20/8 12345
```

## **8.18 Worker Commands: commands that operate on individual workers**

The following worker commands operate on either a single worker or a sequential range of workers. They all require a device to be specified, and the first argument is either a single worker number or a set of worker numbers separated by commas. Thus to perform a worker command on worker 5 would be:

```
ocpihdl w<cmd> -d pci:5 5 <worker command args if any>
```

To perform the command on workers 2, 5, and 6, would be:

```
ocpihdl w<cmd> -d pci:5 2,5,6 <worker command args if any>
```

The worker commands are:

### *8.18.1 wclear command – clear a worker’s error registers*

Each worker has a set of control and status registers that are part of the control-plane infrastructure IP (not in the worker itself). This command clears the error and attention bits in the worker’s status register. A worker’s status can be displayed by the “wdump” command. An example for clearing error and attention status for worker 5 is:

```
ocpihdl wclear -d PCI:5 5
```

### *8.18.2 wdump command – dump a worker’s status registers (not properties)*

Each worker has a set of control and status registers that are part of the control-plane infrastructure IP (not in the worker itself). This command displays the current status of the worker by dumping these registers.

### *8.18.3 wop command – perform a control operation on a worker*

This worker command executes a control operation for the worker, directly accessing the hardware that makes the control operation request of the worker. Only the “start” operation is implemented on all HDL workers. Any others may have unpredictable or erroneous results when requested on a worker that doesn’t implement them. The available operations are:

- **initialize** – after reset is deasserted (see wunreset), request worker initialize itself
- **start** – put the worker into an operational state, after stop or initialize
- **stop** – suspend operation of the worker
- **release** – return the worker to the “pre-initialized” state
- **test** – run the worker’s built-in test
- **before** – inform worker that a batch of property settings will happen
- **after** – inform a worker that a batch of property reads has completed

### *8.18.4 wread command – read a worker’s configuration property space*

This command reads a value from the worker’s property space. The first argument is the offset in the worker’s property space (in bytes), with an optional size-in-bytes for the

access (1, 2, 4 or 8), and the second (optional) argument is how many sequential accessed to make. The default for the size is 4, and the default for the second argument is: 1. E.g., to read 3 single bytes at offset 6 from worker 11 would be:

```
ocpihdl read -d 5 11 6/1 3
```

See the “**wwpage**” command for workers whose property space is larger than 1 MByte ( $2^{20}$ ).

#### *8.18.5 wreset command – assert reset for a worker*

This command asserts the control reset signal into the worker. It stays asserted until the **wunreset** command is used.

#### *8.18.6 wunreset command – deassert reset for a worker*

This command deasserts the control reset signal into the worker, after which the “**wop initialize**” command can be issued (or, if the worker does not implement the “initialize” command, the “**wop start**” command can be issued).

#### *8.18.7 wwctl command – write a worker’s control register*

This command writes the worker’s control register, which is in the control plane infrastructure IP, not in the worker itself. The argument is a 32 bit value. The bit definitions are described in the “OpenCPI HDL Infrastructure” document.

#### *8.18.8 wppage command – write a worker’s window/page register*

A worker’s property space can be a full 32 bit space (4 GBytes), but to access more than the first 1MByte ( $2^{20}$  bytes), a “window” register in the control-plane infrastructure must be set. This command sets that register with the value in the first argument, which sets the high order 12 address bits (31:20) of the effective address when the **wread** and **wwrite** commands are issued. The offset in those commands supplies bits (19:0) of the effective address. Thus to read location 0x12345678 in worker 7 with a full 4GByte property space, two commands would be used:

```
ocpihdl wppage -d pci:5 7 0x123  
ocpihdl wread -d pci:5 7 0x45678
```

#### *8.18.9 wwrite command – write a worker’s configuration property space*

This command writes a single value into a worker’s property space at an offset (and size) specified in the first argument and a value specified in the second argument. The **wwpage** command applies to this command also, for workers with large property spaces. Thus to write location 0x20 in worker 6 with the 64-bit value 0x123456789abc:

```
ocpihdl wwrite -d pci:5 6 0x20 0x123456789abc
```

## 9 HDL Platform and Device development

Developing HDL platform workers and device workers is described in a separate ***OpenCPI Platform Development Guide*** document. It covers:

- Platform workers
- Platform configurations
- Device Workers
- Slot types
- Slots on platforms.
- Cards for slots
- Constraints

## **9.1 Outputs from `ocpigen`**

During the component implementation process, *for all authoring models*, `ocpigen` takes as input the XML description files, and generates these outputs:

**Instantiation Header File:** this read-only file is what is required for instantiating the worker in a container (or sometimes in test harnesses). It is required for some authoring models (e.g. HDL Verilog empty module or VHDL component) and not for others (e.g. RCC). It is sometimes also used in compiling the implementation (e.g. HDL Verilog, but not VHDL). It contains the definitions that containers require to instantiate the workers, and no more.

**Implementation Header File:** this read-only file is the basis for compiling the worker implementation source code, and has definitions beyond what is available from the Instantiation Header File, including various convenience definitions generated from the XML description files. It is required for compiling the worker, is not expected to be modified by the implementer, and contains as much many conveniently generated definitions as possible to minimize redundant and error prone efforts in the source code. Whenever possible, definitions, declarations, macros, etc. are defined here to minimize the lines of code in the Implementation Skeleton file (below), since that is edited by hand and any definitions are subject to hand editing mistakes and merge issues.

**Implementation Skeleton File:** this generated file is expected to be edited and contains a “skeleton” of the implementation, based on the XML description files. It is compilable as is, and is generally “functional” in most authoring models, meaning that it can be compiled and included in real applications, but will not do anything interesting or useful. This is generally helpful to test the integration into applications before the implementation skeleton is “fleshed out” with real functional code (a.k.a. “business logic”).

The **ocpigen** tool, for each authoring model, tries to put as much code into the implementation header file as possible since it is read-only and thus can be easily overwritten/regenerated by the tool. This minimizes the problems and challenges associated with changes in the XML description files (like adding a configuration property to an OCS), since the regenerated code will usually not invalidate the skeleton file (which has been edited by hand). In such cases any changes in the generated skeleton code (which are rare), can easily be compared against previously skeletons.

## 9.2 How **ocpigen** is used

The makefile scripts included in the CDK normally run **ocpigen** in the appropriate way with the appropriate options, but to run ocpigen as a command line tool separate from the CDK makefile scripts, the syntax is:

```
ocpigen [options] owd.xml
```

The owd.xml file is the worker description XML file, which contains or refers to an OCS (OpenCPI Component Specification) file. The options specify what output to produce as well as other options. All options are single letter flags following a hyphen. When options require a filename or other string, they follow as separate arguments, not directly following the flag “letter”.

The options that tell ocpigen what output to produce are:

- d Produce the *instantiation header file* (the “defs” file), which will have the same name as the OWD with the suffix “\_defs” followed by an appropriate file name extension for included files of the authoring model (e.g. “.h” for C code, “.vh” for verilog code etc.)
- i Produce the *implementation header file* (the “impl” file), which will have the same name as the OWD with the suffix “\_impl” followed by an appropriate file name extension for included files of the authoring model.
- s Produce the *implementation skeleton file* (the “skel” file), which will have the same name as the OWD with the suffix “\_skel” followed by an appropriate file name extension for the source code of the authoring model.
- A Produce the runtime metadata (“artifact XML” file), which will have the same name as the OWD with the suffix “\_art.xml”. This file is not used by itself directly but is attached to the runtime binary file that implements some workers.

Any number of these options can be used together and all the requested files will be generated. The other options which control ocpigen are:

- D **outDir**  
Specify the directory where the generated files are placed, rather than in the current working directory where ocpigen is being run.
- I **includeDir**  
Add a directory to search when finding XML files that are referenced from any XML input files (or files they reference). This option is commonly used to specify the location of “spec” files when processing OWD files, since they typically do not (and should not) be absolute or relative pathnames. This is similar to the same gcc option.
- M **dependencyFile**  
Specify a file to be produced for processing by “make” in which to put any dependencies discovered (via xi:include in XML files) or generated (when output files depend on input files). Such files are then referenced from

Makefiles (if the files exist – which they will not before the first run of make in that directory).

Several options only apply when artifact XML files are generated:

**-P *platform***

Specify the platform name that the artifact is built for. This is used at runtime to make sure the binary file is used on the appropriate platform.

**-e *device***

Specify the device (or “part”) name that the artifact is built for. This is used at runtime to make sure the binary file is used on the appropriate device.

**-D *outDir***

Specify the directory where the generated files are placed, rather than in the current working directory where ocpigen is being run.

Finally, several options are special operations for the HDL authoring model. These are:

**-l *libraryName***

Specify a library for VHDL or Verilog “defs” files that indicates where the entity declaration should be placed.

**-c *containerDescriptionFile***

Specify the name of an XML file that describes the (fixed) container interfaces when generating HDL artifact files. I.e. this option is required when the “-A” option is used for HDL workers. The contents of this file are defined in the [HDLAMR]. This is required when the –A option is used to generate the HDL artifact XML file.

**-b** Generate a BSV (Bluespec System Verilog) declaration corresponding to an underlying Verilog worker. This is used when integrating HDL workers written in Verilog into a container written in BSV. The output filename will have a “l\_” prefix and a “.bsv” file extension. [experimental].

**-w** Generate a cross-language “wrapper” file for the “defs” file. If the worker implementation language is Verilog, the wrapper file is VHDL and vice versa. [unimplemented]

**-a** Generate the “assembly” file, given an input XML file that specifies an HDLAssembly rather than an HDLImplementation. Thus the input is not really a worker description (OWD), but rather a composition of workers that represent a static configuration inside an FPGA design. The output is a Verilog file that instantiates the workers in the assembly, interconnects them (with required adaptations/tie-offset etc.), and creates the higher level “application” module that will be used to build a complete FPGA design bitstream.