



# Aspects of Embedded Systems Design and Methodology

Embedded Systems Design Course, 2020-12-02

Tiberiu Seceleanu

1

## Outline

- 1. • Embedded Systems: what ?
- 2. • Platforms
- 3. • Design: flow(s), challenges, solutions, results
- 4. • “New” technologies
- 5. • Recent research and advances
- 6. • Questions ?

2

## Embedded systems ?

- Computing systems are all-around
- Still, most of us think of “desktop” hardware and the related applications
  - PC’s
  - Laptops
  - Mainframes
  - Servers
- But...
  - Look into your pockets !



3

## Embedded systems

- Embedded computing systems
  - Computing systems embedded within electronic devices
    - That is, HW and SW !
  - Billions of units produced yearly, versus millions of desktop units
  - Tens per household
- Application areas:
  - Communication
  - Computer Peripherals
  - Industrial Control and Automotive
  - Consumer Electronics
  - Test and Measurement
  - Medical
  - Military/Aerospace
  - **ETC !!!**
- Hard to define. Nearly any computing system other than a desktop computer
  - Cell-phones
  - Automatic Teller Machine
  - The Digital Interfaced Gasoline Station
  - Airborne Flight Control System
  - Automotive Engine Health Monitoring System
  - Home Security Systems
  - Modern Air-conditioners
  - Washing Machines
  - Medical Equipment
  - DVD Players
  - Printers
  - Medical Equipment

4

## In general, embedded systems

- Are single / few - functioned
  - Execute a single (set of) program(s), repeatedly, or whenever requested / necessary
- Are highly - constrained
  - Low cost (**large volumes**),
  - Low power (**many operate on batteries**),
  - Small (**many are man-portable**),
  - Fast (**may require high performance**), **etc.**
- Usually are reactive, sometime also with real-time requirements
  - Continually reacts to changes in the system's environment
  - Must compute certain results with deterministic latency

5

## Process Automation Domain



6

## Case Study

### Wind Turbine Controller



7

## Case Study

### BLDC Motor



8

## Industrial Processing Platform



9

## Platform ?



10

## Vision



11

## Vision



12

## Things to be considered... (one out of very many !)



13

## What we will design What we can / want to design

The key to embedded systems design is gaining control of the interplay between computation and [...] constraints to meet a given set of requirements on a given implementation platform.  
Thomas A. Henzinger, Joseph Sifakis. **The Discipline of Embedded Systems Design**, Computer, October 2007, pp. 32-40.



14

## Selection?



15

## High level, un-specific pressure factors (non-exhaustive)



16

## Design process



17

## Y-Chart approach (and design refinement)



18

## Application ↔ Platform mapping



19

## One way to go - ACG



20

## A small example – a wind turbine controller What is the flow ?



21

## A small example

**Contents**

[Summary](#)

[Clock Summary](#)

[Resource Utilization Report](#)

[High-level Resource Report](#)

[Target-specific Report](#)

[Optimization Report](#)

[Distributed Pipelining](#)

[Streaming and Sharing](#)

[Target Code Generation](#)

[Traceability Report](#)

No

```

64 BEGIN
65 -- <S2>/Embedded MATLAB Function
66 u_Embbeded_MATLAB_Function : Embedded_MATLAB_Function
67 PORT MAP(u => Reference_Signal, -- int16
68 yPhase => selector - int8
69 ); 71 selector_signed <= signed(selector);
70
71 Reference_Signal_signed <= signed(Reference_Signal);
72 -- <S2>/Constant
73 Constant_out1 <= to_unsigned(16384, 16);
74
75 -- <S2>/Subtract
76 Subtract_sub_cast <= signed(resize(Constant_out1(15 DOWNTO 14), 16));
77 Subtract_out1 <= Subtract_sub_cast - Reference_Signal_signed;
78
79 -- <S2>/Constant
80 Constant1_out1 <= to_unsigned(32768, 16); 85 -- <S2>/Subtract
81 Subtract1_sub_cast <= signed(resize(Constant1_out1(15 DOWNTO 14), 16));
82 Subtract1_out1 <= Reference_Signal_signed - Subtract1_sub_cast;
83
84 -- <S2>/Multiport Switch
85
86 afterMux <= Reference_Signal_signed WHEN selector_signed = 1 ELSE
87 Subtract_out1 WHEN selector_signed = 2 ELSE
88 Subtract1_out1;
89
90 -- <S2>/Even Triangle LUT
91 Even_Triangle_LUT_sub_temp <= afterMux - 1;
92
93 Even_Triangle_LUT_k <= to_unsigned(0, 4) WHEN afterMux <= 1 ELSE
94 to_unsigned(15, 4) WHEN afterMux >= 16 ELSE
95 unsigned(Even_Triangle_LUT_sub_temp(3 DOWNTO 0));
96 solution <= not_to_integer(Even_Triangle_LUT_k);
97
98 Triangular_Wave <= std_logic_vector(solution);
99
100 END rt;
```

Traceability links

Is this simple enough?

22

## Compliance - original



23

## Compliance - changes



24

## Flow



25

Consider a HW design here...  
What is wrong?



26

26

## Demo Components

### Controller\_TestBench.m

- Defines the input state (types and values)
- Proposes a word length and a fraction point position
- fixedPointSetup.m**
  - Simulates the model
  - Provides "optimal" value for the fraction point
  - Applies this to all elements requiring it (product, sum, etc)
- Generating HDL code. Can be automated, too.
- Input: "file.mdl"
- Output: "file\_fp.mdl"

```

1 - clear, clc
2 -
3 - % Global settings
4 - Ts = 1e-6;
5 - N = 1e5;
6 - t = (1:N)*Ts;
7 -
8 - %% Double precision inputs
9 - nt = numerictype('double');
10 - interfaceSpec;
11 -
12 - %% Fixed point set up
13 - w1 = 18; % Input data type
14 - prodw1 = 18;
15 - accw1 = 24;
16 -
17 - % Automatic scaling
18 - fixedPointSetup('LineController',w1,prodw1,accw1);
19 -
20 - % Quantize input data
21 - nt = numerictype(1,w1);
22 - interfaceSpec;
23 -
24 - % Controller test fpDUT
25 - %% Generate HDL code
26 - dut = 'LineController_fp/DUT';
27 - tlang = 'vhdl';
28 - tdir = ['hdlsrc' ' ' tlang];
29 - makehdl(dut, 'targetLang', tlang, 'targetDir', tdir);

```

Matlab script

Global settings  
 Double precision inputs  
 Fixed point set up  
 Generate HDL code

27

## HW / SW Approach



28

Flow



© ABB Group  
December 2, 2020 | Slide 29

29

## A small example



30

## Results

| Device: xc6slx150t-3fgg484       |                        |              |                     |
|----------------------------------|------------------------|--------------|---------------------|
| Feature                          | Existing design (VHDL) | HDL-C: Sol.1 | Comments            |
| Number of Slice Flip Flops       | 120 (0%)               | 422 (1%)     |                     |
| Number of Slice LUTs             | 478 (0%)               | 1158 (1%)    |                     |
| Number of RAMB8/16BWERs          |                        | 2(1%)        |                     |
| Number of DSP48A1s (multipliers) | 2 (1%)                 | 21 (11%)     |                     |
| Frequency (MHz)                  | 64.5                   | 17.6         | 40/80MHz required ! |

31

## Flow



32

## Results

| Device: xc6slx150t-3fgg484       |                        |              |           |                     |
|----------------------------------|------------------------|--------------|-----------|---------------------|
| Feature                          | Existing design (VHDL) | HDL-C: Sol.1 |           | Comments            |
| Number of Slice Flip Flops       | 320 (0%)               | 422 (1%)     | 635 (1%)  |                     |
| Number of Slice LUTs             | 478 (0%)               | 1158 (1%)    | 1024 (1%) |                     |
| Number of RAMB8/16BWERs          |                        |              | 2(1%)     |                     |
| Number of DSP48A1s (multipliers) | 2 (1%)                 | 21 (11%)     | 13(7%)    |                     |
| Frequency (MHz)                  | 64.3                   | 17.8         | 34.4      | 40/80MHz required ! |

33

## Results

| Device: xc6slx150t-3fgg484       |                        |              |               |                     |
|----------------------------------|------------------------|--------------|---------------|---------------------|
| Feature                          | Existing design (VHDL) | HDL-C: Sol.1 | HDL-C: Sol. 3 | Comments            |
| Number of Slice Flip Flops       | 320 (0%)               | 422 (1%)     | 635 (1%)      |                     |
| Number of Slice LUTs             | 478 (0%)               | 1158 (1%)    | 1024 (1%)     |                     |
| Number of RAMB8/16BWERs          |                        |              | 2(1%)         |                     |
| Number of DSP48A1s (multipliers) | 2 (1%)                 | 21 (11%)     | 13(7%)        |                     |
| Frequency (MHz)                  | 64.3                   | 17.8         | 34.4          | 40/80MHz required ! |

34

## Coupling / Legacy



35

35

## Wrapper and code

```
entity fpga_top is
port(
    -- global signals -----
    i_reset      : in std_logic;      -- reset
    i_clk        : in std_logic;      -- clock
    i_clk_en     : in std_logic;      -- clock enable
    o_ce_out     : out std_logic;     --
    o_Blk        : out std_logic;     --
    -- input interface -----
    i_input_dram_waddr : in std_logic_vector(4 downto 0); --
    i_input_dram_data  : in std_logic_vector(31 downto 0); --
    i_input_dram_we    : in std_logic;      --
    -- output interface -----
    i_output_dram_raddr : in std_logic_vector(4 downto 0); --
    o_output_dram_q   : out std_logic_vector(31 downto 0) --
);
end entity fpga_top;
```

```
Controller_Instance: RC
port map(
    clk          => i_clk,
    reset        => i_reset,
    clk_enable   => i_clk_en,
    index        => input_reg(0)(15 downto 0),
    ce_out       => output_reg(0)(0),
    ref_angle_frep  => input_reg(1),
    .....);
);
```



36

36

## Flow



37

## Recap Flow



38

## Technology: Multiprocessing



39

## Technology: Multiprocessing



40

## On-Chip Platform, Multiprocessing, Migration



41

## Dual Core Usage – A Staged Approach



42

## HW/SW Partitioning

Gaetana Sapienza – "HW/SW Partitioning methodology for embedded applications using multiple criteria decision analysis", PhD thesis, MDH nr. 210, Nov. 2016.

Early start of design separation into hardware design flow and software design flow



### Underestimation of the partitioning decision process

(Unmature and hasty decisions, shadowed by model-based tool trend, inaccurate estimation of project and application constraints)

43

## Simplified Typical Development Process



44

## Requirements Elicitation

List of key requirements and project development constraints

- The maximum allowed execution time shall not exceed 30 microsec
- The power consumption shall not exceed 1W
- The design phase shall be completed within 3 months
- The entire project shall be completed within 6 months
- The application shall provide a Safety Torque Off (STO) function
- The application shall provide a Safety Limited Speed (SLS) function
- The overall system development cost shall not exceed the cost of XXX
- The application shall be developed by using a component-based design approach
- The application shall be developed by using a model-based design approach
- The application shall be developed by using Matlab and Simulink for design and implementation
- The application shall be developed on an heterogeneous platform (Xilinx Zynq ZC702)



45

## A Possible Approach ?



46

## Application Decomposition

### 1<sup>st</sup> - STEP of the partitioning decision process



47

## Overall Partitioning Decision Process Overview 5 – Steps Process



48

## Design refinement for final deployment – BLDC Motor



49

## HW/SW Implementation

The partitioning of the application was carried out based on the input from MultiPar



50

## HW/SW Implementation



51

## HW/SW Deployment

- The selected HW/SW deployment configuration implemented on Zynq platform
- The motor properly spinning
- Safety functions also implemented
- Requirements and constraints satisfied



52

## HW / SW Partitioning (HSP) + Technology Re-targeting (TR)



53

## Technology independence aspects

### Component based design



One way to address some aspects of the component based approach -

IP-XACT:

"an electronic data book; a description of components and designs written in a standard data exchange format (XML), which is both human readable and machine processable. It describes electronic system designs and the interconnection of IP interfaces in a standard specification to provide IP suppliers, design integrators and Electronic Design Automation (EDA) vendors, with a common representation. In this way, IP-XACT provides a mechanism by which design reuse can be made a practical reality."\*

54

## Flows. Legacy & New Designs



55

## Flows. Legacy & New Designs



56

## Kactus2: an IP-XACT tool



57

## Simulink vs. Kactus2



- Language
- Specification & Design
- Selection of platform(s)
- ACG ("HLS" ?)
- Integrated environments
- Legacy ?
- Versioning ?
- ...
- Include CPU ?
- "Place" SW on CPU ?
- Connect SW to HW ?
- How to build the separate projects ?
- Keep track of various potential versions of the SW / HW ?
- "Link" to Xilinx / Altera / etc. tools ?
- "HLS" potential ?
- ...

58

29

## BLDC Motor: Kactus2



59

## Kactus2 vs. VSI



```

1 #include "cabs.h"
2 void filter(uint16_t source, uint16_t dest, uint16_t mask, uint16_t val)
3 {
4     *HALL = *HALLS_0 + *HALLS_1 * 2 + *HALLS_2 * 4;
5 }
6
7 /* this function will be executed by the VxT runtime everytime
8 * any data arrives on either HALL_0 or HALL_1 or HALL_2 *
9 * void filter(uint16_t source, uint16_t dest, uint16_t mask, uint16_t val)
10 * {
11 *     Hall* address = &HALL[dest];
12 *     if(address == &HALL_0)
13 *         hall1 = *address <> 16;
14 *     static uint16_t hall0;
15 *     static uint16_t hall1;
16 *     static uint16_t hall2;
17 *     if(*HALL_0 == empty0) hall1 = *HALL_1.read0();
18 *     if(*HALL_1 == empty1) hall1 = *HALL_1.read1();
19 *     if(*HALL_2 == empty2) hall2 = *HALL_2.read0();
20 *     filter(*HALL_0, hall1, hall2);
21 *     Hall.write(hall);
22 * }

```

**Figure 9.** *FilterModule* description in HLS C.



60

## BLDC Motor: VSI



61

## Desired Features / Capabilities



62

## Dealing with interfaces - Visual System Integrator



63

## Dealing with interfaces - Visual System Integrator



64

Open for

## Questions

# Thank you !

65

## Recent developments\*

J areddihdfkde bñv#srhvñbz #fkdñqj hv#q#kh#  
p rghuñhp ehggghg#v|vhp v#HV, #ghyhar sp hqw#

- G liuhqwu#wdggdugv
- Yduñlpj #cshudlqj #frag lirqv
- G liuhqwu#fxvwp hñphhg

Wkhvh#fkddñqj hv#qwrgxfh#kh#variability suredp =

- Vriz duh#lqj ljhhuñj #vrowlrq#Vriz duh#  
Surgxfw#ljhv#VSO,
- Exw#VSO#l#Vriz duh oruhgvhg#hfkqlxh

Yduñlpj#q#kh#frqvlghhg#surhfw



**Features=**

- Vdhw#Dp lbg#Vshhg#VOV,
- Vdhw#IntxhR ii#VWR ,

Ip sdp hqwdlrcq#r#kh#VOV#dgg#WR #hsuhvhqwu#variants #  
z klh#VOV#dgg#VWR #whd#hsuhvhqwu#variation points +YS ,

\*A. Urvantsev, M. E. Johansson, N. Müllner, T. Seceleanu. *Experiencing Technology Independence*. IEESD, COMPSAC 2019

66

## Problem formulation

Sursrvhg#ghvlij q#orz =  
 ↳ Whfkqrarj | #qghshqghqfh  
 ↳ Yduble lkw  
 ↳ KVS  
 ↳ Ghshqghqf | #Uhvroylkj  
 ↳ Iqvuidflkj #Suredp

Uhvhdufk#xhwlrqv=

- **RQ1** #Krz #r#nqded#l#Whfkqrarj | #  
 bghshqghqf#ghvlij q#orz #q#kh#  
 hdaj#vdjh# #kh#ghvlij q#surfhv#  
 dqq#shuirp #kh#sdwlrqkj #  
 ghfivrq#surfhwv#l#l#l#l#l#vdjh#B
- **RQ2** #Krz #r#nqded#l#vwhp dw#  
 dqq#lifvlyh#surfhwv#xssruwlij #  
 ghvlij qhw#hiruh#lssdfdwlrq#  
 sdwlrqkj #dqq#urylh#Whfkqrarj | #  
 uh0duj hwkj B

67

## Limitations

Z h#lwurgxfh#l#Whfkqrarj | #qghshqghqf#ghvlij q#orz #  
 iru=

- Uhvroylkj #ghshqghqf hv
- E lqglkj #dulqw#r#h{ lwlkj #Sv
- Uhsafelkj #dulqw#iruh{ lwlkj #Sv
- Kdaggdjj #kh#KVS #surfhwv

P hkrgrarj | #holhg#l# lwdlrqv=

- Uhvxow#luh#g liifxo#r#j hqhuuddjh
- Wkh#wg | #Fryhw#l#l# lhg#xp ehu#irro#dagg#  
 whfkqlxhv#lssdhg#r#kh#j lyhq#surmfw
- Ip sdp# hqwdwlrq#holhg#l# lwdlrqv=
- Xvhg#irro#ghshqgv#q#[ lbg{ t#urgxfw/#kfk#  
 lwurgxfhv#l#hqgrughshqghqf | #exp
- [ lbg{ #Fryhw#B4 ( #kh#p dunhw

68

## KVS#dgg#qwhuidf bj

Kdugz duh Vriez duh#SduMrlqlj #KVS, =

- QS Okdug#suredp
- Vroghie | khxulwif /# hqhwf#dgg#Eoxwhujj # dj rulkp v
- Uhtxlhvif qwhujudwq#surfhhv
- Iqurqxfhv#**Interfacing** suredp v

Iqwhuidf bj #suredp =



Q srw# rxg# htxlh# Q fkdqj hv

69

## Ylydgr dgq#VVL

Ylydgr Ghvlijq#VxLh#KO{ =

- Vriez duh qh1bhg#S#frh#jhqhudwq# | #KOVF
- Eafn0edvhg#S0phnjudwq
- P rgh0edvhg#ghvlijq#qwhujudwq

Ylydd#V | vwhp #qwhj udwcu#YVI, =

- Turqwhqg#ru#Ylydgr
- Vroghvif qwhuidf bj #suredp #gxulqj #Okdqj bj #kh# h{hfxwlgf#rqrh{w
- Vsduw#d#hz #cq#d# jyhq#V | vwhp #q#e r#ailqv# V | vwhp #dgg#Saiwirp

70

## KOV

KOV OF =

- Dxwqrqrp rxv#udqvawrq# #D# Qedvhg#ghvfuswraq#  
lw#glj kdkdugz dhu
- Khsqv#r#dfkhyh#kh#ghvlhg#sdwirup 0  
lghshqghqfh
- J hghuawng#kdugz dhu#ghvfuswraq#ljp rvw#Edvhv#l#  
ixqfwirqda#fruhfw#exw
- □ #bxwqrqrp rxv#udqvawrq#liihfw#kh#shuirup dqfh

KOV OF #n{dp sdi=

```
void multiplier(uint21 *SPEED_IN,
                uint21 *SCALE_FACTOR,
                uint21 *MULT_OUT) {
    *MULT_OUT = *SPEED_IN * *SCALE_FACTOR;
}

void stream(hls::stream<uint21> &SPEED_IN,
            hls::stream<uint21> &SCALE_FACTOR,
            hls::stream<uint21> &MULT_OUT) {
    static uint21 speed_in;
    static uint21 scale_factor;
    static uint21 mult_out;

    if (!SPEED_IN.empty())
        speed_in = SPEED_IN.read();

    if (!SCALE_FACTOR.empty())
        scale_factor = SCALE_FACTOR.read();

    multiplier(&speed_in, &scale_factor,
               &mult_out);
    MULT_OUT.write(mult_out);
}
```

71

## • F olvvlfdo|R YP

Ghshqghqf |#n{dp sdi=

- L#WVR l#h{hfxwng#q#VZ wkhq  
SibFR QWUR OOHU vkrxg#h#h{hfxwng#q#VZ
- L#VOV l#hfxwng h#h#kh#surhf#khq  
SibFR QWUR OOHU vkrxg#h#h{hfxwng#q#KZ

R YP -R ukrjrqdoYduldehw P rghq ghvfulswraq=



72

## Ghyhørshgtarro

J X L



F rqiljxudwlrq#lb#n{dp sdi=

```
STO_PROTECTION
    context
    dependencies
        PI_CONTROLLER
    rules
        if STO_PROTECTION.context == 'SW'
            PI_CONTROLLER.context == 'SW'

SLS_PROTECTION
    context
    dependencies
        PI_CONTROLLER
    rules
        if SLS_PROTECTION.status == True
            PI_CONTROLLER.context == 'HW'
```

73

Open for

Questions

**Thank you !**

74