

# ~~27/10/2025~~ EMBEDDED SYSTEMS

VLSI PCs: many diff. tasks

dedicated functions; lim. resources

Tasks system should respond to I/P within specified time

3 types of Real Time Systems (deadline miss)

(1) Soft Real Time Systems: ~~deadline miss~~ is not catastrophic

(2) Hard: deadline very important; gaming



(3)

Firm: If deadline is missed we will just ignore it and move on.



SISD: Uni processor

SIMD, MISD, MIMD: Multiprocessor

Robotics? 100% parallelism

Interconnection Networks: Processor to memory

Linear bus: P P P P P



Adv.: recovery

Disadv: all buses do the same tasks again and again.

Features of Embedded System:

1. Special purpose; single programming, cannot be changed once chip is made.
2. Tight constraint of energy, cost, space, form factor
3. Receives input and produces result

Microprocessor which controls other subsystems is an embedded system

microcontroller based system to do specific functions, cannot be modified by end user.



Microcontroller: all components in a single chip  
Microprocessor: separate components like memory,

21/01/2025 Goal of embedded system: to perform its own functionality while designing;  $\rightarrow$  realise and

Non Recurring Cost, Unit Cost, Size, Performance, Flexibility, Maintain., Time to prototype, Time to market, Safety,



$$\text{Revenue loss} = \frac{1}{2}(2W-D)(W-D)$$

$$\hookrightarrow RL = \frac{D^2 + 3WD}{2} = \frac{D(3W-D)}{2}$$

$$\text{Revenue loss in \% : } \frac{\frac{D(3W-D)}{2}}{2W^2} \times 100\%$$

$$\text{Per unit cost} = \frac{CNRE + N \cdot C_{unit}}{N}$$

CNRE = Non recurring cost

C<sub>unit</sub> = Cost of each unit

€U

21/01/2025 in PC's RAM, ROM, I/O, external microprocessors, Components ALU, Registers, with help of bus: transfer, com. phases, microcontroller, microcomputer

Microcomputer: Memory  $\leftrightarrow$  Reg  $\leftrightarrow$  ALU  $\leftrightarrow$  I/O

Used in PC's, small program, less complex

Microprocessor: CPU on one chip, dedicated

Microcontroller: everything embedded, compressed, on one chip



Harvard & separate buses



Von Neumann: common bus

L1 Cache: separate memory for cache and L1  
L2, L3: place anywhere

Embedded system vs PC

program in memory, so separate ~~primary~~ secondary  
only some embedded have real time OS. (Not all E have OS) OS executes program

specification → system → Subsystem → Processor level design



Processor → Generic, system specific (like embedded) / Application specific (e.g. DSP, multimedia)

Microcontroller →

- counters
- timers
- D/A
- A/D

Models to develop the embedded system.

1. Big bang: we have to design everything, no specification is given.

2. Cold & fit: some requirements are given

3. Waterfall: output of one stage is given as its to next stage, there are diff stages.

4. Spiral: Each stage has feedback loop.





V-Model:

Requirement Analysis

Spec. Arch

Sys. design

SW Arch → HW design

Behavioral)

→ Structural



Geometric low-level

Processors

will be made for specific domain,

GPP (general purpose), FSP (app & syst.),

High end

Programmable

specific domain,

FSP (app & syst.)

High end



Central Paths



Data Path



Factors for Microprocessor

\* Speed \* Power \* Cost \* Size

\* Reduce time of a CC \* Reduce no. of cycles between an int.

\* Spec benchmark

\* Power

To evaluate performance of system.

Dynamic Power :  $\text{power} \propto dV^2$   
 while operating in system  
 $\text{dynamic voltage} = \text{frequency scaling} - \text{current} \propto dV^2$

Static Power : leakage power, when system is off

ASIP : uses microcontrollers (also used in embedded systems)  
 \* Some add ons like timers, clks, converted A/D  
 \* Has to read data from sensors

\* Event-driven.

\* Has program mem, data mem.

\* Specialised instruction set.

\* DSP Processors : multiply & add, vector instructions

why not general purpose processors?

But DSP does vector add "parallelly".

Design single purpose processor to execute this

$$a = 0$$

for  $i = 1$  to  $j$

$$a = a + N[i]$$



o vector  $x, y$   
 1 while ( $x \neq 0$ )  
 2 while ( $y \neq 0$ )  
 3  $x = x - y$   
 4  $y = y - x$   
 5 until ( $x = y$ )  
 6 if ( $x \neq y$ )  
 7  $y = y - x$   
 8 else  
 9  $x = x - y$   
 10 do  $x = x / 2$

single purpose processor

Control path  
and  
Data path



This is hardware

For Q, the FSM is

clocking with  $x, y$  etc



GCD :

```

while (cond)
    loop statements;
}

while (c1 & c2) {
    if (c3) {
        loop statements;
    }
}

```



FSMD : Finite State Machine with Data

Data path :  $a + b * c$

contains registers

and ALU

MUX becomes  
X can come  
either from Reg  
or line 8

X load  
Signal



The signals  $x$  and  $y$ ... go to control path.

control path



| I/P     | $N_3$ | $N_2$ | $N_1$ | $N_0$ | $x < y$ | $x \neq y$ | $g_0$ |
|---------|-------|-------|-------|-------|---------|------------|-------|
| 0 0 0 0 | X     | X     | X     | X     |         |            |       |
| 0 0 0 1 |       |       |       |       |         |            |       |

| I/P     | $N_3$ | $N_2$ | $N_1$ | $N_0$ |
|---------|-------|-------|-------|-------|
| 0 0 0 1 |       |       |       |       |

In the FSMD,  
the SR is  
state register  
which takes in  
the current  
state and gives  
next state.

1 is 0000  
2 is 0001  
like this for each  
state, addn 4 bits  
are the transitions



$\hookrightarrow$  10 states = 4 flipflops  
6 states = 3 FF



|      | AND | OR |                  |
|------|-----|----|------------------|
| PROM | E   | P  | { : fixed        |
| PLA  | P   | E  | { : programmable |
| PAL  | P   | E  |                  |
| GAL  | P   | E  |                  |



Q. Sender and receiver, sender design. Control path and data path?

we can optimise the lines 5 to 9  
as  $x = y \oplus y$   
 $x = y$ , } ①  
 $y = y$ ; } ②

Earlier there were 9 iterations for Ad, now there are just 5.

- Optimise:
- ① Change Code.
  - ② Modify FSMD, reduce states.
  - ③ Data Path.

Have a single adder instead of 2:  
Instead of  $x + y$  or  $x - y$   
One will happen at a time.  
OR Use ALU.

PLD: programmable logic devices!

Simple can modify logic  
SPLD  $\rightarrow$  PROM, PLA  
complex CPLD

FPGA = IC.  
Programmable  
generally  
reusable

(IP) Intellectual property

Control logic and datapath  
for down timer  
verilog

Register 1: 00000000  
Register 2: 00000000  
Register 3: 00000000  
Register 4: 00000000



Interconnection networks:

- \* Antifuse: non programmable, separate voltage? In
- + SRAM: programmable

connection b/w CLB : set the bit at the pos - P 0 0 0 0 means it has connection.



~~02/07/2023~~ letter.xml and tree



### XML Validation:

\* Validate XML document: 1. DTD

2. XML Schemas

#### DTD:

\* enables the XML parser to validate the given XML document

\* verifies the syntax of XML doc

\* uses EBNF grammar

\* file contains declared: idtd rules to verify XML syntax  
letter.dtd

<!-- DTD for letter.xml -->

<!ELEMENT letter (contact\*, salutation, paragraph+, closing, signature)>

<!-- define individual & child elements -->

<!ELEMENT contact (name, address, city, zip, phone)>

<!ATTLIST contact type CDATA #IMPLIED>

Occurrence Indication: + (in attr, \*: 0 or more, ?: one)

B element name

attribute of contact

char

data

process the data

simply pass data to the application

#REQUIRED

#FIXED

```

<!ELEMENT name (#PCDATA)>
<!ELEMENT address (#PCDATA)>
<!ELEMENT city (#PCDATA)>
<!ELEMENT zip (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
<!ELEMENT salutation (#PCDATA)>
<!--
    paragraph
    Closing
    signature
-->

```

passed character data: only text, doesn't contain markup

To validate the XML document

[http://www.w3schools.com/dom/dom\\_validate.asp](http://www.w3schools.com/dom/dom_validate.asp)

[http://www.w3schools.com/xml/xml\\_validator.asp](http://www.w3schools.com/xml/xml_validator.asp).

Consider the following XML tree:

Employee Database



Create an XML document for this XML tree with 5 no. of employees (emp.xml)

Also write an external DTD (emp.dtd) to validate emp.xml.

05/02/2023 ARM : Reduced Instruction Set Machine

- \* 32 bit & 16% of 32 bits for
- \* low power consumption
- \* good performance

### Design :

- \* Small
- \* low power
- \* High code density (code stored takes less memory after optimizing)
- \* Interface with slow & low cost memory system
- \* Reduced die size.

ARM-I : 3 stage pipeline, F D E

ARM9 : 5 stages, F D E M W

ARMII : 6 stages F P D E M W

### Design features (RISC) → RISC

Instruction : Reduced set / single cycle / fixed length

Pipeline : Decode in one stage, converts inst in

Reg : General Purpose Regs. microinstruction.

Load & store architecture, why? because memory th.

CISC needs multiple RISC archs!

ARM is not pure-RISC.

### ARM : 2

- \* Variable cycle, multiple registers, load(store)
- \* In-line barrel shifter : a h/w unit to perform shift and some other op in same cycle.
- \* Thumb : 16 bit
- \* Conditional execution.
- \* MOVEQ R<sub>i</sub>, #0 add if zero is ?  
It will move if eq else it just does.  
like in one inst itself checking (EQ) and doing  
else (MOV)
- \* SDP Enhanced ins. Scalable data processing?
- \* No separate I/O ports



$$\text{Throughput} = \frac{\text{no. of ops per sec}}{(n-1+K) \times T}$$

Efficiency : Speedup  
 $\frac{nK}{K-1+n}$  Theoretical max speedup  
 $\frac{K}{n}$  (K stages to exploit  
 Latches : for true  
 states  $V_m$  res.

Max stage delay:  $\tau$ : clk period,  $t_i$ : time delay in stage  $i$ ,  $d_i$ : latch delay. Max stage delay:  $\max \tau_i$

Total time to process  $N$  data sets:  $T_{FL} = N \tau$

For EPL speedup =  $\frac{T_{FL}}{T_{NIFL}} = \frac{1}{(k-1)+N} \cdot \frac{1}{\tau}$

for pipelined:  $S_k = \frac{\tau_i}{T_k} = \frac{NK\tau}{((k-1)+N)\tau}$

$T = \sum$  sum of times  
of all stages

$\tau$  = time for one stage.



Processor Mode: 7 Modes:

- \* User mode
- \* High priority mode (interrupt highest prio)
- \* Low priority mode
- \* Supervised mode
- \* Abort mode
- \* Undefined (trying to exec an undefined instruction)
- \* System (privileged mode).



Save reg while an interrupt comes.

No stack pth, no push/pop op.

Interrupt : move data from CPSR  $\xrightarrow{\text{CPSR}} \text{SPSR}$   
finish disturbance, then move State SPSR  $\xrightarrow{\text{State SPSR}}$  Co.  
No Stack ptr in ARM, (but we can create it)  
only 16/37 reg are visible (Reusable)  $R_0 - R_{12}$

R<sub>13</sub>: to update our own stack ptr, gp

R<sub>14</sub>: link reg.

R<sub>15</sub>: PC and CPSR.

6 Data types. (Signed, unsigned)  $\times (8\text{b}, 16\text{b}, 32\text{b})$   
Can access only lower order bits.  
Only for data ops, not for arithmetic ops.  
Arithmetic : 32 bit, only ?

ARM mode : 32 bit, word accessible last 2 bits 0

Thumb mode : 16 bit, half word accessible always.

ARM mode : Thumb mode:

0000  
0100  
1000  
1100

0000  
0010  
0100  
0110  
1000  
1010  
1100  
1110

CPSR :

IT | Z | C | V | Q |

11111 : sys

11011 : undef

10000 : USR

10001 : FIR

10010 : IRQ?

10011 : super

10111 : abort

IRQ/FIR/T | 1 | 3 | 2 | 1 | 0

Interrupt Reg

Mode

0 - ARM  
1 - Thumb

0 - disable  
1 - enable.

Interrupt : load as mask to SPSR

Make FIR as 1?

Make IRQ as 1? high priority

Unit Reg: do 8000 between add.

27/04/2025 Tuples of ins? in ARs

\* Data Processor  
only registers  
arithmetic

Data flow int carry.

\* Data Transfer  
page and main

\* Control flow  
P updated.

add 10, 11, 12 : 10 = 11 + 12 10 = 11 + 12 C

LSB 10, 11, 12 :  $z_0 = 1_2 - 1_1$

NSC 10, 11, 12 :  $z_1 = 1_2 \oplus 1_1$

add, bits are normal.

bitwise op: and, or, xor, BIC (bitwise AND)

BIC {S3} {cond3} Rd, Rn, operand clear the lower ones  
- To set condition bits of Rn

POP-Pop: MOV 10, #2 Rd  $\leftarrow$  Rn AND operand

MUL 10, 12 = 10 \* not 12

negation

Comparison: CMP 11, 12 uses 11-12

CMP : 11, 12, - (11 < 12 ??) compare negative

1st 11, 12 th and 12

flag - N, D, Z, V, C, N2

Arithm. data:

AND 11, 12, #2

SUB 11, 12, #2

ADD 11, 12, #2 OF

S(ND)?

On last ins, 2nd is copied to R6;

hex

Shift:

ADD 11, 12, 13, LSL #3

$x_1 = x_2 + (x_3 \ll 3)$ ,

$x_1 = x_2 - 2$

POP :



R P



rotate N flag

carries

This flag

is used as

borrow bit

for the result

or -ve number

A Sh

Mathmatic shift right w/ the number as -ve number

31 +ve 0

31 -ve 0

10001 ... 1 0

31 1111 ... 1 0

## Multiplication

MUL R1, R2, R3, . . . R1 = R2 \* R3

MLA R1, R2, R3, R4, . . . R1 = R2 \* R3 + R4

Multiples and Arguments

address relative

ADDL R1, Table: Table's addrs is calc and loaded into R1.

LDR R0, [R1] R0 ← R1 BTR: R0, [R1]

LDR R0[R1, #4] R0 ← R1 + 4 like subtracting

Auto Indexing: R0 ← R1 + 4 then

LDR R0, [R1, #4], R1 = R1 + 4 R2 b

R0 ← R1 + 4 loaded into R0

Post Indexing:

load LDR R0, [R1], #4 first load R0 ← R1, then add R1 ← R1 + 4

LDRB 8 bits

LDRSH 16 bits

LDMIA R1, {R3, R5, R6} Load mem inc after  
R1, {R2 - R5}

$$R_3 = [R_1], R_5 = [R_1 + 4], R_6 = [R_1 + 8]$$

LDMIB before



LDMDB

diff b/w im

ins explain

dis ins

wher ins

wit PC

of code

P. Copy block of 128B write part of code  
wth 8 registers, in one time we can copy 32B  
we need to exec 4 time 4B(32B) reg. 8 \* 4 = 32B

load from, store into mem. Comp if it is the end

8/02/2025 (Lecture 10) New Class:  
 + Unconditional) \* (Conditional) (Branch & Link  
 \* Conditional exec)

Branch & Link & go main to the subroutine  
 else to SPSR while saving the current PC (addr of  
 next ins) on the link reg  
SUB

MOV PC, R14: After finishing subroutine, R14 is  
 link reg, has X.add (next ins) so PC  $\leftarrow$  R14 instead

Conditional Execution:

$$\text{If } (A_1, b = 5) \quad A_2 = A_2 + 5 - A_3.$$

breq  $\leftarrow$  Branch  $\rightarrow$  CMP, A, #5  $\leftarrow$  (Beg loop outside), this comp  
 add A<sub>3</sub>, R<sub>2</sub>, #5  $\leftarrow$  in ARM,  
 sub A<sub>2</sub>, A<sub>2</sub>, A<sub>3</sub>  $\leftarrow$  SUBNE A<sub>2</sub>, A<sub>2</sub>, A<sub>3</sub>  $\leftarrow$  in ARM,  
 outside:

ARM Branches

BAL

BREQ BNF

BPL

PMIS

BCC

BCS

BVC

BVS

BGT

BGE

BIT

BLE

Branch

R

- result true or false

- called set or clear

Overflow

GT, GE, LT, LE

NE, NEQ, EQ

CC, CCZ, CCN

W/ARM: MIPS

1. cmp \$t1, \$t3  
branch neg.  
cmp \$t5, \$t0  
branch neg  
add
2. add \$t1, \$t2, \$t3  
add \$t1, \$t3, \$t4  
add \$t1, \$t4, \$t5  
add \$t1, \$t5, \$t6
3. nested call. subroutine calls another subroutine

ARM only 3 lines?

comp \$t1, \$t3  
comp \$t5, \$t0  
~~Comp eq~~ \$t5/\$t0  
add \$t1, \$t7, \$t10

conv path  
data path  
state table  
C → MIPS  
Pipeline  
Speedup, efficiency  
Cost non-linear

01/02/2025 Types of DAC (Digital to analog converter)

Vector register? DAC, ladder DAC

1. weighted resistor:



$V_o = I_o \cdot R_f$  (Reference voltage),

$$I_o = d_1 + d_2 + \dots + d_n$$

$$I_o = \frac{V_R}{2^1 R} d_1 + \frac{V_R}{2^2 R} d_2 + \frac{V_R}{2^3 R} d_3 + \frac{V_R}{2^4 R} d_4$$

$$I_o = \frac{V_R}{R} \left( \frac{d_1}{2^1} + \frac{d_2}{2^2} + \frac{d_3}{2^3} + \frac{d_4}{2^4} \right)$$

0/V voltage:  $V_o = I_o R_f = R_f \frac{V_R}{R} \left( \frac{d_1}{2^1} + \frac{d_2}{2^2} + \frac{d_3}{2^3} + \frac{d_4}{2^4} \right)$

$$= V_R \left( d_1 \times 2^4 + d_2 \times 2^3 + d_3 \times 2^2 + \dots \right)$$

then  $V_o = V_{FS}$  scale (e.g. 0-8V means  $V_{FS} = 8V$ )

e.g. Take  $d_4 d_3 d_2 d_1$  as 0010

$$\text{then } V_o = V_{FS} \left( 0 + \frac{1}{2^2} + 0 + 0 \right) = \frac{1}{4}$$

$$\text{if } V_{FS} = 8V, \quad V_o = \frac{8}{4} = 2 \quad (\text{which is } 0010_2)$$

problem: keep the number of bits weighted resistor only, were for 8 bit, voltage is beyond that capacitors it won't work.

2) 2R ladder DAC



for binary 100

(4)

1:  $V_f$

0:  $\frac{1}{2} V_f$



A is also connected to GND since that is ok.

$\Rightarrow 2R, R$  in 11th



$$\text{so } V_B = \frac{-V_f + \frac{2}{3}V_f}{\left(2 + \frac{2}{3}\right)} = \frac{-V_f \times \frac{2}{3}}{\frac{8}{3}} = -\frac{V_f}{4} \quad (\text{why } ?)$$



$$V_o = \frac{R_f}{R_1} \times V_{in}$$

$$\text{so } V_o = \frac{2R}{2R} \times \frac{-V_f}{4} = \frac{V_f}{2} = \frac{1}{2} = 4???$$

for more bits (like 8  $\Rightarrow$  4 bit so add one more resistor).

### Analog to Digital

Flash Type:



$V_a$  = analog voltage.

no. of comparators needed is  $2^n - 1$

: these  $\Delta$  things.

Here in  $X_0$ , we connected  $\Delta$  so there is  $8 \Delta$ , but if can be  $DIN$  (thus connected to GND also) so only 7 is required. (no need for 8)

$$2^n - 1 = 8^3 - 1 = 7$$



Flash Type ADC



|    |    |    |
|----|----|----|
| 00 | 01 | 00 |
| 00 | 1x | 01 |
| 01 | xx | 10 |
| 1x | xx | 11 |

No. of comparators:  $2^n - 1$  are required

→ Here we have 8 but 7 also OK we can connect  
this has done like a GND (from GND?)

At o<sup>th</sup> pin is  $\frac{\text{total voltage} \times \text{resistor}}{\text{total resistance}}$

Eg. take  $V_R = 8$  for  $V_o = 1$  the bottommost



if + > - then o/p is 1 else 0

for other op-amps, like the second from bottom,  $V_o = 1$  if  $1 < 1 < 1$

$V_o = 1$  if  $1 < 2 < 2 \dots$  like that so

$\frac{2V_R}{8}$  if will be  $x_2 x_3 x_0$   
for  $(0x - V_R)$ ,  $0 \dots 001$

Now if we take  $V_R = 2$ , it will be 2 if  $2 < 2 < 1$   
and  $V_R = 1$  if  $2 < 2 < 2$

We can see that  $\frac{V_a}{V_d} = \frac{1}{8}$  is 00.011

SAR : (Some other type of ADC).



$$\text{Ex. } n=3 \\ \text{take } V_a = 13$$

whenever  
comp o/p is  
0, register  
becomes  
0 and its next  
(higher) 0 becomes  
1. of course  
1 becomes

| $d_3$ | $d_2$ | $d_1$ | $d_0$ | $V_d$ | comp o/p | $V_a > V_d$           |
|-------|-------|-------|-------|-------|----------|-----------------------|
| 0     | 0     | 0     | 0     | 0     | 1        | $13 > 0$              |
| 1     | 0     | 0     | 0     | 8     | 1        | $13 > 8$              |
| 1     | 1     | 0     | 0     | 12    | 1        | $13 > 12$             |
| 1     | 1     | 1     | 0     | 14    | 0        | $13 > 14$             |
| 1     | 1     | 0     | 1     | 13    | 0        | $13 = 13$ so prev val |

start from msb, by setting each bit

$$\text{Ex. } V_a = 11$$

| $d_3$ | $d_2$ | $d_1$ | $d_0$ | $V_d$ | comp o/p | $V_a > V_d$ |
|-------|-------|-------|-------|-------|----------|-------------|
| 0     | 0     | 0     | 0     | 0     | 1        | $11 > 0$    |
| 1     | 0     | 0     | 0     | 8     | 1        | $11 > 8$    |
| 1     | 1     | 0     | 0     | 12    | 0        | $11 < 12$   |
| 1     | 1     | 0     | 1     | 10    | 1        | $11 > 10$   |
| 1     | 0     | 1     | 1     |       |          |             |

PWM

: modulate the width of a digital signal's pulses



$$= \frac{t_{ON}}{T} \times 100\%$$



Answers with much time it is ON).

$$\frac{1}{T} \int f(t) dt = \text{avg.}$$

$$= t_{ON} V_H + (1 - t_{OFF}) \cdot V_L$$

$V_L = 0$  usually  
so

$$= t_{ON} \cdot V_H$$

power efficient, motor control, speed and torque  
control, adjust LED brightness

11/03/2025  
 Q) If op. op = 5V resp. OP AMP is ideal, off  
 the resistance, 5V i/p have tolerance  $\pm 10\%$ .  
 Spec for the tolerance of DAC is

4 bit DAC is



Q)  $V_R = 10V, R = 10K\Omega$

The current is \_\_\_?



Q) Same as 2, find voltage.



Find  $V_R$ ?

$$V_{Rf} = \frac{1}{2^4} \cdot 10V = 0.625V$$

Q) Design 2-bit ~~to~~ flash ADC using logic gates

### 11/03/2025 Scheduling Algorithms

Part monotonic scheduling algorithms in embedded sys

\* Static scheduling

\* Higher priority to tasks with shorter period

\* Hard-real-time system

Key features:

- \* static priority assignment
- \* pre-emption & schedulability
- \* deterministic feasibility
- \* optimal for fixed priority

Schedulability Condition:

$$\sum_{i=1}^n \frac{C_i}{T_i} \leq n(2^n - 1)$$

Max CPU usage

| when           | task | period (P) | capacity (C) | priority (P) |
|----------------|------|------------|--------------|--------------|
| T <sub>1</sub> | 3    | 20         |              |              |
| T <sub>2</sub> | 2    | 5          |              |              |
| T <sub>3</sub> | 2    | 10         |              |              |

1. LCM(20, 5, 10) = 20  
 priority T<sub>2</sub> > T<sub>3</sub> > T<sub>1</sub>  
 p.no is higher for shorter period



if CPU fails, cannot schedule anymore.



cannot schedule (dead line miss)  
 needs more record, but deadline (D) is given

Advantages: \* simple, easy to implement

- \* predictable timing behavior
- \* optimal for fixed priorities schedules

Disadvantages: \* limited CPU utilization & strictly periodic tasks required \* priority inversion is possible

App: robotics, automation

12/08/2023 Power Aware Systems

$$E = \int P dt$$

static

dynamic

power

quiescent state

$\downarrow I_{LP}$  leakage (short)



switching current

$$10^{10} / 10^9$$

$$E = C_L f V_{DD}^2$$

Supply voltage  
free voltage

Gated clock: disable clk signals

Power gating: disconnect power supply to parts of chip when not needed, saves both power & dynamic usage

2.5V

slow

Standard 4V

Fast

5V

Reduce short

\* Power gating

\* Body biasing

\* Multi Threshold CMOS

① 10<sup>9</sup> cycles @ 50 MHz

= 10<sup>9</sup> with 5V,

= 10<sup>9</sup> x 40 x 10<sup>-9</sup>

= 40 J

Dynamic

\* Red supply freq

voltage

MHz

\* Red capacitance

Clock gating

CT ms

\* DVFS scaling

Vdd

Energy

J

5V

25

10

4V

25

10

0.5V

25

10

dynam. &  
halog. power

usage

reduces

dynamic

consumption

usage

reduces

dynamic

$N_p$ : the number of tasks where  $1 \leq i \leq L$

f: CLK freq for  $S$ ?;  $V_{Vi}$ : supply voltage

$d = d_L$  global deadline,  $SC_j$ : switching  $Q$  and  $C$

$X_{ij}$  = no of CLK cycles task  $j$  to be executed at  $V_i$

ILP form:

$$EDF \leq f V_{dd}^2$$

$$\sum_{i=1}^L \sum_{j=1}^N [SC_j \times X_{ij} \times V_i^2]$$

$$H_j = \sum_{i=1}^L X_{ij} = EC_j$$

$$\sum_{j=1}^N \sum_{i=1}^L X_{ij} \leq \text{deadline}$$

clock frequency at voltage level  $i$ :  $f_i$

reducing energy consumption

2. Static voltage scheduling, dynamic voltage scheduling

AT of all known  
SS & DV, DS & DV  
AT of all not known



18/03/2025 compiler?

20/03/2025 EDF: Earliest deadline first

$T_1(4, 1)$ ,  $T_2(5, 2)$ ,  $T_3(7, 2)$ , sum period = deadline,

$T_1$  at  $t=4$ ,  $T_1$  deadline is 8.

$T_2$  at  $t=5$ ,  $T_2$  deadline is 10.

$T_3$  at  $t=7$ ,  $T_3$  deadline is 14.

$$\frac{1}{4} + \frac{2}{5} + \frac{2}{7} = \frac{35+56+40}{140} < 1$$

Condition:  $\sum_i \frac{C_i}{P_i} \leq 1$  necessary & sufficient



$T_2$ : intermediate  
 $T_2$  preempts  $T_3$ .

(CS: critical  
task & etc)

Priority Inversion  $\Rightarrow T_1 > T_3$  but here  $T_3$  executes  $\rightarrow$  then?



9 intermediate tasks  
higher priority?  
Unbounded priority inversion.

Disable preemption: waiting time will be lesser.



$T_3$  is not doing critical,  
it is soon next job



$P_A(T_1) = 5$



$P_A(T_1) = 10$

hard Realtime  
systems



① Deadlock

Inherits priority  
 $P_A(T_2) = 10$

① ② Priority inversion



Chain Blocking



$T_1$ : Block  
 $T_2$ : execute

$T_1$ : execute  
 $T_2$ : block

w-CR<sub>1</sub>

$T_2$  ex

$T_1$  blocks

$T_1(2,2)$   $T_2(3,3)$   $T_3(1,1)$   $T_4(7,7)$   $T_5(3,4)$   
 $T_6(4,1)$ ,  $(6,2)$ ,  $(8,3)$

### Pipes Priority Inheritance Protocol



### PCP? Priority Ceiling Protocol



$$C(S_i) = C(S_j) = \text{Priority}(f_i)$$

- Each resource is assigned a ceiling priority determined by the highest priority among all processes which may acquire the resource.
- An OS variable denoting highest ceiling of all locked semaphores is maintained.

### Difference b/w PIP and PCP

PIP: Resource is greedy. If the resource is free, it is allocated. Not greedy. Resource may not be alloc. to req even if free.  
 PCP: Resource is non-greedy. If the resource is free, it is not allocated. Resource is freed before another process enters.

Casier to implement

Blocking time =  $\sum$  duration of critical section time in critical section

deadlock chain blocking

Blocking time a larger critical section where low priority tasks

PL1, CSC = current system ceiling  
 1. CSC =  $\max\{\text{ceil}(CR_i)\}$  if  $CR_i$  is currently in use.  
 At system start CSC is initialized to 0.

d. Resource sharing among tasks in PCP is regulated by 2 protocols/rules:

- (a) Resource Request rule
- (b) Resource Release rule

(a) Request rule has 2 clauses:

(i) Resource grant clause, applied when a task requests a resource.

(ii) Inheritance clause: applied when task is made to wait for a resource.

(b) Resource release rule:

check the ceiling for priority based on they release the resources.



PCP

\* prevents deadlocks, chainblocking, unbounded fairness, inversion

| task name | T   | priority |
|-----------|-----|----------|
| A         | 50  | 10       |
| B         | 500 | 9        |

|                       |                         |                       |                         |
|-----------------------|-------------------------|-----------------------|-------------------------|
| task A                | lock(S <sub>1</sub> )   | task B                | lock(S <sub>2</sub> )   |
| lock(S <sub>2</sub> ) | unlock(S <sub>1</sub> ) | lock(S <sub>1</sub> ) | unlock(S <sub>2</sub> ) |

$$\text{ceil}(S_1) = \text{ceil}(S_2) = 10$$

|                         |                         |
|-------------------------|-------------------------|
| unlock(S <sub>1</sub> ) | unlock(S <sub>2</sub> ) |
| unlock(S <sub>2</sub> ) | unlock(S <sub>1</sub> ) |

J requests R in PCP

- \* R busy  $\rightarrow$  J blocked, req denied, unlock(S<sub>1</sub>)
- \* Priority of J  $>$  CSC  $\Rightarrow$  alloc else: unlock(S<sub>2</sub>)
- \* J holds some other res with priority ceil = CSC then req alloc else denied



## Types of Controllers:



P-D Controller

$O/P \propto I/P$   
 $O/P \propto \text{derivative}$

SP

derivative sig, D predicts error trend and gives damping

reduced overshoot

Steady state error is there mechanism

PID : Proportional Integral Derivative



Response time is less  
No overshoot  
No steady state error

Be future  
? past

**CT-D**  
ADC, DAC  
Scheduling  
Power

to accumulates error and takes corrective measures.

| <u>Q</u> | <u>E<sub>MA</sub></u> | P | CT | <u>E<sub>DF</sub></u> | P  | CT |
|----------|-----------------------|---|----|-----------------------|----|----|
| T1       | 2                     | 1 |    | T1                    | 4  |    |
| T2       | 3                     |   | 1  | T2                    | 6  | 3  |
| T3       | 6                     | 2 |    | T3                    | 10 | 4  |

Q tasks have high priority and low, that share a resource. Low pri acquires the resource, H becomes ready. However, before L releases the resource, a medium pri task M becomes ready in front.

L. What is the problem? How to prevent it?

Priority Inheritance or Inversion 8th.

- Q. 3 tasks:  $T_1$ ,  $T_2$ ,  $T_3$ .  $T_1, T_3$  both need to access a shared resource.  $T_3$  acquires the lock,  $T_2$  becomes ready, preempts  $T_3$ . Later  $T_1$  becomes ready, tries to acquire the lock. But it is blocked.  $T_3$  is still holds L. Explain the problem and solve it.

~~11/04/2025~~

Compiler, interpreter, preprocessor, middleware we need diff types of testing.

↳ Interface b/w high level, low level



Middleware <sup>elements</sup>: layers: MDM, ONS, RPCs, DB access, N/w protocols

Type: general purpose, market specific

OSI, JVM

MS for TV

Data link layer

PDP protocol in OSI LLC/NW

→ encapsulate, decapsulate \* synchronous and async

phases of 8th?

Init, dead, establish, auth, N/w layer, term, release

SMIP: MTA, MUA

~~15/04/2025~~   
?  $\rightarrow$  Node

OSI layers

Phys: mech, electrical

DL: Reliable data transfer  $\xrightarrow{\text{transport}}$ : end to end  
segment: connectionless: session: window dialog control

Presentation : data format, Application end-to-end  
System Architecture : User Interface

\* Topology  
\* Routing

Multistage N/W =  
multi-bus

\* Schema of communication



Cross path topology

→ Data-Push Programming

Based on off controller + triggers  
continuously fed data, some action  
controller doesn't receive.  
Eg: heater, submarine



Automobile, serial, high speed, half duplex, high  
end application : short messages.

Control Area Network : defines standards for phys & DLL layer

Transmitter: sends the message

Message : \* fixed format \* diff lengths (limited)

Rowing : identifier → tells what type of info is carried  
type of action that has to be taken, sent to all  
processing elements

Multicast, Priorities, Remote Data Request

(CAN uses static priority) → one node requests connection,  
another node acts as transmitter

Physical layer:

CAN - high must support bit dominance

CAN - low

Nodes operate on differential voltage

BAL Encoding



bit stuffing



SOF Arbitron Cewnd Data (CRC) ACK (EOF)  
↳ Identifier  
↳ Dominance

Data?

Remote frame: 7 bits

Receive bit → ?

CSMA/CD + AMP Collision → Dominance bit 0 → busy  
1 → free

### Error Control

Remote frame with transmitted on bus

→ error detection, error signalling taken care by CAN controller  
error frame, all messages are stopped (halt),  
fault recovery, message retransmitted.

Bit error, message error  
(bit stuffing, sequence is 11111101)

Bit error: error in the data while transmitting?

→ CRC error, Frame error, ACK Error/did not receive any ACK from the bus

### CAN Controller Modes

|                       |                                     |                                        |                                 |
|-----------------------|-------------------------------------|----------------------------------------|---------------------------------|
| Normal error          | Nodes can't transmit msg            | Error active, Error-passive, Bus off   | executes serious issues / reset |
| active flag high prio | dominance field <del>high</del> low | Error at active low priority pair flag | Dominance field high            |

