

## Assignment 2.

- Q1 Why associative memories are called a contents addressable memories?
- i) A memory unit accessed by content is called as associative memory or Content Addressable Memory (CAM).
  - ii) Associative memory is accessed simultaneously & in parallel on the basis of data content rather than by special address of or location.
  - iii) This memory is capable of finding an empty unused location to store the words.
  - iv) When a word is to be read from associative memory, the context of word is specified or part of the word is specified.
  - v) When word is written in associative memory, no address is given.
  - vi) The memory locates all words which match the specified content & marks them for reading.
  - vii) Because of its organization, the associative memory is uniquely suited to do parallel searches by data association

It is widely used in database management system.

Q2 Associative memory is expensive than RAM.

(Logic Circuit)

(Neural Network)

Q3 It is used where search time is very critical.

a.2. Draw the architecture of associative memory array, what is  $Rij$ ? How associative memory works?

*Hardware organization of associative memory*



Associative memory of  $m$  words,  $n$  cells per words:



**Argument Register:** It contains words to be searched.

(A) It has ' $n$ ' bits (one for each bit of word)

**Key register (K):** It provides mask for choosing a particular key in argument word.

- It specifies which part of the argument word needs to be compared with words in memory.
- If all bits in key register are 1's, the entire word should be compared otherwise, only the bits having 1's in their corresponding position are compared.

**Associative Memory:** It contains the word that are to be compared with the argument word in parallel.

- It consists of ' $m$ ' words with ' $n$ ' bits per word.

**Match logic (M):** It has ' $m$ ' bits, one bit corresponds to each word in the memory array.

After the matching process, the bits corresponding to matching words in match register are set to 1.

Reading is accomplished by sequential access in memory for those words whose match bits are set (or 1).



|        |          |        |
|--------|----------|--------|
| A      | 101      | 111100 |
| K      | 111      | 000000 |
|        |          | marked |
|        | unmarked |        |
| word 1 | 100      | 111100 |
| word 2 | 101      | 000001 |
| word 3 | 0101     | 111100 |

|   |
|---|
| M |
| 0 |
| 1 |
| 1 |

0 → no match  
1 → match

2nd diagram

Associative memory of  $m$  words,  $n$  cells per word.  
The cells in array is represented by  $c_{ij}$ : a cell for bit ' $j$ '  
where,  
 $i$  = word number  
 $j$  = bit position in word

Internal organisation of cell  $c_{ij}$  ( $b_{ij}$  might be)



It consists of flip-flop storage element  $F_{ij}$  & circuit for sending, writing & matching cell.

The input bit is transferred into storage cell during write operation.

The bit stored is read out during read operation

The Match logic compares the content of the storage cell with the corresponding unmasked bit of the argument to sets the bit in  $M_1$  to 1.

Q. 3. Draw and explain scalable coherent multiprocessor model with distributed shared memory, why it is called DSM architectures?

*Ans → parallel processing in memory / parallel computer architecture*  
Primary Memory architectures of parallel Computer are:

1) Shared Memory Architecture

- Shared common memory
- Used in multiprocessors

2) Distributed Memory Architecture

*Unshared distributed memory*

- Used in Multicomputers

3) Hybrid Memory Architecture

Combination of both shared & distributed

Scalable Coherent multiprocessor model with DSM:

Scalable multiprocessors -

requirement → Must scale the local memory bandwidth linearly

- Must scale the global interprocessor communication bandwidth.

• Scaling memory bandwidth cost effectively requires separate, distributed memories.

### (3) Architectural Alternatives

- 1) Distributed, private memories
- 2) Single Address space implemented with physically distributed memories.  
It requires →
  - 1) Attention to locality of Access
  - 2) Scalable interconnection technology

Why single Address Space -

- ① Easier to program / build compiler
  - Allows incremental program development method.
- ② Can efficiently simulate other models (with hardware help)
- ③ Performance advantages
  - communication
  - Easier to exploit caching
- ④ Reduce latency



P → processor

C → cache

M → Memory

I/O → Input / output

1-D



scalable interconnection  
Network

### DASH Architecture

Directory Architecture for Shared Memory

DASH is build around the concept on distributed directory

Nodes connected by high BW network  
distributed memory with single address space



## DASH Memory Hierarchy



DASH combines -

scalability advantages of message-passing  
programming advantages of single address space

Q.4. Explain the concept of Multithreading in parallel architectures.

Multithreading demands that the processor be designed to handle multiple contexts simultaneously on a context switching basis.

Architectural Environment:

It consists of network of processors and memory nodes.

Latency (L): This is communication latency on a remote memory access. It includes network delays, cache miss penalty and delays.

The number of threads (N). This includes no. of threads that can be interleaved in each processor.

20. The context switching overhead (C): This refers to cycles lost in performing context switching in processor.

25. The interval between switches (R): This refers to cycles between switches triggered by remote reference.



Architecture environment

$$P = 1/R$$

Initial scheduling overhead



Multithreaded computation model

Context switching policies

1. switch on cache miss
2. switch on every load
3. switch on every instruction
4. switch on block of instruction.

## Assignment No. 4

1. What are dataflow architectures?

Data Driven mechanism (used by data flow computers)  
(Data Driven Computers)

In data flow computers, the execution of any instruction is driven by data availability instead of being guided by program counter.

These machines are eager for data to be present. If the data is present then the instruction is executed. These computers are also known as data driven computers or Eager machines.

In a data driven program, the instruction are not ordered.

There is no shared memory. Thus data is stored inside instruction.

The result of data tokens are passed directly between the instruction.

Actually a copy of data is passed to instructions. Once data is consumed by instruction it will no longer available for reuse by other instructions.

In a data flow machine, a program consists of data flow nodes i.e. a program is represented using directed acyclic graph (nodes & edges).

- Instruction is represented by a node of the

data dependency relationship is represented by the edge between connected nodes.



A data flow node fires (fetched & executed) when all its inputs are ready i.e. when all inputs have tokens.

Bg. if  $a=5$   $b=6$



### Data flow features -

- No need for shared memory
- No program counter
- No control sequencer

Special mechanisms are required to

- detect data availability
- match data tokens
- enable chain reaction of asynchronous instruction execution.

Advantage -

High potential for parallelism and throughput  
Freedom from side effects.

Disadvantage -

High control overhead

Waste time for unneeded arguments.

Difficulty in manipulating data structures.

Q.2. Compare between data flow computing and traditional computing.

Control flow machines / Traditional computing used shared memory for instructions and data.

Since variables are updated by many instructions, there may be side effects on other instructions. These side effects frequently prevent parallel processing. Single processor systems are inherently sequential.

Instructions in dataflow machines are unordered and can be executed as soon as their operands are available; data is held in the instructions themselves. Data tokens are passed from an instruction to its dependents to trigger execution.

Q.3. Draw and explain static & dynamic data flow architectures.



(A static data flow computer organisation)



A dynamic computer organization.

Static Data flow computers :

- i) Memory section consist of instruction cells which hold instructions & their operands.
- ii) Processing section consist of processing unit that perform functional operations of data tokens.
- iii) Arbitration network delivers operation packets from the memory section to the processing section.
- iv) Control network delivers a control token from the processing section to the memory section.
- v) Distribution network delivers data tokens from the processing section to the memory section.

Dynamic Data flow computers :

Q.4. What are dataflow operators? State data flow language properties.

Operators can be - (Each operator denoted by

- arithmetic → (i) Binary (ii) Unary

- logical

- relational

- equality

- bitwise

- reduction

- shift

- concatenation

- conditional

Arithmetic =

(i) Binary - Multiply (\*)

      (0, +) divide (/)

      add (+)

      subtract (-)

      power (\*\*)

      modulus (%)

→ Truncate fractional part

(ii) Unary - (+) & (-)

They are used to specify the positive or negative sign of the operand.

Unary + or - operators have higher precedence than binary + or - operators

- + // negative +

2) logical - (Use of parenthesis)

logical and ( $\wedge$ )

logical or ( $\vee$ )

logical not ( $\neg$ ) → unary - single number

$$\text{Ex:- } A = 3 \quad B = 0$$



3) Relational -  $>$ ,  $<$ ,  $\geq$ ,  $\leq$

4) Equality .

logical equality ( $= =$ )

logical inequality ( $\neq$ )

case equality ( $= ==$ )

case inequality ( $\neq ==$ )

5) Bitwise operators

negation  $\sim$

and  $\wedge$

or  $\vee$

xor  $\oplus$

xnor  $\sim\wedge$ ,  $\sim\vee$

6) Reduction operators

and  $f$

nand  $\sim f$

or  $t$

nor  $\sim t$

xor  $t$

xnor  $\sim t$ ,  $\sim\sim t$

7) Shift operators

right shift  $>>$

left shift  $<<$

arithmetic right shift  $>>>$

arithmetic left shift  $. <<.$

- Properties of in data flow language
- 1) Freedom from side effects based on functional programming.
  - 2) Locality of effect without far-reaching data dependencies.
  - 3) Equivalence of instruction-sequencing constraints with data dependencies.
  - 4) Satisfying single-assignment rule with aliasing.
  - 5) Unfolding of iterative computations into parallelism.
  - 6) Lack of "history sensitivity" in procedure calls.