



Jesień Linuksowa

# Introduction to Modern Parallel Computers

Julita Inca Chiroque



Gwarek, Poland 2018

# 50 Years of Moore's law

[www.explainthatstuff.com](http://www.explainthatstuff.com)



Gordon Moore authored the article “Cramming more components onto integrated circuits”, Electronics Magazine, 19 April 1965

The number of transistors in a microprocessor doubles roughly every two years.

- It has set the schedule of the world's leading microprocessor manufacturers.
- Loosely stated, Moore's Law predicts that the power and speed of computers will double every 24 months.



## The Good Old Days



From Hennessy and Patterson, *Computer Architecture: A Quantitative Approach*, 4th edition, Sept. 15, 2006

The performance increases with the number of transistors.



What about programmers?

- OpenMP Fortran was released in October 1997.
- After 1997, the Austin Group developed the POSIX revisions.
- On June 16, 2008, the Khronos Compute Working Group was formed with representatives from CPU, GPU, embedded-processor, and software companies.

# Computer Architecture and the Power Wall



Source: E. Grochowski of Intel

While performance goes up, power also grows, almost quadratically.

$C$ = Capacitance

It measures the ability of a circuit to store energy.

$C = q/V$, $q = CV$

$q$ is the electric charge added to a conductor to raise its voltage by $V$.

$W = V * q$

Work is done by moving the charge $q$ across a potential difference $V$.

$W = C * V^2$

Power is work over time: the work per cycle times the number of cycles per second, $f$ (how many times in a second we oscillate the circuit).

Power = $W * f$

Power = $CV^2 * f$



Capacitance = C  
Voltage = V  
Frequency = f  
Power = $CV^2f$

Chandrakasan, A.P.; Potkonjak, M.; Mehra, R.; Rabaey, J.; Brodersen, R.W., "Optimizing power using transformations," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 14, no. 1, pp. 12-31, Jan 1995

Source:  
Vishwani Agrawal

A single-core architecture, which gives the baseline Power



$$\begin{aligned}
\text{Capacitance} &= 2.2C \\
\text{Voltage} &= 0.6V \\
\text{Frequency} &= 0.5f \\
\text{Power} &= 0.396CV^2f
\end{aligned}$$

Chandrakasan, A.P.; Potkonjak, M.; Mehra, R.; Rabaey, J.; Brodersen, R.W., "Optimizing power using transformations," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 14, no. 1, pp. 12-31, Jan 1995

Source:  
Vishwani Agrawal

A two-core architecture on the same chip  
that uses about 40% of the Power (about 60% less)
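The scaling factors quoted above can be checked with a few lines of arithmetic. This is a minimal sketch: the function name `dynamic_power` and the normalised baseline (C = V = f = 1) are assumptions made for illustration, not part of the cited source.

```python
def dynamic_power(capacitance, voltage, frequency):
    """Dynamic power P = C * V^2 * f."""
    return capacitance * voltage ** 2 * frequency


# Normalised single-core baseline (assumed): C = V = f = 1.
single_core = dynamic_power(1.0, 1.0, 1.0)

# Two-core figures from the slide: 2.2C, 0.6V, 0.5f.
dual_core = dynamic_power(2.2, 0.6, 0.5)

ratio = dual_core / single_core
print(f"relative power: {ratio:.3f}")      # 0.396
print(f"power saved:    {1 - ratio:.1%}")  # 60.4%
```

The voltage term dominates because it enters squared: lowering V to 0.6 alone cuts power to 36%, which more than pays for the extra capacitance of the second core.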

# Parallel computers

Multiple cores do the same amount of work  
Lower frequency  
Saving power





This graph is borrowed from Wikipedia ©Lucas wilkins

Floating-point operations per second (FLOPS) is a measure of computer performance used in scientific computing



It is not just  
about hardware,  
it is also about  
software

# Gene Amdahl, 1967

“The performance improvement to be gained by parallelization is limited by the proportion of the code which is serial”



# Amdahl's Law

$$\text{Speedup}(N) = \frac{1}{(1 - P) + \frac{P}{N}}$$

Serial part of the job = $1 - P$, where $P$ is the parallel part as a fraction of the whole job (1 = 100%)

The parallel part is divided among $N$ workers
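A minimal sketch of the formula above. The function name `amdahl_speedup` and the example parallel fraction P = 0.95 are assumptions chosen for illustration.

```python
def amdahl_speedup(p, n):
    """Speedup(N) = 1 / ((1 - P) + P / N) for parallel fraction p and n workers."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a fraction between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1.0 / ((1.0 - p) + p / n)


# Assumed example: 95% of the job can run in parallel.
for workers in (1, 2, 8, 64, 1024):
    print(f"N = {workers:5d}  speedup = {amdahl_speedup(0.95, workers):.2f}")
```

However many workers are added, the speedup stays below 1 / (1 - P), which is 20 for P = 0.95. The serial 5% ends up dominating the run time.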

# Gustafson's Law 1988

Need larger problems for larger numbers of CPUs

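$$\text{Scaled Speedup} = N + (1 - N) s'$$

where $N$ is the number of CPUs and $s'$ is the serial fraction of the run time measured on the parallel system.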
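A minimal sketch of the scaled-speedup formula. The function name `gustafson_speedup` and the example serial fraction of 0.05 are assumptions for illustration.

```python
def gustafson_speedup(n, serial_fraction):
    """Scaled speedup = N + (1 - N) * s', with s' measured on the parallel system."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    return n + (1 - n) * serial_fraction


# Assumed example: 5% of the parallel run time is serial.
for cpus in (1, 8, 64, 1024):
    print(f"N = {cpus:5d}  scaled speedup = {gustafson_speedup(cpus, 0.05):.2f}")
```

Because the problem size grows with the number of CPUs, the scaled speedup keeps rising almost linearly with N instead of saturating.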



# Concurrency vs Parallelism



Concurrent, non-parallel execution



Concurrent, parallel execution

# Parallel Computing Architecture



Shared Memory computer

# Parallel Computing Architecture



Distributed memory computer

100% Linux

<https://www.top500.org>



# Four principal technologies

- Processors
- Memory
- Interconnect
- Storage

# Processors

## Functionalities

- Execute instructions
- Load and store data
- Decide next instruction

## Characteristics

- Clock speed (2-3GHz)
- Peak floating point capability
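Peak floating-point capability is commonly estimated as cores times clock rate times floating-point operations per cycle. The sketch below uses that estimate. The function name `peak_flops` and all three input values are hypothetical and do not describe any specific processor.

```python
def peak_flops(cores, clock_hz, flops_per_cycle):
    """Theoretical peak: cores * clock rate * floating-point ops per cycle per core."""
    return cores * clock_hz * flops_per_cycle


# Hypothetical chip: 16 cores at 2.5 GHz, 16 FLOPs per cycle per core.
peak = peak_flops(cores=16, clock_hz=2.5e9, flops_per_cycle=16)
print(f"{peak / 1e9:.0f} GFLOPS")  # 640 GFLOPS
```

Real applications usually reach only a fraction of this figure, since it ignores memory and interconnect limits.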


## Innovations in modern computers

- Clock speed (2-3GHz)
- Integer and floating-point calculations can be done
- Pipelining as an implementation technique for faster CPUs
- The distinction between RISC and CISC is now very blurred
- Adding accelerators alongside CPUs
- The use of FPGAs, which can save energy
- Simultaneous multithreading (SMT), e.g. Hyper-Threading

# Epiphany-V: A 1024-core 64-bit RISC processor



<http://www.adapteva.com>

# MIT's Swarm chip architecture boosts multi-core CPUs



<https://newatlas.com/mit-swarm-parallel-processing/43942/#gallery>

# Memory

## DRAM

Dynamic Random  
Access Memory

Main memory

One transistor per bit

Capacity grows ~60% per year,  
speed only ~7% per year
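Compounding the two growth rates above shows how quickly the gap widens. This is a rough sketch: the function name `compound_growth` and the 10-year horizon are assumptions for illustration.

```python
def compound_growth(rate_per_year, years):
    """Total growth factor after compounding a yearly rate."""
    return (1.0 + rate_per_year) ** years


years = 10  # assumed horizon
capacity_gain = compound_growth(0.60, years)  # ~60% per year
speed_gain = compound_growth(0.07, years)     # ~7% per year

print(f"capacity x{capacity_gain:.0f}, speed x{speed_gain:.1f}")
# capacity x110, speed x2.0
```

Over a decade, capacity grows by about two orders of magnitude while speed only about doubles, which is why caches and bandwidth tricks matter.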

## SRAM

Static Random  
Access Memory

Used for caches

4-6 transistors per bit

~10x faster and more expensive

... refreshing

then....

What are the innovations in memory?




Bandwidth can be improved with:

- wide memory path
- interleaving
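Interleaving can be sketched as a mapping from word addresses to memory banks. The example below assumes low-order interleaving with 4 banks. The function name `interleaved_location` and the bank count are illustrative assumptions, not taken from a specific memory system.

```python
def interleaved_location(word_address, n_banks=4):
    """Map a word address to (bank, row) using low-order interleaving."""
    bank = word_address % n_banks   # consecutive addresses rotate through banks
    row = word_address // n_banks   # position inside the chosen bank
    return bank, row


banks_hit = [interleaved_location(address)[0] for address in range(8)]
print(banks_hit)  # [0, 1, 2, 3, 0, 1, 2, 3]
```

A sequential access stream touches every bank in turn, so the banks can work in parallel instead of one bank serving every request.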

DRAM chips

- HBM on Intel KNL
- Nvidia Pascal GPUs

Virtual Memory

- Allows multiple processes to share physical memory

# Interconnect

There are many different ways of moving bits of information around:

- Voltages applied to wires
- Optical pulses/waves traveling along fibre-optic cable.
- Electromagnetic waves traveling along an electrical transmission line.
- Radio waves propagating in air.


These are used in many different types of interconnect:  
USB, InfiniBand, SATA, PCI, HDMI.

The underlying SerDes (Serializer/Deserializer)  
technology is essentially the same.

Recent advances in silicon-photonics allow all  
necessary components for optical networking to  
be built on-chip.

Currently HPC networks are all packet-switched  
networks.

Bisection


## Bisection brought Topologies

2D Mesh

4D Cube

2D torus

Multistage Network

Fat tree

Recursive Networks (Benes network)

Dragonfly
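Bisection width (the number of links cut when the network is split into two equal halves) can be compared across three of the topologies above using standard textbook formulas. This is a sketch under stated assumptions: the function names are hypothetical, the mesh and torus are taken to be square with an even side k of at least 4, and "4D Cube" is read as a 4-dimensional hypercube.

```python
def _check_side(k):
    """Restrict to even k >= 4 so the closed-form results below apply."""
    if k < 4 or k % 2 != 0:
        raise ValueError("k must be an even integer >= 4")


def bisection_width_mesh_2d(k):
    """k x k mesh: a cut between the two middle columns severs k links."""
    _check_side(k)
    return k


def bisection_width_torus_2d(k):
    """k x k torus: wrap-around links double the mesh value."""
    _check_side(k)
    return 2 * k


def bisection_width_hypercube(d):
    """d-dimensional hypercube (2**d nodes): one cut link per node pair."""
    if d < 1:
        raise ValueError("d must be at least 1")
    return 2 ** (d - 1)


print("4x4 2D mesh: ", bisection_width_mesh_2d(4))    # 4
print("4x4 2D torus:", bisection_width_torus_2d(4))   # 8
print("4D hypercube:", bisection_width_hypercube(4))  # 8
```

All three examples have 16 nodes, so the numbers are directly comparable: the torus and the hypercube offer twice the bisection width of the plain mesh.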

Error connection

# Storage

Workloads on the way to exascale tend to be I/O intensive,  
so the storage system faces heavy I/O traffic.

Flash technology has developed rapidly,  
notably the flash-based Solid State Drive (SSD).

DSFS uses a distributed file system based on SSDs and  
InfiniBand RDMA (Remote Direct Memory Access).

SDS targets low-latency workloads by leveraging  
server-side NVMe-based flash storage.

Capacity entry point for Lustre storage systems.

# References

<https://www.epcc.ed.ac.uk/msc>

<https://slideplayer.com/slide/3351632/>





THANK  
YOU

@yulwitter