

# Microarchitecture Vulnerabilities

## Past, Present and Future

Daniel Gruss (Graz University of Technology)  
Anders Fogh (Intel Corporation)

# Introduction

**Daniel Gruss**

Graz University of Technology

**Anders Fogh**

Intel

Daniel and Anders  
do not always agree!!




Past

# Past – earliest days

Side Channels always existed

# Past – earliest days

Side Channels always existed

First scientific observations in 1943

~~SECRET~~

(b) (3)-P.L. 86-36

Approved for Release by NSA on  
09-27-2007, FOIA Case # 51633

## TEMPEST: A Signal Problem

The story of the discovery  
of various compromising radiations  
from communications and Comsec equipment.

impractical. Hydraulic techniques—to replace the electrical—were tried and abandoned, and experiments were made with different types of batteries and motor generators, in attempts to lick the power-line problem. None was very successful.

During this period, the business of discovering new TEMPEST threats, or refining techniques and instrumentation for detecting, recording, and analyzing these signals, progressed more swiftly than the art of suppressing them. Perhaps the attack is more exciting than the defense—something more glamorous about finding a way to read one of these signals than going through the drudgery necessary to suppress that whacking great spike first seen in 1943. At any rate, when they turned over the next rock, they found the acoustic problem under it. Phenomenon No. 5.

### *Acoustics*

We found that most acoustic emanations are difficult to exploit if the microphonic device is outside of the room containing the source equipment; even a piece of paper inserted between, say, an offending keyboard and a pick-up

## Past – earliest days

Side Channels always existed

First scientific observations in 1943

Concept of “covert channels” in 1973

Operating  
Systems

C. Weissman  
Editor

---

# A Note on the Confinement Problem

Butler W. Lampson  
Xerox Palo Alto Research Center

This note explores the problem of confining a program during its execution so that it cannot transmit information to any other program except its caller. A set of examples attempts to stake out the boundaries of the problem. Necessary conditions for a solution are stated and informally justified.

Communications  
of  
the ACM

October 1973  
Volume 16  
Number 10

# Past – earliest days

Side Channels always existed

First scientific observations in 1943

Concept of “covert channels” in 1973

1974-1980: Provably secure operating systems with exceptions for side channels

1985: Orange book. Covert channels with low bandwidth not a problem

1996: Paul Kocher’s seminal work on timing attacks

FIGURE 1: RSAREF Modular Multiplication Times



FIGURE 2: RSAREF Modular Exponentiation Times
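The key dependence behind timing attacks of this kind can be illustrated with a minimal Python sketch. This is not RSAREF code: `modexp_count` is a hypothetical helper that runs textbook left-to-right square-and-multiply and counts modular multiplications as a crude stand-in for running time, assuming every multiplication costs the same.

```python
def modexp_count(base: int, exponent: int, modulus: int) -> tuple[int, int]:
    """Left-to-right square-and-multiply.

    Returns (result, multiplications). The multiplication count stands in
    for running time (assumption: each modular multiply costs the same).
    """
    result = 1
    mults = 0
    for bit in bin(exponent)[2:]:
        result = (result * result) % modulus  # one squaring for every exponent bit
        mults += 1
        if bit == "1":
            result = (result * base) % modulus  # extra multiply only for 1 bits
            mults += 1
    return result, mults


# Two exponents with the same bit length but different Hamming weight.
_, cost_sparse = modexp_count(7, 0b10000001, 1009)
_, cost_dense = modexp_count(7, 0b11111111, 1009)
```

In this model the dense exponent needs more multiplications than the sparse one, so an observer who can time the operation learns something about the exponent bits.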



# Past: cryptographic attacks

1996-2015 Mainly side channels on  
cryptography (threat model!)



# Past: cryptographic attacks

1996-2015 Mainly side channels on  
cryptography (threat model!)

Colin Percival (2005): “Cache Missing  
for fun and profit”



# Past: Moving beyond crypto

ISCA 2014 + BlackHat US 2015:  
Rowhammer



# Past: Moving beyond crypto

ISCA 2014 + BlackHat US 2015:  
**Rowhammer**

USENIX Security 2015:  
**Cache Template Attacks**



## Breaking Kernel Address Space Layout Randomization with Intel TSX

# Past: Moving beyond crypto

ISCA 2014 + BlackHat US 2015:

**Rowhammer**

USENIX Security 2015:

**Cache Template Attacks**

CCS + BlackHat US 2016:

**Breaking KASLR**

Yeongjin Jang, Sangho Lee, and Taesoo Kim

*Georgia Institute of Technology*



## Prefetch Side-Channel Attacks: Bypassing SMAP and Kernel ASLR

Daniel Gruss\*

Clémentine Maurice\*

Anders Fogh†

Moritz Lipp\*

Stefan Mangard\*

\* Graz University of Technology † G DATA Advanced Analytics

# Past: Moving beyond crypto

ISCA 2014 + BlackHat US 2015:  
Rowhammer



USENIX Security 2015:  
Cache Template Attacks

CCS + BlackHat US 2016:  
Breaking KASLR

2017: Many academic works on attacking  
TEEs with side channels

## CacheQuote: Efficiently Recovering Long-term Secrets of SGX EPID via Cache Attacks

Fergus Dall<sup>1</sup>, Gabrielle De Micheli<sup>2</sup>, Thomas Eisenbarth<sup>3,4</sup>, Daniel Genkin<sup>2,5</sup>,  
Nadia Heninger<sup>2</sup>, Ahmad Moghimi<sup>4</sup> and Yuval Yarom<sup>1,6</sup>



## CacheZoom: How SGX Amplifies The Power of Cache Attacks

Ahmad Moghimi  
Worcester Polytechnic Institute  
amoghimi@wpi.edu

Gorka Irazoqui  
Worcester Polytechnic Institute  
girazoki@wpi.edu


## Controlled-Channel Attacks: Deterministic Side Channels for Untrusted Operating Systems

Yuanzhong Xu  
The University of Texas at Austin  
yxu@cs.utexas.edu

Weidong Cui  
Microsoft Research  
wdcui@microsoft.com

Marcus Peinado  
Microsoft Research  
marcuspe@microsoft.com



## COPYCAT: Controlled Instruction-Level Attacks on Enclaves

Daniel Moghimi<sup>1</sup>, Jo Van Bulck<sup>2</sup>, Nadia Heninger<sup>3</sup>, Frank Piessens<sup>2</sup>, and Ber

<sup>1</sup>Worcester Polytechnic Institute, Worcester, MA, USA  
<sup>2</sup>imec-DistriNet, KU Leuven, Leuven, Belgium  
<sup>3</sup>University of California, San Diego, CA, USA

## How Trusted Execution Environments Fuel Risks on Microarchitecture

Session K4: Secure Enclaves

CCS'17, October 30–November 3, 2017, Dallas, TX, USA

## Leaky Cauldron on the Dark Land: Understanding Memory Side-Channel Hazards in SGX

Wenhao Wang<sup>1</sup>, Guoxing Chen<sup>3</sup>, Xiaorui Pan<sup>2</sup>, Yinqian Zhang<sup>3</sup>, XiaoFeng Wang<sup>2</sup>,  
Vincent Bindschaedler<sup>4</sup>, Haixu Tang<sup>2</sup>, Carl A. Gunter<sup>4</sup>

<sup>1</sup>SKLOIS, Institute of Information Engineering, Chinese Academy of Sciences & Indiana University Bloomington

# Past: Moving beyond crypto

ISCA 2014 + BlackHat US 2015:

**Rowhammer**

USENIX Security 2015:

**Cache Template Attacks**

CCS + BlackHat US 2016:

**Breaking KASLR**

2017: Many academic works on **attacking**

**TEEs with side channels**

USENIX + BlackHat US 2018, S&P 2019:

**Spectre & Meltdown**



Phases of a transient execution attack (time →):

1. preface (architectural)
2. trigger instruction (architectural)
3. transient access to secret (transient execution)
4. transmission of secret (transient execution)
5. fixup (architectural)
6. reconstruct (architectural)
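The six phases in the timeline above can be sketched as a toy simulation. This is a model, not an exploit: `ToyCache` and `run_attack` are hypothetical names, the cache has no timing or eviction, and the transient window is simulated by restoring a register checkpoint while leaving the cache state untouched.

```python
CACHE_LINE = 64  # bytes per cache line (assumed, typical for x86)
PAGE = 4096      # one probe slot per byte value, spaced a page apart


class ToyCache:
    """Tracks which line numbers are cached (no eviction, no timing)."""

    def __init__(self) -> None:
        self.lines: set[int] = set()

    def flush(self, addr: int) -> None:
        self.lines.discard(addr // CACHE_LINE)

    def touch(self, addr: int) -> None:
        self.lines.add(addr // CACHE_LINE)

    def is_cached(self, addr: int) -> bool:
        return addr // CACHE_LINE in self.lines


def run_attack(secret_byte: int) -> int:
    cache = ToyCache()
    # 1 preface: flush all 256 probe slots so none is cached
    for value in range(256):
        cache.flush(value * PAGE)
    # 2 trigger instruction: opens the transient window (modelled as a checkpoint)
    registers = {"rax": 0}
    checkpoint = dict(registers)
    # 3 transient access to secret
    registers["rax"] = secret_byte
    # 4 transmission of secret: the secret selects which probe slot gets cached
    cache.touch(registers["rax"] * PAGE)
    # 5 fixup: register state is rolled back, cache state is not
    registers = checkpoint
    # 6 reconstruct: find the cached probe slot (a real attack would time loads)
    hits = [value for value in range(256) if cache.is_cached(value * PAGE)]
    return hits[0]
```

Calling `run_attack(0x41)` returns the byte even though the architectural register was restored in step 5, which is the point of the diagram: the secret survives only in microarchitectural state.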

# Past: Meltdown



|                                                                         |                            |                  |
|-------------------------------------------------------------------------|----------------------------|------------------|
| <window gadget>                                                         | mov rbx, [kerneladdress] ⚡ | <recover via SC> |
| Out-of-Order unit – out of order execution (track speculation & faults) |                            |                  |

# Meltdown: Details

1. OoO triggers load to AGU
2. AGU sends index to L1 & VA to DTLB
3. L1 identifies all cache lines for index
4. DTLB sends PA to L1 and faults to OoO
5. L1 sends the right data to OoO
6. OoO executes dependent instructions
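The load path in these steps can be sketched as a small Python model. Everything here is an assumption for illustration: the cache geometry (64-byte lines, 64 sets), the `TlbEntry` type, and the dictionaries standing in for the DTLB and L1. The point it tries to capture is that the data and the fault travel separately, so dependent instructions can consume the data before the fault takes effect.

```python
from dataclasses import dataclass

LINE_BITS = 6  # 64-byte lines (assumption)
SET_BITS = 6   # 64 sets, e.g. 32 KiB / 8 ways / 64 B (assumption)


@dataclass
class TlbEntry:
    phys_page: int
    user_accessible: bool


def l1_set_index(va: int) -> int:
    # Index bits lie inside the 4 KiB page offset, so the VA is enough
    return (va >> LINE_BITS) & ((1 << SET_BITS) - 1)


def phys_tag(pa: int) -> int:
    return pa >> (LINE_BITS + SET_BITS)


def transient_load(va, user_mode, dtlb, l1):
    """Return (data forwarded to dependent instructions, fault flag)."""
    # Steps 1-2: the AGU hands the index to L1 and the VA to the DTLB
    index = l1_set_index(va)
    entry = dtlb[va >> 12]
    # Step 3: L1 reads out all ways for that index
    ways = l1.get(index, {})
    # Step 4: the DTLB delivers the PA to L1 and the permission fault to OoO
    pa = (entry.phys_page << 12) | (va & 0xFFF)
    fault = user_mode and not entry.user_accessible
    # Step 5: L1 selects the way whose tag matches the PA
    data = ways.get(phys_tag(pa))
    # Step 6: in this model the data is forwarded regardless of the fault
    return data, fault


KERNEL_VA = (0xFFFF8 << 12) | 0x040  # hypothetical kernel address
dtlb = {KERNEL_VA >> 12: TlbEntry(phys_page=0x1234, user_accessible=False)}
l1 = {l1_set_index(KERNEL_VA): {phys_tag((0x1234 << 12) | 0x040): 0x42}}
data, fault = transient_load(KERNEL_VA, user_mode=True, dtlb=dtlb, l1=l1)
```

With these inputs the model returns both the cached byte and a raised fault flag, mirroring steps 4 to 6.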



# The First Meltdown Mitigations



# Meltdown defense in depth (LASS)



# Spectre and LVI

| Methodology        |                    | Leakage  | Injection  |
|--------------------|--------------------|--------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------|
| $\mu$ -Arch Buffer | Prediction history |                                                                                            |                                                                                              |
| Prediction history | PHT                | BranchScope [79], Bluethunder [131]                                                        | Spectre-PHT [174]                                                                            |
|                    | BTB                | SBPA [8], BranchShadow [182]                                                               | Spectre-BTB [174]                                                                            |
|                    | RSB                | Hyper-Channel [46]                                                                         | Spectre-RSB [177, 200]                                                                       |
|                    | STL                | —                                                                                          | Spectre-STL [128]                                                                            |
| Program data       | NULL               | EchoLoad [49]                                                                              | LVI-NULL [311]                                                                               |
|                    | L1D                | Meltdown [193], Foreshadow [310]                                                           | LVI-L1D [311]                                                                                |
|                    | FPU                | LazyFP [291]                                                                               | LVI-FPU [311]                                                                                |
|                    | SB                 | Store-to-Leak [270], Fallout [48]                                                          | LVI-SB [311]                                                                                 |
|                    | LFB/LP             | ZombieLoad [276], RIDL [267]                                                               | LVI-LFB/LP [311]                                                                             |


**Present**

# Present: Trends

| Attack type                         | Activity level | (Point) Mitigation                                     | Notable                                                        |
|-------------------------------------|----------------|--------------------------------------------------------|----------------------------------------------------------------|
| Crypto side channels                |                | Guidance & DOIT                                        | Data dependent features for example data dependent prefetchers |
| Transient execution vulnerabilities |                | Hardware + Software<br>+on/off switches<br>Workarounds | Predictive store forwarding                                    |
| Stale data vulnerabilities          |                | Microcode Patches or<br>SW Mitigation<br>(if possible) | No recent attacks                                              |
| Logical bugs                        |                | Microcode Patches<br>(if possible)                     | Reptar, CacheWarp                                              |
| Physical properties                 |                |                                                        | Hertzbleed, Collide+Power                                      |
| Exploitation methods                |                |                                                        | Spectre & Power                                                |

# Logic Issues

# Reptar - What's supposed to happen

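REP.NZ is a prefix that repeats an operation while RCX is non-zero, decrementing RCX on each iteration; for compare and scan instructions it also stops once the Z-flag is set.

MOVSB copies a single byte from DS:[RSI] to ES:[RDI] and advances both registers (incrementing them when the direction flag is clear); it does not modify the flags.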

REP.NZ MOVSB is thus a simple memcpy.

The REX-prefix (REX.PF) changes the meaning of how explicit operands of an instruction are interpreted. MOVSB doesn't have any explicit operands.

If you use the REX-prefix with REP.NZ MOVSB, the CPU should ignore the prefix entirely.

Opcode input

Parsing input  
~~Dropping the REX-PF~~

Issue uOps from Parsed input
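The intended behaviour can be sketched in Python. Both functions are hypothetical models: `rep_movsb` treats memory as a flat `bytearray` with the direction flag clear, and `decode_movsb` is a toy decoder that understands only this one instruction. The byte values used are the standard x86-64 encodings (`0xF2`/`0xF3` for the repeat prefixes, `0x40` to `0x4F` for REX, `0xA4` for MOVSB).

```python
REP_PREFIXES = {0xF2: "repnz", 0xF3: "rep"}
MOVSB_OPCODE = 0xA4


def rep_movsb(memory: bytearray, rsi: int, rdi: int, rcx: int) -> tuple[int, int, int]:
    """Model a repeated MOVSB: copy rcx bytes from rsi to rdi (direction flag clear)."""
    while rcx != 0:
        memory[rdi] = memory[rsi]
        rsi += 1
        rdi += 1
        rcx -= 1
    return rsi, rdi, rcx


def decode_movsb(encoding: bytes) -> str:
    """Toy decoder for a prefixed MOVSB that drops any REX prefix."""
    if not encoding or encoding[-1] != MOVSB_OPCODE:
        raise ValueError("this toy decoder only handles MOVSB")
    rep = ""
    for byte in encoding[:-1]:
        if byte in REP_PREFIXES:
            rep = REP_PREFIXES[byte]
        elif 0x40 <= byte <= 0x4F:
            continue  # REX: MOVSB has no explicit operands, so the prefix is ignored
        else:
            raise ValueError(f"unexpected prefix byte {byte:#x}")
    return f"{rep} movsb".strip()


memory = bytearray(b"hello" + bytes(5))
final_state = rep_movsb(memory, rsi=0, rdi=5, rcx=5)
```

In this model, `decode_movsb(bytes([0xF3, 0x44, 0xA4]))` and `decode_movsb(bytes([0xF3, 0xA4]))` decode to the same instruction, which is the "should ignore the prefix" behaviour the slide describes.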



# Reptar - The bug

When the REX-prefix is parsed instead of ignored, a single bit is overwritten.

This causes invalid input to be used to generate uOps.

Under certain conditions this leads to a machine check. Careful analysis found that a condition could potentially lead to privilege escalation.

A microcode change that mitigates the issue has been made public.

Opcode input

Parsing input  
Parses the  
REX-PF and  
overwrites a bit

Issue uOps from  
Parsed input



# CacheWarp

Confidential VM (encrypted but basically no data integrity)

**invd** instruction drops modified cache lines without writing them back; the attack uses it to revert a single cache line

## Attack in three steps:



## CacheWarp: Software-based Fault Injection using Selective State Reset

Ruiyi Zhang  
CISPA Helmholtz Center  
for Information Security

Lukas Gerlach  
*CISPA Helmholtz Center  
for Information Security*


Daniel Weber  
*CISPA Helmholtz Center  
for Information Security*



# Zenbleed

Register names are just for the user, CPU uses register file

XMM Register Merge Optimization: merge registers (e.g. zero registers)

also: for zero just set a zero-bit

Zenbleed:

1. misspeculation
2. **vzeroupper** → set zero-bit
3. merge → storage in register file released
4. victim stores data in this register
5. unroll misspeculation
6. architectural access to victim data
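The six steps can be traced in a toy register-file model. This is a sketch under loud assumptions: `ToyRegisterFile`, its free list, and the register names are hypothetical and do not describe AMD's actual design. The model only captures the flaw as listed above: storage is released during misspeculation, and the rollback restores the zero-bit but not the storage.

```python
class ToyRegisterFile:
    def __init__(self, slots: int) -> None:
        self.storage = [0] * slots           # physical register storage
        self.free = list(range(slots))       # free physical slots
        self.mapping: dict[str, int] = {}    # architectural name -> physical slot
        self.zero_bit: set[str] = set()      # names currently represented by the zero-bit

    def write(self, name: str, value: int) -> None:
        slot = self.free.pop(0)
        self.storage[slot] = value
        self.mapping[name] = slot
        self.zero_bit.discard(name)

    def read(self, name: str) -> int:
        if name in self.zero_bit:
            return 0
        return self.storage[self.mapping[name]]

    def speculative_vzeroupper(self, name: str) -> None:
        # Steps 2-3: set the zero-bit and release the storage while still speculative
        self.zero_bit.add(name)
        self.free.insert(0, self.mapping[name])

    def rollback_vzeroupper(self, name: str) -> None:
        # Step 5: the zero-bit is undone, but the released slot is not reclaimed
        self.zero_bit.discard(name)


def zenbleed_demo(secret: int) -> int:
    rf = ToyRegisterFile(slots=4)
    rf.write("attacker.ymm0_hi", 0x1111)
    rf.speculative_vzeroupper("attacker.ymm0_hi")  # steps 1-3 (misspeculated path)
    rf.write("victim.ymm1_hi", secret)             # step 4: victim reuses the slot
    rf.rollback_vzeroupper("attacker.ymm0_hi")     # step 5
    return rf.read("attacker.ymm0_hi")             # step 6: architectural read
```

In this model the attacker's final read returns the victim's value because both architectural names now map to the same physical slot.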



# Exploitation Techniques

# Exploitation techniques - example

GhostRace: Exploiting and Mitigating Speculative Race Conditions - Hany Ragab et al.

Spectre v1 variant that speculatively bypasses synchronization primitives.

Existing methods of mitigating Spectre v1 remain effective.



Quote from the paper's abstract:

*"There's security, and then there's just being ridiculous"* - Linus Torvalds, on Speculative Race Conditions

# Physical Domain in Software

# Software-based Power Analysis

before 2020: mainly fingerprinting

5 letters



6 letters



7 letters



# Software-based Power Analysis

before 2020: mainly fingerprinting

2020: Platypus

full recovery of cryptographic keys



Fig. 13: Core voltage per measured instruction for each key bit offset in the fixed window length implementation of mbed TLS inside an SGX enclave on the Xeon E3-1275 v5. The blue marks represent 1 bits, while the red marks represent 0 bits. Using a threshold (dashed line), they can easily be distinguished.

# Software-based Power Analysis

before 2020: mainly fingerprinting

2020: Platypus

full recovery of cryptographic keys

2023: Hertzbleed

DVFS makes timing a proxy for energy consumption → remote attacks



(a) CIRCL first 20 bits



(b) CIRCL last 20 bits

# Software-based Power Analysis

before 2020: mainly fingerprinting

2020: Platypus  
full recovery of cryptographic keys

2023: Hertzbleed  
DVFS makes timing a proxy for energy consumption → remote attacks

2023: Collide+Power  
Generic Attacks (not just crypto)



(a) **Step 1:** The attacker primes each cache line of the target cache set with the attacker-controlled guess  $\mathcal{G}$ .



(b) **Step 2:** The victim accesses the secret  $\mathcal{V}$  and forces a cache line to change from  $\mathcal{G}$  to  $\mathcal{V}$ .



(c) **Step 3:** The energy consumption during this change is proportional to the number of bit changes between  $\mathcal{G}$  and  $\mathcal{V}$ .
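The relationship in Step 3 can be written down as a small noise-free model. The helper names are hypothetical and the "energy" is just the Hamming distance with unit cost per flipped bit; a real attack measures a noisy power signal over many repetitions and does not get an exact oracle like this.

```python
from typing import Callable


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def modelled_energy(guess: int, secret: int) -> int:
    """Step 3 in idealised form: energy proportional to bits flipped from G to V."""
    return hamming_distance(guess, secret)


def recover_byte(energy_of_guess: Callable[[int], int]) -> int:
    """Pick the guess whose transition to the secret costs the least energy."""
    return min(range(256), key=energy_of_guess)


secret_byte = 0xA7  # stands in for the victim value V
recovered = recover_byte(lambda guess: modelled_energy(guess, secret_byte))
```

Because the distance is zero only when the guess equals the secret, the minimum of the modelled energy identifies the secret byte in this idealised setting.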

# Software-based Fault Attacks

since 2015: Rowhammer  
still not solved!

## ZENHAMMER: Rowhammer Attacks on AMD Zen-based Platforms

Patrick Jattke<sup>†</sup> Max Wipfli<sup>†</sup> Flavien Solt Michele Marazzi Matej Bölcskei Kaveh Razavi  
ETH Zurich

**Table 10.** Analysis of the bit flip exploitability found during the sweep over 256 MiB on AMD Zen 2, Zen 3, and Intel *Coffee Lake*. For each attack, we indicate the number of exploitable bit flips (#Ex.) and average time to find an exploitable bit flip (Time). We mark DIMMs with a single exploitable bit flip by (\*). We omit DIMMs without any exploitable bit flips.

# Software-based Fault Attacks

since 2015: Rowhammer  
still not solved!

2017: CLKSCREW  
overclock and attack Arm TrustZone



# Software-based Fault Attacks

since 2015: Rowhammer  
still not solved!

2017: CLKSCREW  
overclock and attack Arm TrustZone

2020: Plundervolt (VoltJockey,  
V0ltpwn, VoltPillager)  
undervolt and attack Intel SGX



```
Summary
-----
Iterations:      10000000
Start Voltage:  -252
End Voltage:    0
Stop after x drops: 10
Voltage steps:   1
Threads:        4
Operand1:        0x00000000deadbeef
Operand2:        0x1122334455667788
Operand1 is:     fixed value
Operand2 is:     fixed value
```

# Mitigation efforts

## Limitations of mitigations

Physical hardware cannot be changed in the field



# PATCHING CPUS JUST NEEDS A STEADY HAND

## Limitations of mitigations

Physical hardware cannot be changed in the field



imgflip.com

# Limitations of mitigations

Physical hardware cannot be changed in the field

Vendors build in “Survivability features”

**Microcode** is the most commonly used tool for mitigations.

Other firmware is also used



# Limitations of mitigations

Physical hardware cannot be changed in the field

Vendors build in “Survivability features”

**Microcode** is the most commonly used tool for mitigations.

**Other firmware** is also used

“Chicken bits” to disable / change behavior



## Limitations of mitigations

Physical hardware cannot be changed in the field

Vendors build in “Survivability features”

**Microcode** is the most commonly used tool for mitigations.

**Other firmware** is also used

“Chicken bits” to disable / change behavior

Some issues are best mitigated in software

## Kernel page-table isolation



# Limitations of mitigations

Physical hardware cannot be changed in the field

Vendors build in “Survivability features”

**Microcode** is the most commonly used tool for mitigations.

**Other firmware** is also used

“Chicken bits” to disable / change behavior

Some issues are best mitigated in software

Mitigations are **not always possible/reasonable** and almost always **difficult** and **time-consuming** to engineer

# Prevention Pre-silicon

Prevention starts before the product exists: pre-silicon

Pre-silicon is slow and cumbersome as the chips are emulated or simulated.

This makes security validation & research significantly **different** from **software** validation

|    |                              |                                                                                                                                                                   |
|----|------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 01 | Architecture reviews         | <ul style="list-style-type: none"><li>• Gives great ROI</li><li>• There are formal and informal reviews of the architecture</li></ul>                             |
| 02 | Taint tracking               | <ul style="list-style-type: none"><li>• Taint tracking has proven useful for some issues</li><li>• Techniques such as CellIFT used in production</li></ul>        |
| 03 | Validation                   | <ul style="list-style-type: none"><li>• Security properties to standard validation</li><li>• Finds bugs during development</li></ul>                              |
| 04 | Formal validation            | <ul style="list-style-type: none"><li>• Formal works well with hardware IP</li><li>• Formal definition of security properties can be done, but not easy</li></ul> |
| 05 | Defense in depth & hardening | <ul style="list-style-type: none"><li>• Bug analysis should lead to lessons learned</li></ul>                                                                     |

# Post-silicon

Prevention in silicon happens before the product ships: from A0 to shipping systems.

Some issues are best found in post-silicon.

Post-silicon issues are particularly difficult.

Learning from issues on last generation hardware is critically important.

|    |                  |                                                                                                                                                                                                         |
|----|------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 01 | Manual research  | <ul style="list-style-type: none"><li>• Manual research is effective</li><li>• Enabled by expertise, documentation, access to devs, debug, etc.</li><li>• Early silicon helps prevent escapes</li></ul> |
| 02 | Variant analysis | <ul style="list-style-type: none"><li>• Variant analysis on every issue</li><li>• Occasionally finds issues, but lots of learning for systematic efforts</li></ul>                                      |
| 03 | Validation       | <ul style="list-style-type: none"><li>• Especially useful on early silicon</li><li>• Regression issues</li><li>• Issues not easily found in pre-si</li></ul>                                            |
| 04 | Fuzzing          | <ul style="list-style-type: none"><li>• Problematic: Large state space, slow with good feedback</li><li>• There are exceptions</li></ul>                                                                |




# Future

# Future of uArch security is future of uArch

Silicon performance is the main underlying driver for growth in the compute ecosystem

Performance comes from 3 sources

- New process technology
- uArch improvements
- Adaptation to changed workloads

uArch improvements & Changed workloads will lead to new security challenges



# uArch security future

## Offense

New kinds of prediction & data-dependent behaviors (memory latency!). Memory is an order of magnitude slower than compute. Some examples:

- New kinds of caches and bigger caches
- Workload-specific prefetchers
- Different kinds of value prediction
- Cache & memory compression
- Growth in reorder buffer sizes
- New exploitation techniques

## Defense

- Increased maturity
  - Better tooling
  - More defense in depth
- New microarchitecture security features
- More configurability of security
  - Ex. PSF switch on AMD
- Improved support for software influence
  - Ex. Local configuration switches

# New kinds of compute

more heterogeneous - but all have uArch:

- GPU (new use cases)
  - Remote accessible
  - Increased complexity and new workloads
  - Example: “LeftoverLocals” by Trail of Bits
- Neural Processing Units
  - New model of compute
  - New threats: Integrity of models
  - Attack vector against system
- AI training accelerators in the cloud
  - Soon: shared resources + multi tenant
- **More generally:** More kinds of compute, more accelerators



# Defensive side of things

Huge gap between academia and industry:

## Academia

- provable Rowhammer mitigations available
- provably secure caches available

## Industry

- probabilistic Rowhammer mitigations
- secure caches not adopted (but non-inclusive LLCs)



# uArch in uArch

Embedded processors everywhere --  
already with speculation:

Speculation vs confidentiality?

- Threat models rarely contain arbitrary execution  
→ constrains attackers
- Embedded processors often provide low-level access → new and different kinds of assets



# Take Aways

Side channels are **here to stay**

- Side channels **can be managed**

more aspects of microarchitecture and different kinds of issues

- Hard work for both offensive research and defense
- Defense is maturing

Microarchitecture is a **growth area**, so is microarchitecture security

Microarchitecture matters, so does microarchitecture security
