

# **Relational Memory:**

## Native In-Memory Accesses on Rows and Columns

Ju Hyoung Mun



**Red Hat**

# Relational Databases are everywhere



ORACLE®  
D A T A B A S E



# Data Layouts

**row-stores**



**column-stores**



# Data Layouts

**row-stores**



**column-stores**



**Transactional**

# Data Layouts

row-stores



column-stores



Analytical

# Adaptive layout



E.g., H2O (ACM SIGMOD, 2014), HyPer (IEEE ICDE , 2011), Peloton (ACM SIGMOD, 2016), OctopusDB (CIDR, 2011)

## What are the disadvantages of the adaptive layout?



E.g., H2O (ACM SIGMOD, 2014), HyPer (IEEE ICDE , 2011), Peloton (ACM SIGMOD, 2016), OctopusDB (CIDR, 2011)

# Adaptive layout



E.g., H2O (ACM SIGMOD, 2014), HyPer (IEEE ICDE , 2011), Peloton (ACM SIGMOD, 2016), OctopusDB (CIDR, 2011)

# Adaptive layout



What if there is a shift over columns?

E.g., H2O (ACM SIGMOD, 2014), HyPer (IEEE ICDE , 2011), Peloton (ACM SIGMOD, 2016), OctopusDB (CIDR, 2011)

How can we access  
**only the desired columns**  
*without storing or maintaining*  
multiple copies of data?



a novel hardware design  
for data transformation  
**Relational Memory**

# What is Hardware?





## Application-Specific Integrated Circuits (ASICs)

# Advantages of Hardware Accelerators

## Advantages

- Speedup
- Reduced power consumption
- Lower latency
- Increased parallelism and bandwidth
- Better utilization of area and functional components available on an integrated circuit

# Advantages of Hardware Accelerators

## Advantages

- Speedup
- Reduced power consumption
- Lower latency
- Increased parallelism and bandwidth
- Better utilization of area and functional components available on an integrated circuit

## Disadvantages

- Lower flexibility
- Higher costs of functional verification and times to market

# ASICs



# Programmable Logic

Field Programmable Gate Arrays (FPGAs)



# Architecture of Programmable Logic



# PS-PL Platforms



**AMD**  
**XILINX**  
UltraScale+ 

intel.  
**AGILEX™**



# Relational Memory Engine



# Relational Memory Engine



*Q1: how to build?*

*Q2: how to use?*

# Relational Memory Engine



# *ephemeral variable*

a simple, lightweight programming abstraction  
to use Relational Memory

leads to normal memory accesses

struct row\_table[ ];

[[*ephemeral*]] struct col\_group cg[ ];

*fake* address that CPU thinks it exists  
intercepted by RME











*Q1: how to build?*

*Q2: how to use?*

# Relational Memory Engine



# Relational Memory Engine



# Relational Memory Engine



# Relational Memory Engine



# Relational Memory Engine



# Relational Memory Engine



# Relational Memory Engine



# Relational Memory Engine



# Relational Memory Engine



# Relational Memory Engine



# Relational Memory Engine

# When the data is not in Data Buffer



# Relational Memory Engine

When the data is not in Data Buffer



# Relational Memory Engine

When the data is not in Data Buffer



# Relational Memory Engine

When the data is not in Data Buffer



# Relational Memory Engine

When the data is not in Data Buffer



# Relational Memory Engine

When the data is already in Data Buffer



# Relational Memory Engine

When the data is already in Data Buffer



# Relational Memory Engine



# Target Platform



# Target Platform



**AMD XILINX**  
UltraScale+  
ZCU102 platform



- CPUs : 4x ARM Cortex-A53
- L1/L2 Cache : 32+32KB I+D / 1 MB
- PS Frequency : 1.5 GHz
- PL Frequency : 100MHz

| Resources | Utilization (%) |
|-----------|-----------------|
| LUT       | 2.78            |
| FF        | 0.68            |
| DSP       | 0.08            |
| BRAM      | 60.69           |

area utilization  
less than 3%

# Relational Memory Benchmark

Q1: SELECT A<sub>1</sub> , A<sub>2</sub> , ... , A<sub>k</sub> FROM S;  projection

Q2: SELECT A<sub>1</sub> , A<sub>2</sub> , ... , A<sub>k</sub> FROM S WHERE C<sub>1</sub>, C<sub>2</sub>, ... , C<sub>i</sub>;  both projection & selection

Q3: SELECT AVG (A<sub>1</sub>) FROM S WHERE A<sub>3</sub> < k GROUP BY A<sub>2</sub>;  group by

Q4: SELECT S.A<sub>1</sub> , R.A<sub>3</sub> FROM S JOIN R ON S.A<sub>2</sub> = R.A<sub>2</sub>;  join over two tables

## Approach tested

ROW : Direct row-wise access  
COL : Direct columnar access  Processing System

RME : using Relational Memory Engine → **Slow FPGA (100MHz)**

# Queries Varying Projectivity

Q1: `SELECT A1 , A2 , ... , Ak FROM S;`



Row size: 64 Bytes, Column size: 4 Bytes

# Queries Varying Projectivity

Q1: `SELECT A1 , A2 , ... , Ak FROM S;`



Row size: 64 Bytes, Column size: 4 Bytes

# Queries Varying Projectivity

Q1: `SELECT A1 , A2 , ... , Ak FROM S;`



**RME provides stable performance irrespectively of projectivity**

Row size: 64 Bytes, Column size: 4 Bytes

# RME for Multiple Selection and Projection Attributes

Q3: `SELECT A1 , A2 , ... , Ak FROM S WHERE C1, C2, ... ,Ci;` Row size: 64 Bytes, Column size: 4 Bytes



# RME for Multiple Selection and Projection Attributes

Q3: `SELECT A1 , A2 , ... , Ak FROM S WHERE C1, C2, ... ,Ci;` Row size: 64 Bytes, Column size: 4 Bytes



**COL faster**

**RME can be up to 2.23× faster than columnar access**



**RME always outperforms row access by being 1.3 – 1.5× faster**

# Group by

Q4: `SELECT AVG (A1) FROM S WHERE A3 < k GROUP BY A2;` Selectivity: 10%



**RME outperforms both ROW and COL**

Column size: 4 Bytes

# Join Over Two Tables

Q4: `SELECT S.A1 , R.A3 FROM S JOIN R ON S.A2 = R.A2;`



**RME reduces data movement up to 41%**

Column size: 4 Bytes

# RME Scales with Data Size

## TPC-H Q1

```
SELECT l_returnflag, l_linestatus,
       SUM(l_quantity), SUM(l_extendedprice),
       SUM(l_extendedprice*(1-l_discount)),
       SUM(l_extendedprice*(1-l_discount)*(1+l_tax)),
       AVG(l_quantity), AVG(l_extendedprice),
       AVG(l_discount),
       COUNT(*)
  FROM lineitem
 WHERE
   l_shipdate <= '1998-12-01' - '[DELTA]' day (3)
 GROUP BY l_returnflag, l_linestatus
 ORDER BY l_returnflag, l_linestatus;
```

Selectivity: 95%, projectivity: 24%

## TPC-H Q6

```
SELECT
       SUM(l_extendedprice*l_discount)
  FROM lineitem
 WHERE
   l_shipdate >= '[DATE]' and
   l_shipdate < '[DATE]' + 1 year and
   l_discount > [DISCOUNT] - 0.01 and
   l_discount < [DISCOUNT] + 0.01 and
   l_quantity < [QUANTITY];
```

Selectivity: 15%, projectivity: 18%

# RME Scales with Data Size



# RME Scales with Data Size

**TPC-H Q1** CPU-bound (sort, group by)



**CPU overhead dominates  
data movement cost**

**TPC-H Q6** IO-bound



**RME benefits regardless of data size**

# Summary

- **Relational Memory**
  - a novel SW/HW co-design paradigm
  - every query always has access to the optimal data layout
- ***ephemeral variables***
  - a simple and lightweight abstraction to use RM
- Relational Memory enables opportunities for innovation across the data system architecture.



Relational Fabric, ICDE '23

# Future Work



## Data Transformation for ML workloads

Matrix and tensor slicing



## Integrating with Real DBMS

Exploring query optimization



## DRAM Controller Augmentation

Utilizing bank interleaving and  
parallelism

# Thank you

Ju Hyoung Mun ([jmun@bu.edu](mailto:jmun@bu.edu))

