Changho Choi edited this page Nov 6, 2018 · 5 revisions

Key Value SSD

"Offloading host system overhead through key value SSD"

Introduction

  1. Background

Most storage devices are based on a block interface that reads and writes data using logical block addresses. Host storage software has been designed and implemented to realize its full potential on top of this block interface. The rapid growth of unstructured data has led to the emergence of various other data formats, including key and value. However, when a storage device supports only the block interface, data in a non-block format (e.g., key and value) must be converted into the block format before being stored. This additional conversion layer degrades performance, especially in scale-out and scale-up enterprise servers and data center servers. It also increases the overall complexity of developing and maintaining the host software. Therefore, an SSD that directly supports the key value data format is needed, and the KV SSD was developed to meet that need. Host software for the KV SSD, including the KV SSD device drivers (kernel and user space), the KV API, the KV SSD emulator, and a benchmark suite, has been developed and publicly released on GitHub (https://github.com/OpenMPDK/KVSSD).

  1. What is KV SSD?

The KV SSD stores data along with a key and retrieves the data using the same key. In contrast, a traditional SSD accesses data by logical block address (LBA). The key in the KV SSD plays a role similar to that of the LBA in a traditional SSD when accessing data (Fig 1). The main difference is that, in the KV SSD, the sizes of both the key and the stored data can vary. For example, the KV SSD supports key sizes from 4 bytes to 255 bytes and data sizes from 0 bytes to 2 MB. This makes the KV SSD key range much more flexible than the LBA range, which is fixed and limited by the storage capacity.
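The size limits quoted above can be expressed as a simple host-side check. This is a minimal sketch and not part of the released KV API: the function name `check_kv_sizes` and the constants are hypothetical, and "2 MB" is assumed here to mean 2 * 1024 * 1024 bytes.

```python
# Hypothetical host-side check of the size limits quoted in the text.
MIN_KEY_BYTES = 4
MAX_KEY_BYTES = 255
MAX_VALUE_BYTES = 2 * 1024 * 1024  # assumption: "2 MB" read as 2 MiB


def check_kv_sizes(key: bytes, value: bytes) -> bool:
    """Return True if the key and value sizes fit the limits stated above."""
    key_ok = MIN_KEY_BYTES <= len(key) <= MAX_KEY_BYTES
    value_ok = len(value) <= MAX_VALUE_BYTES  # a 0-byte value is allowed
    return key_ok and value_ok


if __name__ == "__main__":
    print(check_kv_sizes(b"user:0001", b"hello"))  # 9-byte key, 5-byte value
    print(check_kv_sizes(b"abc", b""))             # 3-byte key is below the minimum
```

Unlike an LBA, which is an integer bounded by the device capacity, the key here is an arbitrary byte string, so the application can choose whatever naming scheme it likes within these size limits.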

Fig 1. KV SSD vs. Block SSD

Storing key value data on a traditional block interface-based SSD requires additional processing for data format conversion. This degrades performance and consumes storage endurance faster, because the Write Amplification Factor (WAF) increases. With the KV SSD, no such conversion overhead is incurred, since key value data is stored directly in the format provided by the host.
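WAF is commonly defined as the ratio of bytes physically written to flash to bytes the host asked to write. The sketch below only illustrates that arithmetic; the function name is hypothetical and the numbers are made up for illustration, not measurements of any device.

```python
def write_amplification_factor(flash_bytes_written: int,
                               host_bytes_written: int) -> float:
    """WAF = bytes physically written to flash / bytes written by the host."""
    if host_bytes_written <= 0:
        raise ValueError("host_bytes_written must be positive")
    return flash_bytes_written / host_bytes_written


if __name__ == "__main__":
    # Illustrative numbers only: the host writes 100 bytes of key value data
    # and the storage stack ends up writing 300 bytes to flash.
    print(write_amplification_factor(300, 100))  # 3.0
```

A WAF of 1.0 would mean every host byte is written to flash exactly once; extra conversion and bookkeeping layers push the ratio higher.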

  1. Standards for KV SSD

Since the KV SSD is a new type of storage device with a different interface from that of the traditional SSD, the industry is collaborating to develop standards that support it. The NVM Express (NVMe) Working Group is currently defining a standard interface between the device driver and the KV SSD, which includes the Store, Retrieve, Delete, Exist, and Iterate commands. In addition, the Storage Networking Industry Association (SNIA) is defining the Application Programming Interface (API) specification for the interface between the device driver and the application.
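To make the five command names concrete, here is a toy in-memory stand-in. It is a sketch only: the class `ToyKVDevice` and its method signatures are hypothetical and do not reproduce the NVMe command formats or the SNIA API, and Iterate is assumed here to list keys that share a prefix.

```python
from typing import Dict, Iterator, Optional


class ToyKVDevice:
    """In-memory stand-in that mirrors the five command names from the text."""

    def __init__(self) -> None:
        self._data: Dict[bytes, bytes] = {}

    def store(self, key: bytes, value: bytes) -> None:
        # Store: write (or overwrite) the value under the given key.
        self._data[key] = value

    def retrieve(self, key: bytes) -> Optional[bytes]:
        # Retrieve: return the value, or None if the key is absent.
        return self._data.get(key)

    def delete(self, key: bytes) -> bool:
        # Delete: remove the pair; report whether the key was present.
        return self._data.pop(key, None) is not None

    def exist(self, key: bytes) -> bool:
        # Exist: check for a key without transferring its value.
        return key in self._data

    def iterate(self, prefix: bytes = b"") -> Iterator[bytes]:
        # Iterate: assumed here to yield keys starting with a prefix.
        for key in sorted(self._data):
            if key.startswith(prefix):
                yield key


if __name__ == "__main__":
    dev = ToyKVDevice()
    dev.store(b"user:0001", b"alice")
    dev.store(b"item:0001", b"widget")
    print(dev.retrieve(b"user:0001"))
    print(list(dev.iterate(b"user:")))
```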

KV SSD vs. Traditional SSD System Architectures

The traditional software stack for systems that run key value data models on top of block SSDs is shown on the left side of Fig 2. Complex software layers are needed to store key value (KV) data in a block interface SSD. KV data is first stored in a file, and each part of the file is then mapped to a specific logical block address (LBA). The LBA is in turn mapped to a physical block address (PBA) inside the block SSD. This chain of mappings creates unnecessary overhead. In addition, many systems that support the KV data model maintain their own data structures (for example, a B-tree or a Log Structured Merge (LSM) tree) to manage KV data. These software layers require more memory and result in a higher Write Amplification Factor (WAF). By contrast, these layers and the resulting overhead can be removed by storing KV data directly in a KV SSD, as shown on the right side of Fig 2. Eliminating the redundant software layers reduces the burden on the host system by lowering host CPU and system memory utilization. WAF and the number of I/Os between the host and storage decrease as well. As a result, the KV SSD not only improves system performance but also provides better storage scalability, since the same CPU can handle a greater number of SSDs than in a traditional SSD-based KV host system.
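The chain of mappings described above can be pictured with a few toy lookup tables. This is an illustration only: every table, file name, and address below is made up, and the single-step key-to-PBA lookup on the KV SSD side is an assumption about what direct storage implies, not a description of the device firmware.

```python
# Toy lookup tables; all names, offsets, and addresses are hypothetical.
key_to_file_offset = {b"user:0001": ("data.db", 8192)}  # host KV store index
file_offset_to_lba = {("data.db", 8192): 1050}           # host file system mapping
lba_to_pba = {1050: 77}                                  # mapping inside the block SSD

key_to_pba = {b"user:0001": 77}  # assumption: KV SSD resolves the key directly


def block_stack_lookup(key: bytes):
    """Follow key -> file offset -> LBA -> PBA; return (PBA, number of hops)."""
    hops = 0
    location = key_to_file_offset[key]
    hops += 1
    lba = file_offset_to_lba[location]
    hops += 1
    pba = lba_to_pba[lba]
    hops += 1
    return pba, hops


def kv_ssd_lookup(key: bytes):
    """Resolve the key in a single device-side step; return (PBA, hops)."""
    return key_to_pba[key], 1


if __name__ == "__main__":
    print(block_stack_lookup(b"user:0001"))  # (77, 3)
    print(kv_ssd_lookup(b"user:0001"))       # (77, 1)
```

Two of the three hops in the block path run on the host, which is where the CPU and memory savings described above would come from.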

Fig 2. System-Level Software Stack

Host software for KV SSD

Since the interface of the KV SSD is different from that of the traditional SSD, the host software stack must be modified to integrate the KV SSD into the host system. To accelerate the development of efficient KV host systems, reference host software for the KV SSD is released as open source through the KV SSD repository on the OpenMPDK GitHub (https://github.com/OpenMPDK/KVSSD). The host software stacks for the block SSD and the KV SSD are shown in Fig 3. The software modules in the blue boxes are shared through the OpenMPDK GitHub.

Fig 3. Host Software Modules Released in KV SSD Repository in OpenMPDK

  1. Kernel Device Driver (KDD)

The KV SSD kernel driver is an extension of the standard Linux kernel driver. It enables KV operations from the host to devices. The following Linux kernel versions are currently supported:

• Linux Kernel v3.10, v4.9.5, v4.13, v4.15.

  1. User Space Device Driver (UDD)

The KV SSD user space device driver (UDD) enables an application to control the KV SSD from user space and is implemented as a library that the application links against. It is available from the uNVMe repository on the OpenMPDK GitHub (https://github.com/OpenMPDK/uNVMe), while all other host software modules for the KV SSD are available from the KVSSD repository (https://github.com/OpenMPDK/KVSSD).

  1. KV SSD Emulator

The KV SSD Emulator emulates a KV SSD. Users can develop KV applications and run them on the KV SSD Emulator without a physical KV SSD. The emulator runs in the user space memory of the Linux system.

  1. KV Benchmark suite (KV Bench)

The KV Bench is a benchmark suite for the KV SSD. It is composed of workload generator, performance measurement, and system resource utilization measurement modules. The KV Bench can run performance measurements on software key value stores (e.g., RocksDB) on block devices, as well as on KV SSDs through the KV API. Users can therefore compare the performance of the block SSD and the KV SSD under the same system environment and configuration. Currently, the KV Bench supports performance measurement for the following software key value stores on block SSDs: RocksDB and Aerospike.
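The idea of pairing a workload generator with a performance measurement step can be sketched in a few lines. This does not use KV Bench or the KV API: the function names are hypothetical and a plain Python dict stands in for the store under test.

```python
import random
import time
from typing import Dict, List, Tuple


def generate_workload(num_ops: int, key_bytes: int = 16, value_bytes: int = 128,
                      seed: int = 0) -> List[Tuple[bytes, bytes]]:
    """Build a reproducible list of random (key, value) pairs."""
    rng = random.Random(seed)

    def rand_bytes(n: int) -> bytes:
        return bytes(rng.getrandbits(8) for _ in range(n))

    return [(rand_bytes(key_bytes), rand_bytes(value_bytes))
            for _ in range(num_ops)]


def run_store_benchmark(store: Dict[bytes, bytes],
                        workload: List[Tuple[bytes, bytes]]) -> float:
    """Apply every pair to the store and return the elapsed time in seconds."""
    start = time.perf_counter()
    for key, value in workload:
        store[key] = value
    return time.perf_counter() - start


if __name__ == "__main__":
    workload = generate_workload(10_000)
    elapsed = run_store_benchmark({}, workload)
    if elapsed > 0:
        print(f"{len(workload) / elapsed:.0f} store ops/s on the stand-in dict")
```

Feeding the same generated workload to two different back ends is what makes a comparison like block SSD versus KV SSD meaningful, since only the store under test changes.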