-
Is the pin guard in the […]
Concerning the […]
Another question I have: how could we efficiently store and load data from disk? Would it be possible to show a small POC that demonstrates how that could be achieved?
-
To persist the discussion about loading data: The loading of persistent data at the start of Hyrise was not part of this work and is orthogonal to it. However, with the current binary loading, loaded data is buffer-managed whenever the vectors have been allocated with the buffer manager. Once the buffer manager is implemented, a potential follow-up project would be to take care of data persistence and to implement a mapping between pages and data structures (e.g., segments), so that persisted pages can be mapped to (and eventually loaded into) the virtual memory region(s) of the buffer pool(s). This mapping could, for example, be stored in a dedicated (meta) page.
-
One question that continues to bother me is how to integrate persistency here. It's by no means part of your current work, but I still want to understand the steps that we need to take if we decide to go in this direction. In the master's project we did a simple thing that appeared to work quite well:
Now, assume we have a file that stores the dictionary data for a particular segment. Could you explain (again, not asking you to implement that) how to load this data into the buffer and retrieve "the pointer" so that the segment can be instantiated with this pointer?
-
Regarding our discussion about string vectors and the buffer manager, I found this PR to be really helpful: #2593. The changes would allow us to store multiple variable-sized strings in a contiguous array that we can store on a page.
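To make the idea of a contiguous string array concrete, here is a minimal sketch. The class name FlatStringVector and its layout (one character buffer plus an offset array) are assumptions for illustration and not necessarily what the referenced PR implements.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <vector>

// Hypothetical layout, not necessarily the one used in the referenced PR:
// all characters live in one contiguous buffer, offsets mark string boundaries.
class FlatStringVector {
 public:
  // Appends the characters and records where the new string ends.
  void push_back(std::string_view value) {
    chars_.insert(chars_.end(), value.begin(), value.end());
    offsets_.push_back(static_cast<std::uint32_t>(chars_.size()));
  }

  // Returns a non-owning view into the shared character buffer.
  std::string_view operator[](std::size_t index) const {
    assert(index < size());
    const auto begin = offsets_[index];
    const auto end = offsets_[index + 1];
    return std::string_view(chars_.data() + begin, end - begin);
  }

  std::size_t size() const { return offsets_.size() - 1; }

 private:
  std::vector<char> chars_;
  // offsets_[i] is the start of string i, offsets_[i + 1] its end.
  std::vector<std::uint32_t> offsets_{0};
};
```

Because both members are plain vectors of fixed-size elements, each of them could in principle be allocated through the buffer manager and placed on pages, which is harder with a vector of individually allocated strings.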
-
Context
For my master's thesis, we looked into extending Hyrise for larger-than-memory workloads using three-tier buffer management. Several approaches for working with larger amounts of data have already been implemented in Hyrise. First, data compression can be used to reduce the main memory footprint of segments. Second, memory and storage tiering has been successfully applied by Dreseler to place segments on tiers with different characteristics, based on an optimization objective and a training algorithm. Buffer managers are still interesting for in-memory databases such as Hyrise because they enable the eviction of binary data to fast secondary storage devices such as SSDs without a complicated tiering mechanism. This discussion should publicly track all our decisions.
What is buffer management
Traditional databases mainly store data on disk. For query processing, they load one or more pages (the smallest storage unit) into a main-memory buffer pool using a "buffer manager". Pinning a page prevents its eviction during query processing. When the buffer pool reaches its memory limit, unpinned pages are evicted based on a replacement strategy (e.g., LRU, Clock, Random, MRU). The actual design of a buffer manager depends on various factors such as the file formats, the memory hierarchy, and the durability guarantees.
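The interplay of pinning and eviction can be illustrated with a small sketch. TinyBufferPool and all of its members are hypothetical names, page contents and disk I/O are left out, and Clock (second chance) is picked only as one of the replacement strategies listed above.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified buffer pool: only residency, pinning, and
// Clock (second-chance) eviction are modeled. No data is read or written.
using PageId = std::size_t;

class TinyBufferPool {
 public:
  explicit TinyBufferPool(std::size_t frame_count) : frames_(frame_count) {}

  // Makes the page resident (evicting an unpinned page if needed) and pins it.
  // Returns false if every frame is pinned and nothing can be evicted.
  bool pin(PageId page) {
    auto it = frame_of_.find(page);
    if (it == frame_of_.end()) {
      const auto victim = find_victim();
      if (!victim) return false;
      if (frames_[*victim].page) frame_of_.erase(*frames_[*victim].page);
      frames_[*victim] = Frame{page, 0, false};
      it = frame_of_.emplace(page, *victim).first;
    }
    auto& frame = frames_[it->second];
    ++frame.pin_count;
    frame.referenced = true;
    return true;
  }

  void unpin(PageId page) {
    auto& frame = frames_[frame_of_.at(page)];
    assert(frame.pin_count > 0);
    --frame.pin_count;
  }

  bool is_resident(PageId page) const { return frame_of_.count(page) > 0; }

 private:
  struct Frame {
    std::optional<PageId> page;
    std::size_t pin_count = 0;
    bool referenced = false;
  };

  // Clock sweep: skip pinned frames, clear reference bits on the first pass,
  // and pick the first unpinned, unreferenced (or empty) frame.
  std::optional<std::size_t> find_victim() {
    for (std::size_t step = 0; step < 2 * frames_.size(); ++step) {
      const auto index = hand_;
      hand_ = (hand_ + 1) % frames_.size();
      auto& frame = frames_[index];
      if (!frame.page) return index;
      if (frame.pin_count > 0) continue;
      if (frame.referenced) {
        frame.referenced = false;
        continue;
      }
      return index;
    }
    return std::nullopt;
  }

  std::vector<Frame> frames_;
  std::unordered_map<PageId, std::size_t> frame_of_;
  std::size_t hand_ = 0;
};
```

With a pool of two frames, pinning pages 1 and 2 fills the pool, and a request for page 3 fails until one of them is unpinned. After unpinning page 1, the request for page 3 succeeds and evicts page 1.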
Potential approaches and why they are unsuitable
In Hyrise, most data structures and segments (vertical partitions of a table) rely on polymorphic vectors (pmr_vector). Classical buffer manager designs are incompatible with this design decision and would require changing the basic storage assumptions. We also looked into representation-aware containers to support buffer management of vectors. Similarly to Dreseler's work, we evaluated a custom "fancy pointer" with a central hash map or pointer swizzling techniques. However, we found these approaches either too complex to implement or inefficient.
Another technique to consider is paging with mmap, which was also tested previously with Hyrise. However, according to Crotty et al., mmap has several flaws when used for buffer management, including performance issues, transactional (un)safety, and IO stalls. Alternative virtual memory subsystems such as UMap and FastMap are also unsuitable because, according to Leis et al., they do not allow control over the eviction mechanism.
Proposed implementation
Recent work by Leis et al. proposes a buffer manager that leverages explicit virtual memory management techniques (I recommend reading it first with a focus on section 3). Their mechanism is easy to implement and flexible. We just need to switch the allocation mechanism as shown in this example:
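As a minimal sketch of what switching the allocation mechanism can look like, the following resource reserves a virtual memory region with mmap and serves a polymorphic vector from it. This is not the code from the paper or from Hyrise: RegionResource and owns() are hypothetical names, std::pmr is assumed in place of whatever resource type Hyrise uses, and eviction, page states, and deallocation are omitted.

```cpp
#include <sys/mman.h>

#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory_resource>
#include <new>
#include <vector>

// Hypothetical resource: reserves an anonymous virtual memory region up front
// and hands out chunks with a bump pointer. Eviction is not modeled.
class RegionResource : public std::pmr::memory_resource {
 public:
  explicit RegionResource(std::size_t capacity) : capacity_(capacity) {
    void* region = mmap(nullptr, capacity_, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED) throw std::bad_alloc();
    base_ = static_cast<std::byte*>(region);
  }
  ~RegionResource() override { munmap(base_, capacity_); }
  RegionResource(const RegionResource&) = delete;
  RegionResource& operator=(const RegionResource&) = delete;

  // True if the pointer lies inside the reserved region.
  bool owns(const void* pointer) const {
    const auto address = reinterpret_cast<std::uintptr_t>(pointer);
    const auto begin = reinterpret_cast<std::uintptr_t>(base_);
    return address >= begin && address < begin + capacity_;
  }

 private:
  void* do_allocate(std::size_t bytes, std::size_t alignment) override {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    // The region is page-aligned, so aligning the offset is sufficient
    // for alignments up to the page size.
    const auto aligned = (used_ + alignment - 1) / alignment * alignment;
    if (aligned + bytes > capacity_) throw std::bad_alloc();
    used_ = aligned + bytes;
    return base_ + aligned;
  }

  // No reuse in this sketch: memory is only released in the destructor.
  void do_deallocate(void*, std::size_t, std::size_t) override {}

  bool do_is_equal(const std::pmr::memory_resource& other) const noexcept override {
    return this == &other;
  }

  std::size_t capacity_;
  std::byte* base_ = nullptr;
  std::size_t used_ = 0;
};
```

A container then only needs to be constructed with the resource, e.g. `RegionResource resource(1 << 20); std::pmr::vector<int> values(&resource);`, and all of its allocations land inside the reserved region without any change to the code that uses the vector.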
We adapted their implementation for Hyrise and extended it with multiple tiers and explicit variable-sized pages. Additionally, we added support for Mac OS X. The buffer manager is exposed to all components in Hyrise as a polymorphic memory resource singleton.
PinGuards are helper objects that use RAII to ensure valid memory access to a data structure within the current scope. For eviction, we use a Second-Chance FIFO queue as implemented by DuckDB and Kùzu. The data migration policy decides how pages move between several tiers / NUMA nodes.
Open Questions
std::vector<bool>? Such a vector does not conform to the C++ container interface and does not offer the required data() method. boost::vector<bool> wastes more memory than needed. We could use a custom compact::vector. However, this needs some adaptations.
TBA
Deliverables
Future Work
The integration of a buffer manager enables several new and interesting features for Hyrise. This includes: